Thank you for your positive feedback, Seth!
Would you please vote in the voting mail thread? Thank you!

Best,
Jincheng


On Mon, Aug 10, 2020 at 10:34 PM Seth Wiesman <sjwies...@gmail.com> wrote:

> I think this sounds good. +1
>
> On Wed, Aug 5, 2020 at 8:37 PM jincheng sun <sunjincheng...@gmail.com>
> wrote:
>
>> Hi David, thank you for sharing the problems with the current documentation.
>> I agree with you, as I have received the same feedback from Chinese users. I
>> am often contacted by users asking whether PyFlink supports "Java UDF" or
>> "xxxConnector". The root cause of these problems is that our existing
>> documents are written for Java users (text and API are mixed together). Since
>> Python was only added in 1.9, much of the documentation is not friendly to
>> Python users. They don't want to look for Python content in unfamiliar Java
>> documents. Just yesterday, Chinese users complained that they could not find
>> the entry points for the Python API documentation. So a centralized entry
>> point and a clear document structure are urgent demands of Python users. The
>> original intention of the FLIP is to do our best to solve these user pain
>> points.
>>
>> Hi Xingbo and Wei, thank you for sharing PySpark's status on documentation
>> optimization. You're right. PySpark already has a large Python user base, and
>> they have also found that the Python user community is important for
>> multi-language support. Centralizing and unifying the Python documentation
>> will reduce the learning cost for Python users, and a good document
>> structure and content will also reduce the Q&A burden on the community. It's
>> a once-and-for-all job.
>>
>> Hi Seth, I wonder if your concerns have been resolved through the
>> previous discussion?
>>
>> Anyway, the principle of the FLIP is that the Python documentation should
>> only include Python-specific content, instead of making a copy of the Java
>> content. It would be great to have you join in improving PyFlink (both
>> submitting and reviewing PRs).
>>
>> Best,
>> Jincheng
>>
>>
>> On Wed, Aug 5, 2020 at 5:46 PM Wei Zhong <weizhong0...@gmail.com> wrote:
>>
>>> Hi Xingbo,
>>>
>>> Thanks for your information.
>>>
>>> I think PySpark's documentation redesign deserves our attention.
>>> It seems that the Spark community has also begun to treat the user
>>> experience of Python documentation more seriously. We can continue to
>>> follow the discussion and progress of the redesign in the Spark
>>> community. It is so similar to our work that there should be some ideas
>>> worth borrowing.
>>>
>>> Best,
>>> Wei
>>>
>>>
>>> On Aug 5, 2020, at 15:02, Xingbo Huang <hxbks...@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> I found that the Spark community is also working on redesigning the PySpark
>>> documentation [1] recently. Maybe we can compare our document structure
>>> with theirs.
>>>
>>> [1] https://issues.apache.org/jira/browse/SPARK-31851
>>>
>>> http://apache-spark-developers-list.1001551.n3.nabble.com/Need-some-help-and-contributions-in-PySpark-API-documentation-td29972.html
>>>
>>> Best,
>>> Xingbo
>>>
>>> On Wed, Aug 5, 2020 at 3:17 AM David Anderson <da...@alpinegizmo.com> wrote:
>>>
>>>> I'm delighted to see energy going into improving the documentation.
>>>>
>>>> With the current documentation, I get a lot of questions that I believe
>>>> reflect two fundamental problems with what we currently provide:
>>>>
>>>> (1) We have a lot of contextual information in our heads about how
>>>> Flink works, and we are able to use that knowledge to make reasonable
>>>> inferences about how things (probably) work in cases we aren't so familiar
>>>> with. For example, I get a lot of questions of the form "If I use <this
>>>> feature> will I still have exactly once guarantees?" The answer is always
>>>> yes, but they continue to have doubts because we have failed to clearly
>>>> communicate this fundamental, underlying principle.
>>>>
>>>> This specific example about fault tolerance applies across all of the
>>>> Flink docs, but the general idea can also be applied to the Table/SQL and
>>>> PyFlink docs. The guiding principles underlying these APIs should be
>>>> written down in one easy-to-find place.
>>>>
>>>> (2) The other kind of question I get a lot is "Can I do <X> with <Y>?"
>>>> E.g., "Can I use the JDBC table sink from PyFlink?" These questions can be
>>>> very difficult to answer because it is frequently the case that one has to
>>>> reason about why a given feature doesn't seem to appear in the
>>>> documentation. It could be that I'm looking in the wrong place, or it could
>>>> be that someone forgot to document something, or it could be that it can in
>>>> fact be done by applying a general mechanism in a specific way that I
>>>> haven't thought of -- as in this case, where one can use a JDBC sink from
>>>> Python if one thinks to use DDL.
>>>>
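The DDL route mentioned above can be sketched roughly as follows. This is only an illustration, not an official recipe: the table name, columns, database URL, and helper function names are hypothetical, and it assumes the Flink JDBC connector and a matching database driver are available to the job. The sketch builds the `CREATE TABLE` statement in Python and hands it to a PyFlink `TableEnvironment` through `execute_sql`.

```python
def build_jdbc_sink_ddl(table_name, columns, url, db_table):
    """Return a CREATE TABLE statement for a JDBC-backed sink table.

    `columns` is a list of (column_name, sql_type) pairs.
    """
    column_sql = ",\n".join(
        f"  `{name}` {sql_type}" for name, sql_type in columns
    )
    return (
        f"CREATE TABLE `{table_name}` (\n"
        f"{column_sql}\n"
        ") WITH (\n"
        "  'connector' = 'jdbc',\n"
        f"  'url' = '{url}',\n"
        f"  'table-name' = '{db_table}'\n"
        ")"
    )


def register_sink(t_env, ddl):
    # t_env is assumed to be a PyFlink TableEnvironment (Flink 1.11+),
    # whose execute_sql() method accepts DDL statements.
    return t_env.execute_sql(ddl)


# Hypothetical sink definition, used here only for illustration.
ddl = build_jdbc_sink_ddl(
    "pv_sink",
    [("page", "STRING"), ("pv", "BIGINT")],
    "jdbc:mysql://localhost:3306/demo",  # hypothetical database URL
    "page_views",                        # hypothetical database table
)
print(ddl)
```

With a real table environment, `register_sink(t_env, ddl)` would register the sink, after which a Python Table API job could write to `pv_sink` like any other table.
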
>>>> So I think it would be helpful to be explicit about both what is, and
>>>> what is not, supported in PyFlink. And to have some very clear organizing
>>>> principles in the documentation so that users can quickly learn where to
>>>> look for specific facts.
>>>>
>>>> Regards,
>>>> David
>>>>
>>>>
>>>> On Tue, Aug 4, 2020 at 1:01 PM jincheng sun <sunjincheng...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Seth and David,
>>>>>
>>>>> I'm very happy to have your reply and suggestions. I would like to
>>>>> share my thoughts here:
>>>>>
>>>>> The main motivation for refactoring the PyFlink docs is that we want to
>>>>> make sure Python users can find everything they need starting from the
>>>>> PyFlink documentation main page. That is, the PyFlink documentation
>>>>> should have a catalogue which includes all the functionality available in
>>>>> PyFlink. However, this doesn't mean that we will copy content from other
>>>>> parts of the documentation. It may be just a reference/link to the other
>>>>> documentation if needed. For the documentation added under the PyFlink
>>>>> main page, the principle is that it should only include Python-specific
>>>>> content, instead of making a copy of the Java content.
>>>>>
>>>>> >>  I'm concerned that this proposal duplicates a lot of content that
>>>>> will quickly get out of sync. It feels like it is documenting PyFlink
>>>>> separately from the rest of the project.
>>>>>
>>>>> Regarding the concerns about maintainability: as mentioned above, the
>>>>> goal of this FLIP is to provide an intelligible entry point for the Python
>>>>> API, and it should only contain information which is useful for Python
>>>>> users. There are indeed many agenda items in this FLIP that duplicate the
>>>>> Java documents, but that doesn't mean the content would be copied from the
>>>>> Java documentation. That is, if the content of a document is the same as
>>>>> the corresponding Java document, we will add a link to the Java document,
>>>>> e.g. "Built-in functions" and "SQL". We only create a page for the
>>>>> Python-only content, and then redirect to the Java document if there is
>>>>> something shared with Java, e.g. "Connectors" and "Catalogs". If the
>>>>> document is Python-only and already exists, we will move it from the old
>>>>> Python documentation to the new Python documentation, e.g.
>>>>> "Configurations". If the document is Python-only and does not exist yet,
>>>>> we will create a new page for it, e.g. "DataTypes".
>>>>>
>>>>> The main reason we create a new page for Python Data Types is that it
>>>>> corresponds to Java Data Types one-to-one only conceptually; the actual
>>>>> document content would be very different from the Java Data Types page.
>>>>> Some of the detailed differences are as follows:
>>>>>
>>>>>
>>>>>   - The text in the Java Data Types document is written for users of
>>>>> JVM-based languages, and is incomprehensible to users who only understand
>>>>> Python.
>>>>>   - Currently the Python Data Types do not support the "bridgedTo"
>>>>> method, DataTypes.RAW, DataTypes.NULL and User Defined Types.
>>>>>   - The sections "Planner Compatibility" and "Data Type Extraction" are
>>>>> only useful for Java/Scala users.
>>>>>   - We want to add sections that apply only to Python, such as
>>>>> which Data Types are currently supported in Python, the mapping between
>>>>> DataType and Python object type, etc.
>>>>>
>>>>> I think the root cause of such a difference from the existing documents is
>>>>> that Python is the first non-JVM language we support in Flink. This means
>>>>> our previous method of sharing documents between Java and Scala may not be
>>>>> suitable for Python, so we will adopt some very different methods to
>>>>> provide documentation for Python users. Of course, we should reduce
>>>>> maintenance costs as much as possible while ensuring a good user
>>>>> experience. Furthermore, Python is the first step of Flink's
>>>>> multi-language support, and there may be R, Go, etc. in the future. It is
>>>>> necessary for us to create a main page for each language, so that users of
>>>>> each language can focus on the content they care about.
>>>>>
>>>>> >> Things like the cookbook and tutorial should be under the Try Flink
>>>>> section of the documentation.
>>>>>
>>>>> Regarding the position of the "Cookbook" section: in my view, "Try
>>>>> Flink" is for new users and the "Cookbook" is for more advanced users.
>>>>> That is, "Try Flink" can hold the simplest end-to-end examples, such as
>>>>> "Hello World", and in the "Cookbook" we can add use cases closer to
>>>>> production business, such as CDN log analysis or PV/UV for e-commerce. So
>>>>> I prefer to keep the current structure.
>>>>>
>>>>> >>  it's relatively straightforward to compare the Python API with the
>>>>> Java and Scala versions.
>>>>>
>>>>> Regarding the comparison between the Python API and the Java/Scala API, I
>>>>> think the majority of users, especially beginners, would not have this
>>>>> need. From my side, improving the experience for beginners seems to have
>>>>> higher priority. Could you please share more about why users want to
>>>>> compare? How much would the comparison be affected if the content is
>>>>> spread over multiple pages? :)
>>>>>
>>>>> Thanks for all of your feedback and suggestions. Any follow-up
>>>>> feedback is welcome.
>>>>>
>>>>> Best,
>>>>> Jincheng
>>>>>
>>>>>
>>>>> On Mon, Aug 3, 2020 at 10:49 PM David Anderson <da...@alpinegizmo.com> wrote:
>>>>>
>>>>>> Jincheng,
>>>>>>
>>>>>> One thing that I like about the way that the documentation is
>>>>>> currently organized is that it's relatively straightforward to compare
>>>>>> the Python API with the Java and Scala versions. I'm concerned that if the
>>>>>> PyFlink docs are more independent, it will be challenging to respond to
>>>>>> questions about which features from the other APIs are available from
>>>>>> Python.
>>>>>>
>>>>>> David
>>>>>>
>>>>>> On Mon, Aug 3, 2020 at 8:07 AM jincheng sun <sunjincheng...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> It would be great if you could join in contributing to the PyFlink
>>>>>>> documentation, @Marta!
>>>>>>> Thanks for all of the positive feedback. I will start a formal vote
>>>>>>> later...
>>>>>>>
>>>>>>> Best,
>>>>>>> Jincheng
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Aug 3, 2020 at 9:56 AM Shuiqiang Chen <acqua....@gmail.com> wrote:
>>>>>>>
>>>>>>> > Hi jincheng,
>>>>>>> >
>>>>>>> > Thanks for the discussion. +1 for the FLIP.
>>>>>>> >
>>>>>>> > Well-organized documentation will greatly improve the efficiency and
>>>>>>> > experience of developers.
>>>>>>> >
>>>>>>> > Best,
>>>>>>> > Shuiqiang
>>>>>>> >
>>>>>>> > On Sat, Aug 1, 2020 at 8:42 AM Hequn Cheng <he...@apache.org> wrote:
>>>>>>> >
>>>>>>> >> Hi Jincheng,
>>>>>>> >>
>>>>>>> >> Thanks a lot for raising the discussion. +1 for the FLIP.
>>>>>>> >>
>>>>>>> >> I think this will bring big benefits for PyFlink users. Currently,
>>>>>>> >> the Python Table API documentation is hidden deep under the
>>>>>>> >> TableAPI&SQL tab, which makes it quite hard to read. Also, the PyFlink
>>>>>>> >> documentation is mixed with the Java/Scala documentation. It is hard
>>>>>>> >> for users to get an overview of all the PyFlink documents. As more
>>>>>>> >> and more functionality is added to PyFlink, I think it's time for us
>>>>>>> >> to refactor the documentation.
>>>>>>> >>
>>>>>>> >> Best,
>>>>>>> >> Hequn
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> On Fri, Jul 31, 2020 at 3:43 PM Marta Paes Moreira <ma...@ververica.com> wrote:
>>>>>>> >>
>>>>>>> >>> Hi, Jincheng!
>>>>>>> >>>
>>>>>>> >>> Thanks for creating this detailed FLIP, it will make a big
>>>>>>> >>> difference in the experience of Python developers using Flink. I'm
>>>>>>> >>> interested in contributing to this work, so I'll reach out to you
>>>>>>> >>> offline!
>>>>>>> >>>
>>>>>>> >>> Also, thanks for sharing some information on the adoption of
>>>>>>> >>> PyFlink, it's great to see that there are already production users.
>>>>>>> >>>
>>>>>>> >>> Marta
>>>>>>> >>>
>>>>>>> >>> On Fri, Jul 31, 2020 at 5:35 AM Xingbo Huang <hxbks...@gmail.com> wrote:
>>>>>>> >>>
>>>>>>> >>> > Hi Jincheng,
>>>>>>> >>> >
>>>>>>> >>> > Thanks a lot for bringing up this discussion and the proposal.
>>>>>>> >>> >
>>>>>>> >>> > Big +1 for improving the structure of PyFlink doc.
>>>>>>> >>> >
>>>>>>> >>> > It will be very helpful to give PyFlink users a unified entry
>>>>>>> >>> > point for learning from the PyFlink documents.
>>>>>>> >>> >
>>>>>>> >>> > Best,
>>>>>>> >>> > Xingbo
>>>>>>> >>> >
>>>>>>> >>> > On Fri, Jul 31, 2020 at 11:00 AM Dian Fu <dian0511...@gmail.com> wrote:
>>>>>>> >>> >
>>>>>>> >>> >> Hi Jincheng,
>>>>>>> >>> >>
>>>>>>> >>> >> Thanks a lot for bringing up this discussion and the proposal.
>>>>>>> >>> >> +1 to improve the Python API doc.
>>>>>>> >>> >>
>>>>>>> >>> >> I have received a lot of feedback from PyFlink beginners about
>>>>>>> >>> >> the PyFlink doc, e.g. there is too little material, the Python doc
>>>>>>> >>> >> is mixed with the Java doc, and it's not easy for users to find
>>>>>>> >>> >> the docs they are looking for.
>>>>>>> >>> >>
>>>>>>> >>> >> I think it would greatly improve the user experience if we can
>>>>>>> >>> >> have one place which includes most of what PyFlink users should
>>>>>>> >>> >> know.
>>>>>>> >>> >>
>>>>>>> >>> >> Regards,
>>>>>>> >>> >> Dian
>>>>>>> >>> >>
>>>>>>> >>> >> On Jul 31, 2020, at 10:14 AM, jincheng sun <sunjincheng...@gmail.com> wrote:
>>>>>>> >>> >>
>>>>>>> >>> >> Hi folks,
>>>>>>> >>> >>
>>>>>>> >>> >> Since the release of Flink 1.11, the number of PyFlink users has
>>>>>>> >>> >> continued to grow. As far as I know, many companies have used
>>>>>>> >>> >> PyFlink for data analysis and for operation and maintenance
>>>>>>> >>> >> monitoring, and have put it into production (such as
>>>>>>> >>> >> 聚美优品 [1] (Jumei), 浙江墨芷 [2] (Mozhi), etc.). According to
>>>>>>> >>> >> the feedback we received, the current documentation is not very
>>>>>>> >>> >> friendly to PyFlink users. There are two shortcomings:
>>>>>>> >>> >>
>>>>>>> >>> >> - Python-related content is mixed into the Java/Scala
>>>>>>> >>> >> documentation, which makes it difficult to read for users who
>>>>>>> >>> >> only focus on PyFlink.
>>>>>>> >>> >> - There is already a "Python Table API" section in the Table API
>>>>>>> >>> >> documentation to hold PyFlink documents, but the number of
>>>>>>> >>> >> articles is small and the content is fragmented. It is difficult
>>>>>>> >>> >> for beginners to learn from it.
>>>>>>> >>> >>
>>>>>>> >>> >> In addition, FLIP-130 introduced the Python DataStream API. Many
>>>>>>> >>> >> documents will be added for those new APIs. In order to increase
>>>>>>> >>> >> the readability and maintainability of the PyFlink documentation,
>>>>>>> >>> >> Wei Zhong and I have discussed offline and would like to rework it
>>>>>>> >>> >> via this FLIP.
>>>>>>> >>> >>
>>>>>>> >>> >> We will rework the documentation around the following three
>>>>>>> >>> >> objectives:
>>>>>>> >>> >>
>>>>>>> >>> >> - Add a separate section for the Python API under the
>>>>>>> >>> >> "Application Development" section.
>>>>>>> >>> >> - Restructure the current Python documentation into a brand new
>>>>>>> >>> >> structure, to ensure the content is complete and friendly to
>>>>>>> >>> >> beginners.
>>>>>>> >>> >> - Improve the documents shared by Python/Java/Scala to make them
>>>>>>> >>> >> more friendly to Python users without affecting Java/Scala users.
>>>>>>> >>> >>
>>>>>>> >>> >> More details can be found in FLIP-133:
>>>>>>> >>> >>
>>>>>>> >>> >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-133%3A+Rework+PyFlink+Documentation
>>>>>>> >>> >>
>>>>>>> >>> >> Best,
>>>>>>> >>> >> Jincheng
>>>>>>> >>> >>
>>>>>>> >>> >> [1] https://mp.weixin.qq.com/s/zVsBIs1ZEFe4atYUYtZpRg
>>>>>>> >>> >> [2] https://mp.weixin.qq.com/s/R4p_a2TWGpESBWr3pLtM2g
>>>>>>> >>> >>
>>>>>>> >>> >>
>>>>>>> >>> >>
>>>>>>> >>>
>>>>>>> >>
>>>>>>>
>>>>>>
>>>
