Sorry for chiming in so late. I would be in favor of option #2.

I guess that for option #1 the PMC would need to give the credentials to the
release manager. Instead, the PMC could add the release manager as a
maintainer, which makes sure that only the PMC can delete artifacts.

Cheers,
Till

On Wed, Jul 24, 2019 at 12:33 PM jincheng sun <sunjincheng...@gmail.com>
wrote:

> Hi all,
>
> Thanks for all of your replies!
>
> Hi Stephan, thanks for the reply and for pointing out the details we need to
> pay attention to, such as the Readme and trademark compliance. Regarding the
> PyPI account for the release, #1 carries the risk that our release packages
> could be deleted by anyone who knows the account password, and in that case
> the PMC would have no means to correct the problem. So I think #2 is the
> safer choice for the Flink community.
>
> Hi Jeff & Dian, thanks for sharing your thoughts. The Python API is just a
> language entry point, so I think the choice of which binaries are contained
> in the release should be consistent with the Java release policy. Currently,
> that means we do not add the Hadoop or connector JARs to the release package.
>
> Hi Chesnay, agreed that we should ship the commonly used binary in the
> future, if the Java side has already made that decision.
>
> So, our current consensus is:
> 1. Should we publish PyFlink to PyPI --> YES
> 2. PyPI project name ---> apache-flink
> 3. How to handle Scala_2.11 and Scala_2.12 ---> We only release one binary,
> built with the default Scala version from the Flink default config.
>
> We still need to discuss how to manage the PyPI account for releases:
> --------
>     1) Create an account such as 'pyflink' as the owner, share it with all
> the release managers, and then release managers can publish the package to
> PyPI using this account.
>     2) Create an account such as 'pyflink' as the owner (only the PMC can
> manage it) and add the release managers' accounts as maintainers of the
> project. Release managers publish the package to PyPI using their own account.
> --------
> Stephan likes #1 but wants the PMC to be able to correct problems (which
> sounds like #2). Can you confirm that, @Stephan?
> Chesnay and I prefer #2.
>
> Best, Jincheng
>
> Chesnay Schepler <ches...@apache.org> wrote on Wed, Jul 24, 2019 at 3:57 PM:
>
> > If we ship a binary, we should ship the binary we usually ship, not some
> > highly customized version.
> >
> > On 24/07/2019 05:19, Dian Fu wrote:
> > > Hi Stephan & Jeff,
> > >
> > > Thanks a lot for sharing your thoughts!
> > >
> > > Regarding the bundled jars, currently only the jars in the flink binary
> > > distribution are packaged in the pyflink package. It may be a good idea to
> > > also bundle other jars such as flink-hadoop-compatibility. We may also need
> > > to consider whether to bundle the format jars such as flink-avro,
> > > flink-json and flink-csv, and the connector jars such as
> > > flink-connector-kafka, etc.
> > >
> > > If FLINK_HOME is set, the binary distribution specified by FLINK_HOME will
> > > be used instead.
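> > >
> > > For illustration only, the lookup described above could roughly work as in
> > > the following sketch (the helper name and the bundled directory layout are
> > > assumptions made here, not necessarily the actual pyflink code):
> > >
> > >     import os
> > >
> > >     def _find_flink_home():
> > >         """Return the path of the Flink distribution pyflink should use."""
> > >         # Prefer an explicitly configured distribution.
> > >         flink_home = os.environ.get("FLINK_HOME")
> > >         if flink_home is not None:
> > >             return flink_home
> > >         # Otherwise fall back to the distribution bundled in the package,
> > >         # assumed here to live next to this module under "deps/".
> > >         return os.path.join(os.path.dirname(os.path.abspath(__file__)), "deps")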
> > >
> > > Regards,
> > > Dian
> > >
> > >> On Jul 24, 2019, at 9:47 AM, Jeff Zhang <zjf...@gmail.com> wrote:
> > >>
> > >> +1 for publishing pyflink to pypi.
> > >>
> > >> Regarding including the jars, I just want to make sure which flink binary
> > >> distribution we would ship with pyflink, since we have multiple flink
> > >> binary distributions (with/without hadoop).
> > >> Personally, I prefer to use the hadoop-included binary distribution.
> > >>
> > >> And I just want to confirm whether it is possible for users to use a
> > >> different flink binary distribution as long as they set the FLINK_HOME
> > >> env variable.
> > >>
> > >> Besides that, I hope there will be bi-directional link references between
> > >> the flink docs and the pypi docs.
> > >>
> > >>
> > >>
> > >> Stephan Ewen <se...@apache.org> wrote on Wed, Jul 24, 2019 at 12:07 AM:
> > >>
> > >>> Hi!
> > >>>
> > >>> Sorry for the late involvement. Here are some thoughts from my side:
> > >>>
> > >>> Definitely +1 to publishing to PyPI, even if it is a binary release.
> > >>> Community growth into other communities is great, and if this is the
> > >>> natural way to reach developers in the Python community, let's do it.
> > >>> This is not about our convenience, but about reaching users.
> > >>>
> > >>> I think the way to look at this is that this is a convenience
> > >>> distribution channel, courtesy of the Flink community. It is not an
> > >>> Apache release, and we make this clear in the Readme.
> > >>> Of course, this doesn't mean we don't try to uphold similar standards as
> > >>> for our official releases (like proper license information).
> > >>>
> > >>> Concerning credential sharing, I would be fine with either option. The
> > >>> PMC doesn't own it (it is an initiative by some community members), but
> > >>> the PMC needs to ensure trademark compliance, so there is a slight
> > >>> preference for option #1 (the PMC would have means to correct problems).
> > >>>
> > >>> I believe there is no need to differentiate between Scala versions,
> > >>> because this is merely a convenience thing for pure Python users. Users
> > >>> that mix Python and Scala (and thus depend on specific Scala versions)
> > >>> can still download from Apache or build it themselves.
> > >>>
> > >>> Best,
> > >>> Stephan
> > >>>
> > >>>
> > >>>
> > >>> On Thu, Jul 4, 2019 at 9:51 AM jincheng sun <sunjincheng...@gmail.com>
> > >>> wrote:
> > >>>
> > >>>> Hi All,
> > >>>>
> > >>>> Thanks for the feedback @Chesnay Schepler <ches...@apache.org> @Dian!
> > >>>>
> > >>>> I think using `apache-flink` for the project name also makes sense to
> > >>>> me, since we should always keep in mind that Flink is owned by Apache.
> > >>>> (Beam also uses this pattern, `apache-beam`, for its Python API.)
> > >>>>
> > >>>> Regarding releasing the Python API with the Java JARs, I think the
> > >>>> guiding principle should be the convenience of the user. So, thanks for
> > >>>> the explanation @Dian!
> > >>>>
> > >>>> And you're right @Chesnay Schepler <ches...@apache.org>, we can't make
> > >>>> a hasty decision and we need more people's opinions!
> > >>>>
> > >>>> So, I would appreciate it if anyone can give us feedback and suggestions!
> > >>>>
> > >>>> Best,
> > >>>> Jincheng
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> Chesnay Schepler <ches...@apache.org> wrote on Wed, Jul 3, 2019 at 8:46 PM:
> > >>>>
> > >>>>> So this would not be a source release then, but a full-blown binary
> > >>>>> release.
> > >>>>>
> > >>>>> Maybe it is just me, but I find it a bit suspect to ship an entire
> > >>>>> Java application via PyPI, just because there's a Python API for it.
> > >>>>>
> > >>>>> We definitely need input from more people here.
> > >>>>>
> > >>>>> On 03/07/2019 14:09, Dian Fu wrote:
> > >>>>>> Hi Chesnay,
> > >>>>>>
> > >>>>>> Thanks a lot for the suggestions.
> > >>>>>>
> > >>>>>> Regarding “distributing java/scala code to PyPI”:
> > >>>>>> The Python Table API is just a wrapper of the Java Table API, and
> > >>>>>> without the java/scala code, two steps are needed to set up an
> > >>>>>> environment to execute a Python Table API program:
> > >>>>>> 1) Install pyflink using "pip install apache-flink"
> > >>>>>> 2) Download the flink distribution and set FLINK_HOME to point to it.
> > >>>>>> Besides, users have to make sure that the manually installed Flink is
> > >>>>>> compatible with the pip-installed pyflink.
> > >>>>>> Bundling the java/scala code inside the Python package will eliminate
> > >>>>>> step 2) and make it simpler for users to install pyflink. There was a
> > >>>>>> short discussion <https://issues.apache.org/jira/browse/SPARK-1267> on
> > >>>>>> this in the Spark community and they finally decided to package the
> > >>>>>> java/scala code in the python package. (BTW, PySpark only bundles the
> > >>>>>> jars for Scala 2.11.)
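> > >>>>>>
> > >>>>>> As a rough illustration of what bundling the jars could look like on
> > >>>>>> the packaging side, a sketch along these lines might be used (the
> > >>>>>> project metadata and directory names below are assumptions, not the
> > >>>>>> actual flink-python setup.py):
> > >>>>>>
> > >>>>>>     from setuptools import setup
> > >>>>>>
> > >>>>>>     setup(
> > >>>>>>         name="apache-flink",
> > >>>>>>         version="1.9.0",
> > >>>>>>         packages=["pyflink"],
> > >>>>>>         # Ship the Flink jars and scripts inside the sdist/wheel so that
> > >>>>>>         # "pip install apache-flink" alone yields a working environment.
> > >>>>>>         package_data={"pyflink": ["lib/*.jar", "opt/*.jar", "bin/*"]},
> > >>>>>>         include_package_data=True,
> > >>>>>>     )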
> > >>>>>> Regards,
> > >>>>>> Dian
> > >>>>>>
> > >>>>>>> On Jul 3, 2019, at 7:13 PM, Chesnay Schepler <ches...@apache.org> wrote:
> > >>>>>>>
> > >>>>>>> The existing artifact in the pyflink project was neither released by
> > >>>>>>> the Flink project / anyone affiliated with it nor approved by the
> > >>>>>>> Flink PMC.
> > >>>>>>> As such, if we were to use this account I believe we should delete it
> > >>>>>>> so as not to mislead users into thinking that this is in any way an
> > >>>>>>> Apache-provided distribution. Since this goes against the user's
> > >>>>>>> wishes, I would be in favor of creating a separate account and giving
> > >>>>>>> back control over the pyflink account.
> > >>>>>>> My take on the raised points:
> > >>>>>>> 1.1) "apache-flink"
> > >>>>>>> 1.2) option 2
> > >>>>>>> 2) Given that we only distribute Python code, there should be no
> > >>>>>>> reason to differentiate between Scala versions. We should not be
> > >>>>>>> distributing any java/scala code and/or modules to PyPI. Currently,
> > >>>>>>> I'm a bit confused about this question and wonder what exactly we are
> > >>>>>>> trying to publish here.
> > >>>>>>> 3) This should be treated as any other source release; i.e., it needs
> > >>>>>>> a LICENSE and NOTICE file, signatures and a PMC vote. My suggestion
> > >>>>>>> would be to make this part of our normal release process. There will
> > >>>>>>> be _one_ source release on dist.apache.org encompassing everything,
> > >>>>>>> and a separate Python-focused source release that we push to PyPI.
> > >>>>>>> The LICENSE and NOTICE contained in the Python source release must
> > >>>>>>> also be present in the source release of Flink; so basically the
> > >>>>>>> Python source release is just the contents of the flink-python module
> > >>>>>>> and the maven pom.xml, with no other special sauce added during the
> > >>>>>>> release process.
> > >>>>>>> On 02/07/2019 05:42, jincheng sun wrote:
> > >>>>>>>> Hi all,
> > >>>>>>>>
> > >>>>>>>> With the effort of FLIP-38 [1], the Python Table API (without UDF
> > >>>>>>>> support for now) will be supported in the coming release-1.9.
> > >>>>>>>> As described in "Build PyFlink" [2], if users want to use the Python
> > >>>>>>>> Table API, they can manually install it using the command:
> > >>>>>>>> "cd flink-python && python3 setup.py sdist && pip install dist/*.tar.gz".
> > >>>>>>>> This is non-trivial for users and it would be better if we can follow
> > >>>>>>>> the Python way and publish PyFlink to PyPI, which is a repository of
> > >>>>>>>> software for the Python programming language. Then users can use the
> > >>>>>>>> standard Python package manager "pip" to install PyFlink:
> > >>>>>>>> "pip install pyflink". So, there are some topics that need to be
> > >>>>>>>> discussed, as follows:
> > >>>>>>>>
> > >>>>>>>> 1. How to publish PyFlink to PyPI
> > >>>>>>>>
> > >>>>>>>> 1.1 Project Name
> > >>>>>>>>       We need to decide which project name to use on PyPI, for
> > >>>>>>>> example, apache-flink, pyflink, etc.
> > >>>>>>>>      Regarding the name "pyflink", it has already been registered by
> > >>>>>>>> @ueqt and there is already a package '1.0' released under this
> > >>>>>>>> project, which is based on flink-libraries/flink-python.
> > >>>>>>>>
> > >>>>>>>>     @ueqt has kindly agreed to give this project back to the
> > >>>>>>>> community. And he has requested that the released package '1.0'
> > >>>>>>>> should not be removed, as it is already used in their company.
> > >>>>>>>>
> > >>>>>>>>      So we need to decide whether to use the name 'pyflink'. If yes,
> > >>>>>>>> we need to figure out how to handle the package '1.0' under this
> > >>>>>>>> project.
> > >>>>>>>>      From my point of view, "pyflink" is the better name for our
> > >>>>>>>> project and we can keep the 1.0 release, as more people may want to
> > >>>>>>>> use it.
> > >>>>>>>> 1.2 PyPI account for release
> > >>>>>>>>      We also need to decide which account to use to publish packages
> > >>>>>>>> to PyPI.
> > >>>>>>>>      There are two permission levels in PyPI: owner and maintainer:
> > >>>>>>>>
> > >>>>>>>>      1) The owner can upload releases, and delete files, releases or
> > >>>>>>>> the entire project.
> > >>>>>>>>      2) The maintainer can also upload releases; however, they cannot
> > >>>>>>>> delete files, releases, or the project.
> > >>>>>>>>
> > >>>>>>>>      So there are two options in my mind:
> > >>>>>>>>
> > >>>>>>>>      1) Create an account such as 'pyflink' as the owner, share it
> > >>>>>>>> with all the release managers, and then release managers can publish
> > >>>>>>>> the package to PyPI using this account.
> > >>>>>>>>      2) Create an account such as 'pyflink' as the owner (only the
> > >>>>>>>> PMC can manage it) and add the release managers' accounts as
> > >>>>>>>> maintainers of the project. Release managers publish the package to
> > >>>>>>>> PyPI using their own accounts.
> > >>>>>>>>      As far as I know, PySpark takes option 1) and Apache Beam takes
> > >>>>>>>> option 2).
> > >>>>>>>>      From my point of view, I prefer option 2) as it is safer: it
> > >>>>>>>> eliminates the risk of accidentally deleting old releases and at the
> > >>>>>>>> same time keeps a trace of who is operating.
> > >>>>>>>>
> > >>>>>>>> 2. How to handle Scala_2.11 and Scala_2.12
> > >>>>>>>>
> > >>>>>>>> The PyFlink package bundles the jars inside the package. As we know,
> > >>>>>>>> there are two versions of the jars for each module: one for Scala
> > >>>>>>>> 2.11 and the other for Scala 2.12. So theoretically there will be two
> > >>>>>>>> PyFlink packages. We need to decide whether to publish one of them to
> > >>>>>>>> PyPI, or both. If both packages are published to PyPI, we may need
> > >>>>>>>> two separate projects, such as pyflink_211 and pyflink_212, and maybe
> > >>>>>>>> more in the future, such as pyflink_213.
> > >>>>>>>>      (BTW, I think we should bring up a discussion about dropping
> > >>>>>>>> Scala 2.11 in the Flink 1.10 release, since Scala 2.13 became
> > >>>>>>>> available in early June.)
> > >>>>>>>>
> > >>>>>>>>      From my point of view, for now we can release only the Scala
> > >>>>>>>> 2.11 version, since Scala 2.11 is our default version in Flink.
> > >>>>>>>>
> > >>>>>>>> 3. Legal problems of publishing to PyPI
> > >>>>>>>>
> > >>>>>>>> As @Chesnay Schepler <ches...@apache.org> pointed out in
> > >>>>>>>> FLINK-13011 [3], publishing PyFlink to PyPI means that we will
> > >>>>>>>> publish binaries to a distribution channel not owned by Apache. We
> > >>>>>>>> need to figure out whether there are legal problems. From my point of
> > >>>>>>>> view, there are none, as a few Apache projects such as Spark, Beam,
> > >>>>>>>> etc. have already done it. Frankly speaking, I am not familiar with
> > >>>>>>>> this topic, so any feedback is welcome from somebody who is more
> > >>>>>>>> familiar with it.
> > >>>>>>>>
> > >>>>>>>> Great thanks to @ueqt for being willing to dedicate the PyPI project
> > >>>>>>>> name `pyflink` to the Apache Flink community!!!
> > >>>>>>>> Great thanks to @Dian for the offline effort!!!
> > >>>>>>>>
> > >>>>>>>> Best,
> > >>>>>>>> Jincheng
> > >>>>>>>>
> > >>>>>>>> [1]
> > >>>>>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-38%3A+Python+Table+API
> > >>>>>>>> [2]
> > >>>>>>>> https://ci.apache.org/projects/flink/flink-docs-master/flinkDev/building.html#build-pyflink
> > >>>>>>>> [3] https://issues.apache.org/jira/browse/FLINK-13011
> > >>>>>>>>
> > >>>>>
> > >>
> > >> --
> > >> Best Regards
> > >>
> > >> Jeff Zhang
> > >
> >
> >
>
