OK. I think that one does not need voting then. I will proceed with my PR
for that :).

J.

On Wed, Mar 25, 2020 at 3:38 PM Daniel Imberman <daniel.imber...@gmail.com>
wrote:

> Agreed. Kind of a “best we can do” considering the current nature of
> python.
> On Mar 24, 2020, 2:45 PM -0700, Driesprong, Fokko <fo...@driesprong.frl>,
> wrote:
> > Yes, I'd be in favor of not having two packages, and just pinning the
> > versions then. In this case, all the versions will be pinned, so if a
> user
> > wants to install a newer version of elastic, they have to do it
> explicitly.
> > For Java, you have nice packages that will check if you break any public
> > API, but for Python this is impossible :'(
> >
> > Cheers, Fokko
> >
> >
> > On Tue, Mar 24, 2020 at 11:11 AM Jarek Potiuk <jarek.pot...@polidea.com>
> > wrote:
> >
> > > And yet another update: after seeing how it works, I will remove
> > > requirement generation from pre-commit. Now that it needs to be
> > > generated separately for different versions of python, it's a bit too
> > > much overhead (you'd need to have more images downloaded for different
> > > python versions for pre-commit). Instead I will add breeze commands to
> > > regenerate the requirements (and bash scripts if you do not use
> > > breeze), and anyone changing setup.py will have to do it (otherwise CI
> > > builds will fail). I think this workflow will be great to keep our
> > > requirements up-to-date and have a stable installation method.
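
A minimal sketch of what regenerating a per-Python-version requirements file could look like. It assumes the file is produced with `pip freeze` from the current environment and reuses the `requirements/requirements-python3.7.txt` naming seen later in this thread; the function names are hypothetical and the actual breeze command may work differently.

```python
import subprocess
import sys
from pathlib import Path


def requirements_path(root, version_info=sys.version_info[:2]):
    """Return the requirements file path for a given (major, minor) Python version."""
    major, minor = version_info[0], version_info[1]
    return Path(root) / "requirements" / f"requirements-python{major}.{minor}.txt"


def current_freeze():
    """Return the pinned package list of the running interpreter's environment."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "freeze"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def regenerate(root="."):
    """Write the frozen requirements for the running Python version."""
    target = requirements_path(root)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(current_freeze())
    return target


def is_up_to_date(root="."):
    """Hypothetical CI check: does the committed file match the environment?"""
    target = requirements_path(root)
    return target.exists() and target.read_text() == current_freeze()
```

Running `regenerate()` once under each supported interpreter would produce one file per Python version, which is why doing this inside pre-commit would need an image per version.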
> > >
> > > J.
> > >
> > >
> > > On Mon, Mar 23, 2020 at 5:54 PM Jarek Potiuk <jarek.pot...@polidea.com
> >
> > > wrote:
> > > >
> > > > Update - It seems that we won't need the -pinned version after all. I
> > > > realized that we need to have slightly different requirements for
> > > > different python versions.
> > > >
> > > > I just added PR for that:
> https://github.com/apache/airflow/pull/7841
> > > >
> > > > I also found out (during production image exercise) that we can
> install
> > > airflow predictably in a very simple way (once we release the
> requirements
> > > in 1.10.10):
> > > >
> > > > pip install apache-airflow[gcp]==1.10.10 --constraint https://raw.githubusercontent.com/apache/airflow/1.10.10/requirements/requirements-python3.7.txt
> > > >
> > > > I think this is simple enough to be used as installation method. I
> added
> > > it to the documentation and I think I am ok with dropping -pinned
> package
> > > altogether.
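
A small sketch of how the install command above could be assembled for other releases and Python versions. The URL layout is copied from the command quoted above; the helper names are hypothetical, and whether a given tag actually ships a requirements file is an assumption to verify per release.

```python
def constraints_url(airflow_version, python_version):
    """Build the raw GitHub URL of the requirements file for one release."""
    return (
        "https://raw.githubusercontent.com/apache/airflow/"
        f"{airflow_version}/requirements/requirements-python{python_version}.txt"
    )


def pip_install_command(airflow_version, python_version, extras=()):
    """Return the pip argument list for a constrained, repeatable install."""
    extras_part = f"[{','.join(extras)}]" if extras else ""
    return [
        "pip",
        "install",
        f"apache-airflow{extras_part}=={airflow_version}",
        "--constraint",
        constraints_url(airflow_version, python_version),
    ]


if __name__ == "__main__":
    # Prints the same command as quoted in the message above.
    print(" ".join(pip_install_command("1.10.10", "3.7", extras=["gcp"])))
```

Because `--constraint` only restricts versions and does not add packages, users can still install a newer version of a dependency explicitly afterwards.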
> > > >
> > > > J.
> > > >
> > > >
> > > > On Sun, Mar 22, 2020 at 10:15 AM Jarek Potiuk <
> jarek.pot...@polidea.com>
> > > wrote:
> > > > >
> > > > > > Yesterday we had another master breakage - this time from
> > > > > > elasticsearch releasing MINOR version 7.6 and breaking our builds
> > > > > > (note that it was a MINOR version, so it should be compatible ... it
> > > > > > was not for us). I fixed it quickly yesterday by limiting it to < 7.6,
> > > > > > but for me it is quite clear that trying to rely on SemVer being
> > > > > > followed by others is a futile effort (at least in python's world).
> > > > >
> > > > > The theory is nice, but it breaks in practice. And it's not really
> a
> > > fault of the library maintainers. It's simply sometimes not so easy to
> see
> > > how your APIs are used - and in Python, you cannot prevent using stuff
> that
> > > you think is an internal detail. This is what happened in elasticsearch
> > > case yesterday - apparently, our plugin was using an "internal" API
> > > unknowingly and some parameters from that API were dropped during
> > > refactoring of elasticsearch library.
> > > > >
> > > > > > My observation (it's anecdotal though) is that the COVID-19
> > > > > > situation made people have more time, fewer distractions, fewer things
> > > > > > to do, and we have a higher frequency of OSS packages being released
> > > > > > recently, so we should protect ourselves a bit from more frequent
> > > > > > breakages.
> > > > >
> > > > > > I think the lessons from yesterday are:
> > > > >
> > > > > * we should merge the requirements.txt solution quickly to prevent
> > > further breakages (I am reading and testing it now) - I think everyone
> > > agrees it's good to have it
> > > > > * I think we can continue discussing whether apache-airflow-pinned
> > > package should be released or not. I can leave the code building the
> > > package but we can decide about it after some more discussion
> > > > >
> > > > > Does it sound good?
> > > > >
> > > > > J
> > > > >
> > > > >
> > > > >
> > > > > On Fri, Mar 20, 2020 at 2:47 PM Jarek Potiuk <
> jarek.pot...@polidea.com>
> > > wrote:
> > > > > >
> > > > > > And rebased it right now and fixed automated requirements update.
> > > > > >
> > > > > > On Fri, Mar 20, 2020 at 2:28 PM Jarek Potiuk <
> jarek.pot...@polidea.com>
> > > wrote:
> > > > > > >
> > > > > > > Ah BTW. I just noticed that for some reason I pasted an old PR
> > > earlier in the thread :(.
> > > > > > > This is the one with requirements.txt I am talking about:
> > > https://github.com/apache/airflow/pull/7730
> > > > > > >
> > > > > > > On Fri, Mar 20, 2020 at 2:26 PM Jarek Potiuk <
> > > jarek.pot...@polidea.com> wrote:
> > > > > > > >
> > > > > > > > > Nope. Not blocking. I can work with my branch; just
> > > > > > > > > requirements.txt is enough for that :)
> > > > > > > >
> > > > > > > > I think the problem with semver is that it is loosely
> followed - we
> > > had a number of breakages in the past with minor version upgrades :(.
> > > > > > > >
> > > > > > > > J.
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Fri, Mar 20, 2020 at 1:27 PM Kaxil Naik <
> kaxiln...@gmail.com>
> > > wrote:
> > > > > > > > >
> > > > > > > > > Thanks for the detailed explanation Jarek.
> > > > > > > > >
> > > > > > > > > > How about we have an upper limit for all our dependencies? For
> > > > > > > > > > example, instead of "google-cloud-storage>=1.16", we have
> > > > > > > > > > "google-cloud-storage>=1.16,<2.0"?
> > > > > > > > >
> > > > > > > > > If a dependency breaks compatibility in minor versions, we
> can't do
> > > > > > > > > anything about it but if they follow SemVer, we should be
> safe and
> > > the
> > > > > > > > > first-time installers would have a non-breaking package.
> WDYT?
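
To make the upper-limit idea concrete, here is a simplified sketch of how a specifier such as ">=1.16,<2.0" filters candidate versions. It only handles plain dotted numeric versions and the five comparison operators shown, so it is an illustration rather than a full implementation of Python's version specifier rules; the function names are hypothetical.

```python
def parse_version(text):
    """Turn '1.16' or '7.6.0' into a tuple of ints, padded to three parts."""
    parts = [int(piece) for piece in text.strip().split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


# Comparison functions keyed by operator; two-character operators come first
# in the lookup order below so that '>=' is not mistaken for '>'.
COMPARISONS = {
    ">=": lambda candidate, bound: candidate >= bound,
    "<=": lambda candidate, bound: candidate <= bound,
    "==": lambda candidate, bound: candidate == bound,
    ">": lambda candidate, bound: candidate > bound,
    "<": lambda candidate, bound: candidate < bound,
}


def satisfies(version, specifier):
    """Return True if `version` meets every comma-separated clause."""
    candidate = parse_version(version)
    for clause in specifier.split(","):
        clause = clause.strip()
        for symbol in (">=", "<=", "==", ">", "<"):
            if clause.startswith(symbol):
                bound = parse_version(clause[len(symbol):])
                if not COMPARISONS[symbol](candidate, bound):
                    return False
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


if __name__ == "__main__":
    print(satisfies("1.20.0", ">=1.16,<2.0"))  # inside the allowed range
    print(satisfies("2.0", ">=1.16,<2.0"))     # excluded by the upper limit
    print(satisfies("7.6.0", "<7.6"))          # the elasticsearch limit
```

An upper limit of "<2.0" would still have admitted a breaking minor release such as the elasticsearch 7.6 case discussed in this thread, which is the gap that fully pinned requirements are meant to close.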
> > > > > > > > >
> > > > > > > > > Btw I hope this is not blocking you in building a
> production image
> > > as I
> > > > > > > > > think requirements.txt is solving that? Please let me know
> if it is
> > > > > > > > > blocking.
> > > > > > > > >
> > > > > > > > > PS: I am also just dumping my ideas to solve this issue.
> Love to
> > > hear what
> > > > > > > > > others think too.
> > > > > > > > >
> > > > > > > > > Regards,
> > > > > > > > > Kaxil
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Thu, Mar 19, 2020 at 2:43 PM Jarek Potiuk <
> > > jarek.pot...@polidea.com>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > > I think we have a similar understanding. But let me just clarify,
> > > > > > > > > > > because I think we are thinking about solving two different problems.
> > > > > > > > > > > My proposal is not solving all problems with dependencies - quite the
> > > > > > > > > > > contrary, I want to solve just one specific "repeatability" problem -
> > > > > > > > > > > read on :)..
> > > > > > > > > >
> > > > > > > > > > 1. A potential source of confusion: using "-pinned" for
> > > installation but
> > > > > > > > > > > using "non-pinned" for DAG development.
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > This could be confusing indeed - but they are the same
> in fact -
> > > > > > > > > > just deps might be different over time.
> > > > > > > > > >
> > > > > > > > > > 2. Most of the users would still try to install
> > > "apache-airflow" package
> > > > > > > > > > > that might have been broken for example because of a
> > > dependency
> > > > > > > > > > release,
> > > > > > > > > > > either way, we would still have to suggest them to use
> > > "pinned"
> > > > > > > > > > version
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > True. I thought we might describe it in the README and
> make it
> > > prominently
> > > > > > > > > > explained. Usually people look at the readme in PyPI
> when they are
> > > > > > > > > > installing
> > > > > > > > > > stuff and it does not work:
> > > https://pypi.org/project/apache-airflow/.
> > > > > > > > > >
> > > > > > > > > > Also - we could of course explain how to use
> requirements.txt
> > > from the
> > > > > > > > > > released
> > > > > > > > > > version when they are installing it. That would be an
> extra
> > > friction point
> > > > > > > > > > though
> > > > > > > > > > and maybe having "always installable" version of airflow
> is a
> > > better
> > > > > > > > > > choice.
> > > > > > > > > >
> > > > > > > > > > 3. If they install "pinned" version, it is no longer a
> library
> > > again,
> > > > > > > > > > > that is users won't be able to use new NumPy release or
> > > matplotlib for
> > > > > > > > > > > example. In which case we are just circling back to
> the same
> > > problem,
> > > > > > > > > > > "either we risk broken package" while releasing or we
> risk
> > > potentially
> > > > > > > > > > > incompatible versions.
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Yep. But maybe it's just a question of naming. Maybe
> even we
> > > could name
> > > > > > > > > > this package differently to indicate that this version
> is a way
> > > to quickly
> > > > > > > > > > install
> > > > > > > > > > airflow but not to do any serious development with it.
> > > > > > > > > >
> > > > > > > > > > So speaking about THE problem I want to solve with the
> > > > > > > > > > requirements.txt and apache-airflow-pinned package:
> > > > > > > > > >
> > > > > > > > > > I really only want to solve "first-time-user" experience
> here -
> > > nothing
> > > > > > > > > > more. I
> > > > > > > > > > definitely do not want to replace the current
> installation method
> > > for
> > > > > > > > > > experienced
> > > > > > > > > > users - for them using --constraint requirements.txt is
> exactly
> > > what they
> > > > > > > > > > need.
> > > > > > > > > > The only problem I am trying to solve with that is
> > > "repeatability" of
> > > > > > > > > > installation.
> > > > > > > > > >
> > > > > > > > > > Maybe "apache-airflow-quickinstall" or something like
> that would
> > > be better
> > > > > > > > > > than "apache-airflow-pinned" or
> > > "apache-airflow-repeatable-install" or
> > > > > > > > > > > something like that. I think about it as a "flavour" of airflow
> > > > > > > > > > > rather than
> > > > > > > > > > anything else. I even originally implemented it as
> [pinned] extra
> > > where I
> > > > > > > > > > pinned all requirements. Unfortunately I found that if
> you have
> > > > > > > > > > main requirement without limits, adding the same
> requirement as
> > > extra with
> > > > > > > > > > == does not make it pinned :(. That was my original plan.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > Btw I have been on "we should have pinned dependency"
> camp as
> > > Airflow
> > > > > > > > > > > should definitely install without breaking since day-1
> but I
> > > think a
> > > > > > > > > > > separate "-pinned" package won't solve that issue.
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Ah yeah we went the same route. I do not think we can
> solve the
> > > > > > > > > > "library vs. app" problem easily. This is a bit of
> > > "eat-and-have-cake"
> > > > > > > > > > at the same time. I know people have problems
> > > > > > > > > > with conflicting dependencies when they are trying to
> install
> > > libraries
> > > > > > > > > > with different requirements. And I am not even trying to
> solve
> > > that
> > > > > > > > > > problem now. Not even close. This requires some other
> solution
> > > > > > > > > > > (for example separate virtualenvs with different dependencies
> > > > > > > > > > > built from wheels on a per-task basis). But that's something much
> > > > > > > > > > > further in the future (if at all).
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > WDYT? Also please do let me know if I have
> misunderstood
> > > something
> > > > > > > > > > > (definitely possible :D).
> > > > > > > > > > >
> > > > > > > > > > > Regards,
> > > > > > > > > > > Kaxil
> > > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > >
> > > > > > > > Jarek Potiuk
> > > > > > > > Polidea | Principal Software Engineer
> > > > > > > >
> > > > > > > > M: +48 660 796 129
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > >
> > > > > > > Jarek Potiuk
> > > > > > > Polidea | Principal Software Engineer
> > > > > > >
> > > > > > > M: +48 660 796 129
> > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > >
> > > > > > Jarek Potiuk
> > > > > > Polidea | Principal Software Engineer
> > > > > >
> > > > > > M: +48 660 796 129
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > >
> > > > > Jarek Potiuk
> > > > > Polidea | Principal Software Engineer
> > > > >
> > > > > M: +48 660 796 129
> > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Jarek Potiuk
> > > > Polidea | Principal Software Engineer
> > > >
> > > > M: +48 660 796 129
> > > >
> > > >
> > >
> > >
> > > --
> > >
> > > Jarek Potiuk
> > > Polidea | Principal Software Engineer
> > >
> > > M: +48 660 796 129
> > >
>


-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
