Agreed. Kind of a “best we can do” considering the current nature of python.
On Mar 24, 2020, 2:45 PM -0700, Driesprong, Fokko <fo...@driesprong.frl>, wrote:
> Yes, I'd be in favor of not having two packages, and just pinning the
> versions then. In this case, all the versions will be pinned, so if a user
> wants to install a newer version of elastic, they have to do it explicitly.
> For Java, you have nice packages that will check if you break any public
> API, but for Python this is impossible :'(
>
> Cheers, Fokko
>
>
> Op di 24 mrt. 2020 om 11:11 schreef Jarek Potiuk <jarek.pot...@polidea.com>:
>
> > And yet another update: - after seeing how it works I will remove
> > requirement generation from pre-commit - now that it needs to be
> > generated separately for different versions of python it's a bit too
> > much overhead (you'd need to have more images downloaded for different
> > python versions for pre-commit). Instaad I will add breeze commands to
> > re-ggenerate the requirements (and bash scripts if you do not use
> > breeze), and anyone changing setup.py will have to do it (otherwise CI
> > builds will fail). I think this workflow will be great to keep our
> > requirements up-to-date and have a stable installation method.
> >
> > J.
> >
> >
> > On Mon, Mar 23, 2020 at 5:54 PM Jarek Potiuk <jarek.pot...@polidea.com>
> > wrote:
> > >
> > > Update - It seems that we won't need the -pinned version eventually. I
> > realized that we need to have slightly different requirements for different
> > python versions.
> > >
> > > I just added PR for that: https://github.com/apache/airflow/pull/7841
> > >
> > > I also found out (during production image exercise) that we can install
> > airflow predictably in a very simple way (once we release the requirements
> > in 1.10.10):
> > >
> > > pip install apache-airflow[gcp]==1.10.10 --constraint
> > https://raw.githubusercontent.com/apache/airflow/1.10.10/requirements/requirements-python3.7.txt
> > >
> > > I think this is simple enough to be used as installation method. I added
> > it to the documentation and I think I am ok with dropping -pinned package
> > altogether.
> > >
> > > J.
> > >
> > >
> > > On Sun, Mar 22, 2020 at 10:15 AM Jarek Potiuk <jarek.pot...@polidea.com>
> > wrote:
> > > >
> > > > Yesterday we had another master breakage - this time from elasticsearch
> > releasing MINOR version 7.6 breaking our builds (not it was MINOR version
> > so should be compatible .... it was not for us). I fixed it quickly
> > yesterday by limiting it to < 7.6 but for me - this is quite clear that
> > trying to rely on SemVer being followed by others is a futile effort (at
> > least in python's world).
> > > >
> > > > The theory is nice, but it breaks in practice. And it's not really a
> > fault of the library maintainers. It's simply sometimes not so easy to see
> > how your APIs are used - and in Python, you cannot prevent using stuff that
> > you think is an internal detail. This is what happened in elasticsearch
> > case yesterday - apparently, our plugin was using an "internal" API
> > unknowingly and some parameters from that API were dropped during
> > refactoring of elasticsearch library.
> > > >
> > > > My observation (it's anecdotal though) is that the COVID-19 situation
> > made people have more time, fewer distractions, fewer things to do, and we
> > have higher frequency of OSS packages being released recently so we should
> > protect a bit from more often breakages.
> > > >
> > > > I think learning from yesterday is:
> > > >
> > > > * we should merge the requirements.txt solution quickly to prevent
> > further breakages (I am reading and testing it now) - I think everyone
> > agrees it's good to have it
> > > > * I think we can continue discussing whether apache-airflow-pinned
> > package should be released or not. I can leave the code building the
> > package but we can decide about it after some more discussion
> > > >
> > > > Does it sound good?
> > > >
> > > > J
> > > >
> > > >
> > > >
> > > > On Fri, Mar 20, 2020 at 2:47 PM Jarek Potiuk <jarek.pot...@polidea.com>
> > wrote:
> > > > >
> > > > > And rebased it right now and fixed automated requirements update.
> > > > >
> > > > > On Fri, Mar 20, 2020 at 2:28 PM Jarek Potiuk 
> > > > > <jarek.pot...@polidea.com>
> > wrote:
> > > > > >
> > > > > > Ah BTW. I just noticed that for some reason I pasted an old PR
> > earlier in the thread :(.
> > > > > > This is the one with requirements.txt I am talking about:
> > https://github.com/apache/airflow/pull/7730
> > > > > >
> > > > > > On Fri, Mar 20, 2020 at 2:26 PM Jarek Potiuk <
> > jarek.pot...@polidea.com> wrote:
> > > > > > >
> > > > > > > Nope. Not blocking. I can work with my branch just 
> > > > > > > requirements.txt
> > is enough for that :)
> > > > > > >
> > > > > > > I think the problem with semver is that it is loosely followed - 
> > > > > > > we
> > had a number of breakages in the past with minor version upgrades :(.
> > > > > > >
> > > > > > > J.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Fri, Mar 20, 2020 at 1:27 PM Kaxil Naik <kaxiln...@gmail.com>
> > wrote:
> > > > > > > >
> > > > > > > > Thanks for the detailed explanation Jarek.
> > > > > > > >
> > > > > > > > How about we have an upper limit for all our dependencies, 
> > > > > > > > example
> > instead
> > > > > > > > of "google-cloud-storage>=1.16", we have
> > "google-cloud-storage>=1.16,<2.0" ?
> > > > > > > >
> > > > > > > > If a dependency breaks compatibility in minor versions, we 
> > > > > > > > can't do
> > > > > > > > anything about it but if they follow SemVer, we should be safe 
> > > > > > > > and
> > the
> > > > > > > > first-time installers would have a non-breaking package. WDYT?
> > > > > > > >
> > > > > > > > Btw I hope this is not blocking you in building a production 
> > > > > > > > image
> > as I
> > > > > > > > think requirements.txt is solving that? Please let me know if 
> > > > > > > > it is
> > > > > > > > blocking.
> > > > > > > >
> > > > > > > > PS: I am also just dumping my ideas to solve this issue. Love to
> > hear what
> > > > > > > > others think too.
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > > Kaxil
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Thu, Mar 19, 2020 at 2:43 PM Jarek Potiuk <
> > jarek.pot...@polidea.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > I think we have similar understanding. But let me just clarify
> > because I
> > > > > > > > > think we think about we think about solving two different 
> > > > > > > > > problems
> > > > > > > > > My proposal is not solving all problems with dependencies - 
> > > > > > > > > quite
> > the
> > > > > > > > > contrary, I want to solve just one specific "repeatability"
> > problem - read
> > > > > > > > > on :)..
> > > > > > > > >
> > > > > > > > > 1. A potential source of confusion: using "-pinned" for
> > installation but
> > > > > > > > > > using "non-pinned" for DAG development.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > This could be confusing indeed - but they are the same in 
> > > > > > > > > fact -
> > > > > > > > > just deps might be different over time.
> > > > > > > > >
> > > > > > > > > 2. Most of the users would still try to install
> > "apache-airflow" package
> > > > > > > > > > that might have been broken for example because of a
> > dependency
> > > > > > > > > release,
> > > > > > > > > > either way, we would still have to suggest them to use
> > "pinned"
> > > > > > > > > version
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > True. I thought we might describe it in the README and make it
> > prominently
> > > > > > > > > explained. Usually people look at the readme in PyPI when 
> > > > > > > > > they are
> > > > > > > > > installing
> > > > > > > > > stuff and it does not work:
> > https://pypi.org/project/apache-airflow/.
> > > > > > > > >
> > > > > > > > > Also - we could of course explain how to use requirements.txt
> > from the
> > > > > > > > > released
> > > > > > > > > version when they are installing it. That would be an extra
> > friction point
> > > > > > > > > though
> > > > > > > > > and maybe having "always installable" version of airflow is a
> > better
> > > > > > > > > choice.
> > > > > > > > >
> > > > > > > > > 3. If they install "pinned" version, it is no longer a library
> > again,
> > > > > > > > > > that is users won't be able to use new NumPy release or
> > matplotlib for
> > > > > > > > > > example. In which case we are just circling back to the same
> > problem,
> > > > > > > > > > "either we risk broken package" while releasing or we risk
> > potentially
> > > > > > > > > > incompatible versions.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Yep. But maybe it's just a question of naming. Maybe even we
> > could name
> > > > > > > > > this package differently to indicate that this version is a 
> > > > > > > > > way
> > to quickly
> > > > > > > > > install
> > > > > > > > > airflow but not to do any serious development with it.
> > > > > > > > >
> > > > > > > > > So speaking about THE problem I want to solve with the
> > > > > > > > > requirements.txt and apache-airflow-pinned package:
> > > > > > > > >
> > > > > > > > > I really only want to solve "first-time-user" experience here 
> > > > > > > > > -
> > nothing
> > > > > > > > > more. I
> > > > > > > > > definitely do not want to replace the current installation 
> > > > > > > > > method
> > for
> > > > > > > > > experienced
> > > > > > > > > users - for them using --constraint requirements.txt is 
> > > > > > > > > exactly
> > what they
> > > > > > > > > need.
> > > > > > > > > The only problem I am trying to solve with that is
> > "repeatability" of
> > > > > > > > > installation.
> > > > > > > > >
> > > > > > > > > Maybe "apache-airflow-quickinstall" or something like that 
> > > > > > > > > would
> > be better
> > > > > > > > > than "apache-airflow-pinned" or
> > "apache-airflow-repeatable-install" or
> > > > > > > > > something like that. I think about it as a "flavour" of 
> > > > > > > > > ariflow
> > rather than
> > > > > > > > > anything else. I even originally implemented it as [pinned] 
> > > > > > > > > extra
> > where I
> > > > > > > > > pinned all requirements. Unfortunately I found that if you 
> > > > > > > > > have
> > > > > > > > > main requirement without limits, adding the same requirement 
> > > > > > > > > as
> > extra with
> > > > > > > > > == does not make it pinned :(. That was my original plan.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > Btw I have been on "we should have pinned dependency" camp 
> > > > > > > > > > as
> > Airflow
> > > > > > > > > > should definitely install without breaking since day-1 but I
> > think a
> > > > > > > > > > separate "-pinned" package won't solve that issue.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Ah yeah we went the same route. I do not think we can solve 
> > > > > > > > > the
> > > > > > > > > "library vs. app" problem easily. This is a bit of
> > "eat-and-have-cake"
> > > > > > > > > at the same time. I know people have problems
> > > > > > > > > with conflicting dependencies when they are trying to install
> > libraries
> > > > > > > > > with different requirements. And I am not even trying to solve
> > that
> > > > > > > > > problem now. Not even close. This requires some other solution
> > > > > > > > > (for example separate virtualenvs with different dependencies
> > > > > > > > > build from wheels on per-task basis). But that's something 
> > > > > > > > > much
> > further
> > > > > > > > > in the future (if at all).
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > WDYT? Also please do let me know if I have misunderstood
> > something
> > > > > > > > > > (definitely possible :D).
> > > > > > > > > >
> > > > > > > > > > Regards,
> > > > > > > > > > Kaxil
> > > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > >
> > > > > > > Jarek Potiuk
> > > > > > > Polidea | Principal Software Engineer
> > > > > > >
> > > > > > > M: +48 660 796 129
> > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > >
> > > > > > Jarek Potiuk
> > > > > > Polidea | Principal Software Engineer
> > > > > >
> > > > > > M: +48 660 796 129
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > >
> > > > > Jarek Potiuk
> > > > > Polidea | Principal Software Engineer
> > > > >
> > > > > M: +48 660 796 129
> > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Jarek Potiuk
> > > > Polidea | Principal Software Engineer
> > > >
> > > > M: +48 660 796 129
> > > >
> > > >
> > >
> > >
> > > --
> > >
> > > Jarek Potiuk
> > > Polidea | Principal Software Engineer
> > >
> > > M: +48 660 796 129
> > >
> > >
> >
> >
> > --
> >
> > Jarek Potiuk
> > Polidea | Principal Software Engineer
> >
> > M: +48 660 796 129
> >

Reply via email to