Thanks for the detailed explanation, Jarek. How about we have an upper
limit for all our dependencies? For example, instead of
"google-cloud-storage>=1.16", we would have
"google-cloud-storage>=1.16,<2.0".
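
As a rough illustration of what such an upper bound buys us, here is a minimal Python sketch of the check that a range like ">=1.16,<2.0" expresses. The helper names (parse, in_range) are hypothetical, and the parsing is deliberately simplified to plain numeric versions. It is not how pip evaluates specifiers; pip follows PEP 440, which also covers pre-releases and other cases this sketch ignores.

```python
def parse(version, width=3):
    """Turn '1.16' into (1, 16, 0). Simplified: numeric releases only."""
    parts = [int(piece) for piece in version.split(".")]
    # Pad with zeros so that '1.16' and '1.16.0' compare as equal.
    parts += [0] * (width - len(parts))
    return tuple(parts)


def in_range(version, lower, upper):
    """True if lower <= version < upper, i.e. '>=lower,<upper'."""
    return parse(lower) <= parse(version) < parse(upper)


# Which candidate releases would "google-cloud-storage>=1.16,<2.0" admit?
for candidate in ["1.15.9", "1.16", "1.26.0", "2.0", "2.1.0"]:
    verdict = "allowed" if in_range(candidate, "1.16", "2.0") else "rejected"
    print(candidate, verdict)
```

The point is only that a new 1.x release keeps being picked up automatically, while a 2.x release is excluded until we raise the bound ourselves.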
If a dependency breaks compatibility in minor versions, we can't do
anything about it, but if they follow SemVer, we should be safe and
first-time installers would get a non-breaking package. WDYT?

Btw I hope this is not blocking you in building a production image, as I
think requirements.txt is solving that? Please let me know if it is
blocking.

PS: I am also just dumping my ideas to solve this issue. Love to hear what
others think too.

Regards,
Kaxil

On Thu, Mar 19, 2020 at 2:43 PM Jarek Potiuk <jarek.pot...@polidea.com>
wrote:

> I think we have a similar understanding. But let me just clarify, because
> I think we are thinking about solving two different problems.
> My proposal is not solving all problems with dependencies - quite the
> contrary, I want to solve just one specific "repeatability" problem - read
> on :)..
>
> > 1. A potential source of confusion: using "-pinned" for installation but
> > using "non-pinned" for DAG development.
>
> This could be confusing indeed - but they are the same in fact -
> just deps might be different over time.
>
> > 2. Most of the users would still try to install the "apache-airflow"
> > package, which might have been broken, for example because of a
> > dependency release. Either way, we would still have to suggest that
> > they use the "pinned" version.
>
> True. I thought we might describe it in the README and make it prominently
> explained. Usually people look at the readme in PyPI when they are
> installing stuff and it does not work:
> https://pypi.org/project/apache-airflow/.
>
> Also - we could of course explain how to use requirements.txt from the
> released version when they are installing it. That would be an extra
> friction point though, and maybe having an "always installable" version
> of airflow is a better choice.
>
> > 3. If they install the "pinned" version, it is no longer a library
> > again, that is, users won't be able to use a new NumPy release or
> > matplotlib, for example. In which case we are just circling back to the
> > same problem: either we risk a broken package while releasing or we
> > risk potentially incompatible versions.
>
> Yep. But maybe it's just a question of naming. Maybe we could even name
> this package differently to indicate that this version is a way to quickly
> install airflow but not to do any serious development with it.
>
> So speaking about THE problem I want to solve with the
> requirements.txt and apache-airflow-pinned package:
>
> I really only want to solve the "first-time-user" experience here -
> nothing more. I definitely do not want to replace the current installation
> method for experienced users - for them using --constraint
> requirements.txt is exactly what they need.
> The only problem I am trying to solve with that is "repeatability" of
> installation.
>
> Maybe "apache-airflow-quickinstall" or something like that would be better
> than "apache-airflow-pinned" or "apache-airflow-repeatable-install".
> I think about it as a "flavour" of airflow rather than anything else.
> I even originally implemented it as a [pinned] extra where I pinned all
> requirements. Unfortunately I found that if you have the main requirement
> without limits, adding the same requirement as an extra with == does not
> make it pinned :(. That was my original plan.
>
> > Btw I have been in the "we should have pinned dependencies" camp, as
> > Airflow should definitely install without breaking since day-1, but I
> > think a separate "-pinned" package won't solve that issue.
>
> Ah yeah, we went the same route. I do not think we can solve the
> "library vs. app" problem easily. This is a bit of "eat-and-have-cake"
> at the same time. I know people have problems with conflicting
> dependencies when they are trying to install libraries with different
> requirements. And I am not even trying to solve that problem now. Not even
> close. This requires some other solution (for example separate
> virtualenvs with different dependencies built from wheels on a per-task
> basis). But that's something much further in the future (if at all).
>
> > WDYT? Also please do let me know if I have misunderstood something
> > (definitely possible :D).
> >
> > Regards,
> > Kaxil