[GitHub] [airflow] potiuk commented on pull request #28442: Re-enable azure service bus on ARM as it now builds cleanly
potiuk commented on PR #28442:
URL: https://github.com/apache/airflow/pull/28442#issuecomment-1358388757

> We could try to do with simple cases. It is not mandatory I'd rather say it nice to have, if we know that package could be install on Linux (glib-based, not musl) but required install from sources then would be nice to have this information which save some time.

A better option would be to add links to the installation instructions for those packages that we know might be problematic. For example, for Plyvel, linking https://plyvel.readthedocs.io/en/latest/installation.html would be fine. NOTE - even the Plyvel developer limited those to "Ubuntu and Debian".

> I thought you have some kind template about how many contributors we already have and would be nice if someone who find become a contributor and improve this part 🤣

Well, I do. But this one is tricky :). Any contribution there will at most explain what is needed for the OS/distribution of that particular user, which might be even more misleading for other distros/MacOS/ARM. It's super-hard to write generic installation instructions even if we limit them to apt/yum (Debian/RedHat), because the same OS packages are often named differently. This is a true rabbit hole we want to avoid. For some reason even the creators of the libraries are sometimes super vague and only limit it to some distros.

> I think this would be nice if it not required a lot of effort from our side.

> BTW, do we have some statistic about downloads particular image from Docker Hub?

Very little. We know the total for apache/airflow:

```
curl -s https://hub.docker.com/v2/repositories/apache/airflow | jq -r ".pull_count"
```

Result: 95575880

We can also see the "per tag" last pull - usually it was between a few seconds and 2 hours ago (sometimes 8-10 hrs) when I last checked. But this is really unscientific. We do not pull the images during CI (we only use ghcr.io), so we know it's not skewed by our CI, though.
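For anyone who wants to track that number over time without `jq`, here is a minimal Python sketch of the same lookup. It assumes only what the curl command above shows: the repository endpoint and the `pull_count` field. The function names are made up for illustration.

```python
import json
from urllib.request import urlopen

# Base of the Docker Hub endpoint used by the curl command above.
HUB_API = "https://hub.docker.com/v2/repositories"


def repo_url(namespace: str, name: str) -> str:
    """Build the repository URL, e.g. for apache/airflow."""
    return f"{HUB_API}/{namespace}/{name}"


def pull_count(payload: dict) -> int:
    """Extract the total pull count from an already decoded repository payload."""
    return int(payload["pull_count"])


def fetch_pull_count(namespace: str, name: str, timeout: float = 10.0) -> int:
    """Download the repository document and return its pull count (needs network)."""
    with urlopen(repo_url(namespace, name), timeout=timeout) as response:
        return pull_count(json.load(response))
```

Calling `fetch_pull_count("apache", "airflow")` should return the same total the curl command prints. Keeping `pull_count` separate from the download makes it possible to check the parsing without network access.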
-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [airflow] potiuk commented on pull request #28442: Re-enable azure service bus on ARM as it now builds cleanly
potiuk commented on PR #28442:
URL: https://github.com/apache/airflow/pull/28442#issuecomment-1356858609

> Oh, I just repeat step which probably most of user do first, just try to install it by `pip` into the regular image.

I see what you mean. But the same error might happen for a number of other packages - not even our providers. There are plenty of packages out there that need to be compiled (unfortunately), and even some of our providers do. And some of them require system packages to be installed. I do not think we can reasonably describe it for each provider separately - especially since many of our users do not use the image at all. They install airflow in their own virtualenv on different distributions, and those distributions need different dependencies.

I am afraid (but maybe I am exaggerating) that if we start describing what else is needed, we might make people think that we should describe all the prerequisites for all the possible distributions, and that this description is "complete" - i.e. explains all the necessary requirements. For example, you won't be able to install the mysql provider without having the mysql client libraries installed - and there are different ways to install those on centos, redhat, debian, mint, etc., not to mention MacOS installation, which brings a whole new host of problems (additionally M1 + Intel).

I think if we start describing things like that in the docs, we are going down the rabbit hole. Should we verify and check all the prerequisites and describe them for all providers? Doing it for one provider will - pretty inevitably - lead to people asking:

> My provider fails to install and I see those other providers have instructions - surely you should have instruction here and I should not need to solve it myself when I am installing it on . - what are the instructions?

I think it's a dangerous path to walk. But I just thought about another idea.
Something that I have thought about before. Maybe - on top of having the optimized apache/airflow image and the "slim" image - we should have a "fat" image (maybe a bit better named) that has build-essentials installed along with all of the libraries the CI image has installed in "dev". Then the options for anyone who wants to install such a package will be: "customize" (you will have an optimized image), "extend with build essentials" (less optimized) and "use fat image" (huge image, but it will install any provider by default).

Then we can even add a generic FAQ on "build error" when adding your package (or maybe we could even try to invent some smart way of capturing the error in the PROD image and displaying the instructions to the user). I **think** it could be done by changing the shell in the image to our custom command.

I think building such a "generic" message on what to do in case of a build error, with three options to follow - one of them as easy as changing `FROM apache/airflow:2.5.0` to `FROM apache/airflow:fat-2.5.0` - should do the job nicely.

WDYT?
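To make the "capture the error and display the instructions" idea a bit more concrete, here is a rough Python sketch of a wrapper command. Everything in it is an assumption for illustration: the marker strings are just examples of what a failed source build tends to print, the function names are invented, and the `fat-2.5.0` tag is only the proposal from this comment, not an existing image.

```python
import subprocess
import sys

# Illustrative substrings only; a real list would need tuning against
# actual pip/compiler output.
BUILD_ERROR_MARKERS = (
    "failed building wheel",
    "could not build wheels",
    "error: command 'gcc' failed",
)

# Generic message with the three options discussed above.
HINT = """
It looks like a package had to be compiled and the build failed.
Options:
  1. Customize the image (optimized size).
  2. Extend the image with build essentials (less optimized).
  3. Use the proposed "fat" image, e.g. FROM apache/airflow:fat-2.5.0 (hypothetical tag).
"""


def looks_like_build_error(output: str) -> bool:
    """Return True if the output contains one of the known build-failure markers."""
    lowered = output.lower()
    return any(marker in lowered for marker in BUILD_ERROR_MARKERS)


def run_with_hint(command: list) -> int:
    """Run the command, replay its output, and append the hint on a build failure."""
    result = subprocess.run(command, capture_output=True, text=True)
    sys.stdout.write(result.stdout)
    sys.stderr.write(result.stderr)
    if result.returncode != 0 and looks_like_build_error(result.stdout + result.stderr):
        sys.stderr.write(HINT)
    return result.returncode
```

A wrapper like this could be invoked as `run_with_hint(["pip", "install", "plyvel"])`. Whether it is better wired in as the image shell or as a pip wrapper is an open question this sketch does not answer.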
[GitHub] [airflow] potiuk commented on pull request #28442: Re-enable azure service bus on ARM as it now builds cleanly
potiuk commented on PR #28442:
URL: https://github.com/apache/airflow/pull/28442#issuecomment-1356844403

> Hmmm.. I've unable to install it into the regular airflow image

We have a number of requirements that require customising the image rather than extending it. We have good instructions on it, including how to customize the image (if you want the optimal image size):

https://airflow.apache.org/docs/docker-stack/build.html#extending-the-image

Or how to add a dependency that requires compilation (I guess if you follow it, you will be able to install it):

https://airflow.apache.org/docs/docker-stack/build.html#example-when-you-add-packages-requiring-compilation

> Maybe better keep requirement without any changes for now, create additional extra where we do not limit platform and later I could investigate and add into the Azure provider documentation about additional requirements for install it on Linux ARM.

I think that when a package has no wheel, it is fine to expect that it is going to be compiled - and when it compiles cleanly without too much hassle, I think it's perfectly fine. We have a number of other deps and providers that expect the "customize/add build-essentials" approach. And this one applies only on ARM, so I am not too worried - someone who wants to install the azure provider on ARM is walking an experimental path anyhow and should be well aware of what they are doing.

But maybe indeed in the case of Azure we should not make it a requirement and instead make it an optional extra of the provider - same as plyvel for the google provider? That would - however - be backwards incompatible for AMD users, because installing the provider would no longer pull azure-service-bus as a requirement, so I am not too sure. We could likely come up with some mixed approach where we have != aarch64 as the "core" requirement and then have an extra to install it without that limitation, but I feel that would be rather cumbersome.
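For reference, the "mixed approach" could look roughly like the sketch below. It is an illustration under assumptions, not what the provider does: the requirement strings use PEP 508 marker syntax, `azure-servicebus` is the PyPI name of the package discussed here, the extra name `service-bus` is hypothetical, and the marker is evaluated by hand instead of by pip.

```python
import platform

# Core requirement limited by an environment marker, plus an extra
# (hypothetical name) that installs the package without that limitation.
CORE_REQUIREMENTS = ['azure-servicebus; platform_machine != "aarch64"']
EXTRAS = {"service-bus": ["azure-servicebus"]}


def pulls_service_bus(machine: str, extras: frozenset = frozenset()) -> bool:
    """Hand-evaluate the marker: would an install on this machine pull the package?"""
    core_marker_matches = machine != "aarch64"
    return core_marker_matches or "service-bus" in extras


# Machine name of the interpreter running this sketch, e.g. "x86_64" or "aarch64".
CURRENT_MACHINE = platform.machine()
```

With this layout, `pulls_service_bus("x86_64")` is true, so nothing changes for AMD users, while on `aarch64` the package is only pulled when the extra is requested. It also shows why the approach is cumbersome: the same dependency has to be declared in two places.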
[GitHub] [airflow] potiuk commented on pull request #28442: Re-enable azure service bus on ARM as it now builds cleanly
potiuk commented on PR #28442:
URL: https://github.com/apache/airflow/pull/28442#issuecomment-1356840195

Actually - when I tried it on ARM now, the azure-service-bus package built cleanly on ARM without any setup or special libraries. I think it is quite OK to add it - we have quite a few packages that are built during installation on ARM, and this is generally not a big problem as long as the packages CAN be built cleanly without special workarounds or libraries.

Similarly for plyvel - plyvel is an optional dependency of the Google package, and as long as we can build it cleanly (I will try your solution from #28432 @Taragolis) I will re-add it.