[GitHub] [airflow] potiuk commented on pull request #28442: Re-enable azure service bus on ARM as it now builds cleanly

2022-12-19 Thread GitBox


potiuk commented on PR #28442:
URL: https://github.com/apache/airflow/pull/28442#issuecomment-1358388757

   > We could try to do it for simple cases. It is not mandatory; I'd rather 
say it is nice to have. If we know that a package can be installed on Linux 
(glibc-based, not musl) but requires installation from sources, then it would 
be nice to have this information, which would save some time.
   
   A better option would be to add links to installation instructions for 
those packages that we know might be problematic. For example, for Plyvel, 
linking https://plyvel.readthedocs.io/en/latest/installation.html would be 
fine. NOTE - even the Plyvel developers limit those instructions to "Ubuntu 
and Debian". 
   
   > I thought you had some kind of template about how many contributors we 
already have, and that it would be nice if someone who finds this became a 
contributor and improved this part 🤣
   
   Well. I do. But this one is tricky :). Any contribution there will at most 
explain what is needed for the OS/distribution of that particular user, which 
might be even more misleading for other distros/MacOS/ARM. It's super-hard to 
write generic installation instructions even if we limit them to apt/yum 
(Debian/RedHat), because the same OS packages are often named differently. 
This is a true rabbit hole we want to avoid. For some reason even the creators 
of the libraries are sometimes super vague and only cover some distros.
   
   > I think this would be nice if it did not require a lot of effort from our 
side.
   > BTW, do we have any statistics about downloads of a particular image from 
Docker Hub?
   
   Very little. We know the total for apache/airflow:
   
   ```
   curl -s https://hub.docker.com/v2/repositories/apache/airflow | jq -r ".pull_count"
   ```
   
   Result: 95575880
   
   We can also see the "per tag" last pull time - when I last checked, it was 
usually between a few seconds and 2 hours ago (sometimes 8-10 hrs). But this is 
really unscientific. We do not pull the images from Docker Hub during CI (we 
only use ghcr.io), so at least we know the numbers are not skewed by our CI.
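
The same numbers can be read programmatically. The following is a minimal, 
unofficial sketch: it assumes the repository endpoint returns `pull_count` (the 
field read by the `jq` filter) and that the `/tags` endpoint returns a `results` 
list whose entries carry `name` and a UTC `tag_last_pulled` timestamp. Those tag 
field names are an assumption about the Docker Hub API, and all helper names are 
made up for illustration.

```python
import json
from datetime import datetime, timezone
from typing import Dict, Optional
from urllib.request import urlopen

REPO_URL = "https://hub.docker.com/v2/repositories/apache/airflow"


def fetch_json(url: str) -> dict:
    """Download and decode a JSON document (needs network access)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def total_pulls(repo_payload: dict) -> int:
    """Total pull count of the repository (same field the jq filter reads)."""
    return int(repo_payload["pull_count"])


def hours_since_last_pull(tags_payload: dict, now: datetime) -> Dict[str, Optional[float]]:
    """Map tag name -> hours since its last pull (None if no timestamp is given).

    Assumes `tag_last_pulled` looks like "2022-12-19T10:00:00.123456Z" (UTC).
    """
    result: Dict[str, Optional[float]] = {}
    for tag in tags_payload.get("results", []):
        stamp = tag.get("tag_last_pulled")
        if not stamp:
            result[tag["name"]] = None
            continue
        # Keep only "YYYY-MM-DDTHH:MM:SS"; fractional seconds do not matter here.
        pulled = datetime.strptime(stamp[:19], "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)
        result[tag["name"]] = (now - pulled).total_seconds() / 3600
    return result
```

With network access, `total_pulls(fetch_json(REPO_URL))` and 
`hours_since_last_pull(fetch_json(REPO_URL + "/tags"), datetime.now(timezone.utc))` 
would give figures of the kind quoted here; the tags listing is paginated, so 
this only looks at the first page.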
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #28442: Re-enable azure service bus on ARM as it now builds cleanly

2022-12-18 Thread GitBox


potiuk commented on PR #28442:
URL: https://github.com/apache/airflow/pull/28442#issuecomment-1356858609

   > Oh, I just repeated the step that most users probably do first: just try 
to install it with `pip` into the regular image.
   
   I see what you mean. But the same error might happen for a number of other 
packages - not even our providers. There are plenty of packages out there that 
need to be compiled (unfortunately). Even some of our providers do, and some of 
them require system packages to be installed. I do not think we can reasonably 
describe it for each provider separately - especially since many of our users 
do not use the image at all. They install airflow in their own virtualenv on 
different distributions, and each distribution needs different dependencies.
   
   I am afraid (but maybe I am exaggerating) that if we start describing what 
else is needed, we might make people think that we should describe all the 
prerequisites for all the possible distributions, and that this description is 
"complete" - i.e. explains all the necessary requirements. For example, you 
won't be able to install the mysql provider without having the mysql client 
libraries installed - and there are different ways to install those on centos, 
redhat, debian, mint, etc., not to mention MacOS installation - which brings a 
whole new host of problems (additionally M1 + Intel).
   
   I think if we start describing things like that in the docs, we are going 
down the rabbit hole. Should we verify and check all the prerequisites and 
describe them for all providers? Doing it for one provider will - pretty 
inevitably - lead to people asking:
   
   > My provider fails to install and I see those other providers have 
instructions - surely you should have instructions here and I should not need 
to solve it myself when I am installing it on . - what are the instructions?
   
   I think it's a dangerous path to walk.
   
   But I just thought about another idea.
   
   Something that I have thought about before. Maybe - on top of having the 
optimized apache/airflow image and the "slim" image - we should have a "fat" 
image (maybe a bit better named) that will have build-essential installed and 
all of the libraries that the CI image has installed in "dev". Then anyone who 
wants to install such a package will have three options: "customize" (you will 
have an optimized image), "extend with build-essential" (less optimized) and 
"use the fat image" (huge image, but will install any provider by default). 
   
   Then we can even add a generic FAQ entry on "build error" when adding your 
package (or maybe we could even try to invent some smart way of capturing the 
error in the PROD image and displaying the instructions to the user). I 
**think** it could be done by changing the shell in the image to our custom 
command. 
   
   I think such a "generic" message on what to do in case of a build error, 
with three options to follow - one of them as easy as changing 
`FROM apache/airflow:2.5.0` to `FROM apache/airflow:fat-2.5.0` - should do the 
job nicely.
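
As a rough illustration of the "generic message" idea, the sketch below wraps an 
install command and prints the three options when it fails. Everything in it is 
hypothetical: the helper name, the wording of the hint, and the `fat-2.5.0` tag 
(which is only proposed in this comment and is not an existing image). Treating 
any non-zero exit code as a build error is also a simplification.

```python
import subprocess
import sys
from typing import List

# Hypothetical hint text; the "fat" tag is only a proposal from this thread.
BUILD_ERROR_HINT = """\
Installing the package failed, possibly because it has to be compiled.
Options:
  1. Customize the image (smallest resulting image).
  2. Extend the image and install build-essential first (less optimized).
  3. Use the "fat" image, e.g. change `FROM apache/airflow:2.5.0`
     to `FROM apache/airflow:fat-2.5.0` (largest image).
"""


def run_with_build_hint(command: List[str]) -> int:
    """Run `command`; on a non-zero exit code print the generic hint to stderr."""
    completed = subprocess.run(command)
    if completed.returncode != 0:
        print(BUILD_ERROR_HINT, file=sys.stderr)
    return completed.returncode
```

It could be called as, for example, 
`run_with_build_hint([sys.executable, "-m", "pip", "install", "plyvel"])`.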
   
   WDYT? 






[GitHub] [airflow] potiuk commented on pull request #28442: Re-enable azure service bus on ARM as it now builds cleanly

2022-12-18 Thread GitBox


potiuk commented on PR #28442:
URL: https://github.com/apache/airflow/pull/28442#issuecomment-1356844403

   > Hmmm.. I was unable to install it into the regular airflow image
   
   We have a number of requirements that require customising the image rather 
than extending it. We have good instructions for this, including how to 
customize the image (if you want an optimal image size):
   https://airflow.apache.org/docs/docker-stack/build.html#extending-the-image
   
   Or how to add a dependency that requires compilation (I guess if you follow 
it, you will be able to install it):
   
   
https://airflow.apache.org/docs/docker-stack/build.html#example-when-you-add-packages-requiring-compilation
   
   > Maybe it is better to keep the requirement without any changes for now, 
create an additional extra where we do not limit the platform, and later I 
could investigate and add the additional requirements for installing it on 
Linux ARM to the Azure provider documentation.
   
   I think that when a package has no wheel, it is fine to expect that it is 
going to be compiled - and when it compiles cleanly without too much of a 
hassle, I think it's perfectly fine. We have a number of other deps and 
providers that expect the "customize/add build-essential" approach. And this 
one only applies to ARM, so I am not too worried - someone who wants to 
install the azure provider on ARM is walking an experimental path anyhow and 
should be well aware of what they are doing.
   
   But maybe in the case of Azure we should indeed not make it a requirement 
and make it an optional extra of the provider - same as plyvel for the google 
provider? 
   
   That would - however - be backwards incompatible for AMD64 users, because 
installing the provider would no longer pull azure-service-bus as a 
requirement, so I am not too sure. We could likely come up with some mixed 
approach where we have != aarch64 as the "core" requirement and then have an 
extra to install it without that limitation - but I feel that would be rather 
cumbersome. 
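
For illustration only, the "mixed approach" could look roughly like the sketch 
below. The requirement string uses standard environment-marker syntax 
(`platform_machine`); the extra name `service-bus`, the variable names, and the 
helper are hypothetical, and `azure-servicebus` is assumed to be the PyPI name 
of the package discussed here.

```python
from typing import Dict, List, Set

EXCLUDED_MACHINE = "aarch64"

# Core requirement: the environment marker skips it on aarch64.
CORE_REQUIREMENTS: List[str] = [
    f'azure-servicebus; platform_machine != "{EXCLUDED_MACHINE}"',
]

# Hypothetical extra that requests the same package on any platform.
EXTRAS: Dict[str, List[str]] = {
    "service-bus": ["azure-servicebus"],
}


def pulls_service_bus(machine: str, requested_extras: Set[str]) -> bool:
    """Would installing the provider pull azure-servicebus on this machine?

    Mirrors the marker logic by hand instead of parsing the requirement strings.
    """
    via_core = machine != EXCLUDED_MACHINE
    via_extra = "service-bus" in requested_extras
    return via_core or via_extra
```

Under these assumptions, x86_64 installs keep getting the package as before, 
while aarch64 installs get it only when the extra is requested explicitly.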





[GitHub] [airflow] potiuk commented on pull request #28442: Re-enable azure service bus on ARM as it now builds cleanly

2022-12-18 Thread GitBox


potiuk commented on PR #28442:
URL: https://github.com/apache/airflow/pull/28442#issuecomment-1356840195

   Actually - when I tried it on ARM now, the azure-service-bus package built 
cleanly on ARM without any setup and special libraries. I think it is quite ok 
to add it - we have quite a few packages that are built during installation on 
ARM, and this generally is not a big problem as long as the packages CAN be 
cleanly built without special workarounds or libraries.
   
   Similarly for plyvel - plyvel is an optional dependency of the Google 
provider and as long as we can build it cleanly (will try your solution from 
#28432 @Taragolis) I will re-add it.

