potiuk commented on issue #19970: URL: https://github.com/apache/airflow/issues/19970#issuecomment-992774799
Very good questions :). I think for now you should start small (2.) and just get the images built with all the parameters. The reason the logic is so complex is that it is heavily optimized for rebuild time, and it handles some cases on Linux where permissions of files have to be updated when they are generated inside the container, etc. This is some of the most complex Bash code I have ever written, so I certainly do not expect us to get all of that working immediately.

Also, I think I made one huge mistake when I developed it: I tried to use the same code to build CI images and PROD images, but in Bash such "common" code becomes extremely unreadable and complex when you try to make it serve various purposes. So I think what you should really focus on is just implementing the "build-ci-image" command (and I think we should move away from "build-image --production"; we should have separate `build-ci-image` and `build-prod-image` commands). There is enough difference between them to keep them separate (and maybe reuse some Python code, which will be much easier than Bash code reuse).

For now I think what we really need is:

* get some structure in Python that keeps all the necessary parameters to build the image (TypedDict?)
* get a function that returns that structure based on (initially) command line parameters/flags (go through the list of parameters and implement those that will be useful to get the full image build) - if in doubt whether a parameter should be used, ask
* eventually this function should also take into account "last used" values for some of the parameters (some of those parameters are stored in .build - I think only .PYTHON is used for build). But let's not worry about that yet either. This will be much more useful later when we get to the "shell" command.
* do not worry about "rebuild if needed", computing md5, pulling remote etc. There is a separate issue to implement those; similarly, fixing ownership etc. will be done later
* skip "cache" parameters for now. We should use only the "local" cache for now - this will speed up testing and iterations.

Your goal should really be: let's be able to build the CI image with a "build-image --python NN --.... " command, with a number of variants possible by specifying the right parameters. Eventually it is all about converting the "simple" parameters into the "docker command" which should be run.

Don't try to replicate the "structure" of the code from Bash. The Python structure should be very different and much more pythonic, and a lot of the reuse will be done differently than it was done in Bash (partly due to Bash limitations, partly due to the historical evolution of Breeze, partly due to mistakes I made when I designed it).

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
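
To make the "parameters structure, then docker command" idea from the comment above concrete, here is a minimal sketch. It is only an illustration and not the actual Breeze code: the `CiImageParams` fields, the `--tag` and `--build-arg` flags, the `PYTHON_VERSION` build-arg name and the `Dockerfile.ci` path are all assumptions made for the example. Only `--python` comes from the comment itself.

```python
import argparse
from typing import List, TypedDict


class CiImageParams(TypedDict):
    """Hypothetical set of parameters needed to build the CI image."""
    python_version: str
    image_tag: str
    extra_build_args: List[str]


def parse_ci_image_params(argv: List[str]) -> CiImageParams:
    """Turn command line flags into the parameters structure."""
    parser = argparse.ArgumentParser(prog="build-ci-image")
    parser.add_argument("--python", required=True, help="Python version, e.g. 3.9")
    # The two flags below are illustrative assumptions, not existing Breeze flags.
    parser.add_argument("--tag", default="airflow-ci:local")
    parser.add_argument("--build-arg", action="append", default=None)
    args = parser.parse_args(argv)
    return CiImageParams(
        python_version=args.python,
        image_tag=args.tag,
        extra_build_args=args.build_arg or [],
    )


def docker_build_command(params: CiImageParams) -> List[str]:
    """Convert the 'simple' parameters into the docker command to run."""
    cmd = ["docker", "build", "--tag", params["image_tag"]]
    # PYTHON_VERSION is an assumed build-arg name, used only for this sketch.
    cmd += ["--build-arg", f"PYTHON_VERSION={params['python_version']}"]
    for build_arg in params["extra_build_args"]:
        cmd += ["--build-arg", build_arg]
    # Assumed Dockerfile name and build context.
    cmd += ["--file", "Dockerfile.ci", "."]
    return cmd


if __name__ == "__main__":
    params = parse_ci_image_params(["--python", "3.9"])
    print(" ".join(docker_build_command(params)))
```

Keeping the parsing and the command construction in two separate functions means the second one can later be fed from "last used" values as well, without touching the code that builds the docker command.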