> On 15 May 2020, at 18:15, Luke Cwik <[email protected]> wrote:
>
> 1. Is there any internal Beam functionality to pre-stage or, at least cache,
> already staged artifacts? Since the same pipeline will be executed many times
> in a row, there is no reason to stage the same artifacts every run.
>
> Part of artifact staging there is supposed to be deduplication but sometimes
> minor changes in the files like the jar gets recreated with the same contents
> but different times leads to a different hash breaking the deduplication.

Do you mean that it works this way by default, or does it have to be explicitly enabled? It sounds like this is what I need.

> You can always embed your artifacts in your containers and try to make it so
> that you have zero artifacts to stage/retrieve.

Yes, this is what I called a “workaround”, but in that case I would need to disable artifact staging altogether, I guess by setting “--filesToStage” to an empty directory?

> 2. Is it possible to pre-run SDK Harness containers and reuse them for every
> Portable Runner pipeline? I could win quite a lot of time on this for more
> complicated pipelines.
>
> Well, I guess I can find some workarounds for that but I wished to ask before
> that perhaps there is a better way to do that in Beam.
>
> Regards,
> Alexey
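
To make the deduplication point quoted above concrete, here is a minimal sketch of why a rebuilt jar can defeat hash-based deduplication. It is not how Beam computes its staging hash; it only assumes that the hash is taken over the raw file bytes. The helper names (`build_jar`, `file_hash`, `content_hash`) and the sample entries are hypothetical and exist only for this illustration.

```python
import hashlib
import io
import zipfile


def build_jar(entries, timestamp):
    """Build an in-memory jar (a zip archive) whose entries all carry `timestamp`."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as jar:
        for name in sorted(entries):
            info = zipfile.ZipInfo(name, date_time=timestamp)
            info.compress_type = zipfile.ZIP_DEFLATED
            jar.writestr(info, entries[name])
    return buf.getvalue()


def file_hash(jar_bytes):
    """Hash the raw archive bytes, entry timestamps included."""
    return hashlib.sha256(jar_bytes).hexdigest()


def content_hash(jar_bytes):
    """Hash only entry names and uncompressed contents, ignoring timestamps."""
    digest = hashlib.sha256()
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        for name in sorted(jar.namelist()):
            data = jar.read(name)
            digest.update(name.encode("utf-8") + b"\0")
            digest.update(len(data).to_bytes(8, "big"))
            digest.update(data)
    return digest.hexdigest()


# Hypothetical jar contents, built twice with different entry timestamps.
entries = {
    "META-INF/MANIFEST.MF": b"Manifest-Version: 1.0\n",
    "com/example/Main.class": b"\xca\xfe\xba\xbe",
}
first = build_jar(entries, (2020, 5, 15, 18, 15, 0))
second = build_jar(entries, (2020, 5, 15, 18, 20, 0))

print("raw file hashes equal:", file_hash(first) == file_hash(second))
print("content hashes equal: ", content_hash(first) == content_hash(second))
```

The two archives hold identical entries, yet their raw byte hashes differ because the timestamps are stored inside the archive, while the content-only hashes match. In practice this suggests either building the jar once and reusing the same file across runs, or making the build reproducible so that entry timestamps are fixed.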
