Re: New Job Cacher plugin to cache dependencies of builds on docker based executors
Ok. Thanks.

--
You received this message because you are subscribed to the Google Groups "Jenkins Developers" group. To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-dev+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/jenkinsci-dev/97781456-775b-4141-ae09-4b9f710c76b9%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.
Re: New Job Cacher plugin to cache dependencies of builds on docker based executors
Probably you are looking for the External Workspace Manager plugin. Last I checked, it had not been extended to really support clouds. I suggest you raise this as an RFE with the PSE team, rather than discussing it here—unless you intend to try developing such an extension yourself, in which case you would likely want to hang out in https://gitter.im/jenkinsci/external-workspace-manager-plugin and ask for advice.
Re: New Job Cacher plugin to cache dependencies of builds on docker based executors
Continuing to think on this a bit more: the FilePath abstraction doesn't look like it would work, as it assumes a computer on the other end.

What if there were an "External Storage" extension point that could be backed by S3 and leveraged by other plugins for managing large files associated with jobs? Ideally it would share the job lifecycle, so that when jobs are renamed or deleted, the related external storage area for those jobs would be managed as well. Is there an extension point for something like that?

On Wednesday, November 30, 2016 at 3:36:43 PM UTC-5, Peter Hayes wrote:
> [quoted text trimmed]
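No such extension point exists in core; as a thought experiment, the shape Peter is describing might look something like the sketch below. Every name here (`JobStorage`, `InMemoryJobStorage`, the lifecycle methods) is invented for illustration — a real version would be a Jenkins `ExtensionPoint` with an `ItemListener` driving the rename/delete callbacks, and an S3 implementation instead of the in-memory stand-in used here so the sketch is self-contained.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch of the proposed "external storage" extension point.
 * None of these types exist in Jenkins; the in-memory map stands in for a
 * pluggable backend such as S3.
 */
interface JobStorage {
    void put(String jobName, String path, byte[] data);
    byte[] get(String jobName, String path);
    // Lifecycle hooks so the storage area follows the job around:
    void onJobRenamed(String oldName, String newName);
    void onJobDeleted(String jobName);
}

class InMemoryJobStorage implements JobStorage {
    private final Map<String, Map<String, byte[]>> store = new HashMap<>();

    public void put(String jobName, String path, byte[] data) {
        store.computeIfAbsent(jobName, k -> new HashMap<>()).put(path, data);
    }

    public byte[] get(String jobName, String path) {
        Map<String, byte[]> files = store.get(jobName);
        return files == null ? null : files.get(path);
    }

    public void onJobRenamed(String oldName, String newName) {
        // Move the whole storage area when the job is renamed.
        Map<String, byte[]> files = store.remove(oldName);
        if (files != null) {
            store.put(newName, files);
        }
    }

    public void onJobDeleted(String jobName) {
        store.remove(jobName);
    }
}
```

The point of the lifecycle methods is the part Peter raises: without them, renaming or deleting a job would orphan its cache in the external store.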
Re: New Job Cacher plugin to cache dependencies of builds on docker based executors
Thanks for the insight. I do see that this will cause a burden on the master node. Since we are using CJP-PSE, that is mitigated somewhat, as we will be running quite a few masters, so the ratio of jobs to masters won't be terribly high.

Reusing workspaces isn't an option for us due to the architecture of CJP-PSE at the moment. I actually did start using an externally mounted volume, but as you note, we will run into concurrency issues with shared caches on the host instance, and there is no reliable way to separate the caches while still getting the benefit of them, as there is no distinct executor number (always 1). If there were some enhancement to CJP to transparently manage workspaces across executors (and support parallel build execution), then we could look at that. I did raise this with the PSE team a while back in any event, and I imagine it will need to be addressed, as it is a step back in performance from classic persistent Jenkins executors.

The other thought that crossed my mind, since we are running in AWS, is to leverage a more scalable file store within AWS like S3. Both artifact archiving and dependency caching could be good candidates. It would be cool if there were an S3 backing of the FilePath abstraction, so plugin developers could seamlessly access it via Project.getStoragePath() or something like that. Then a plugin like I am proposing could provide a more scalable solution without hardwiring to S3. I'm guessing I'm not the first to think of it, so there are likely challenges in doing so.

On Wednesday, November 30, 2016 at 2:04:03 PM UTC-5, Jesse Glick wrote:
> [quoted text trimmed]
Re: New Job Cacher plugin to cache dependencies of builds on docker based executors
On Wed, Nov 30, 2016 at 10:18 AM, Peter Hayes wrote:
> each time you run a job, you start with a fresh container without any previously cached dependencies (we use gradle generally). This increases the length of the build and adds network traffic to our Artifactory instance. I looked around for existing plugins but didn't find any so I have started a plugin[1] based on SimpleBuildWrapper that stores a configured set of files on the master at the end of the build and then on the next build downloads them to master in the original location.

This seems like a poor approach; rather than overloading Artifactory, you will be overloading the Jenkins master. Archiving artifacts via the Remoting channel can already wreck performance; you are talking about potentially orders of magnitude more traffic than that.

There are two basic approaches to this kind of problem. One, which assumes that the agents reuse workspaces between builds, is to set the local repository/cache location to a workspace location. The `docker-workflow` demo does this:

https://github.com/jenkinsci/docker-workflow-plugin/blob/46432bbe36af17dac93cfedcc93ffa51beba1343/demo/repo/flow.groovy#L20-L22

The other approach is to mount a volume containing the cache, letting the Docker daemon handle the storage, which the `parallel-test-executor` demo does:

https://github.com/jenkinsci/parallel-test-executor-plugin/blob/3961df3784045df1f6f285bc2b685ead4bc8593b/demo/Makefile#L3-L27

The volume-based approach is probably the more scalable, though there are two points to beware: at least Maven’s `install:install` will dump locally built artifacts into the repository alongside downloaded releases (probably Gradle does something similar); and Maven’s Aether repository manager is by default not thread-safe (Takari fixes this). Maven 5 may allow the cache to be properly separated (again I am not sure how Gradle fares here); in the meantime you may need to ensure that there is a distinct volume for every potentially concurrent build, for example keyed by `${JOB_NAME}/${EXECUTOR_NUMBER}`.

At any rate the exact solution chosen is going to depend on details of how agents are provisioned and workspaces managed, so at root this might simply be an RFE for CJP-PSE.

--
You received this message because you are subscribed to the Google Groups "Jenkins Developers" group. To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-dev+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/jenkinsci-dev/CANfRfr3pkMzJ9MzMFhUPNRkePJCM3EeyDd3C1KrgAtXvnnZnWg%40mail.gmail.com. For more options, visit https://groups.google.com/d/optout.
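Jesse's suggestion of a distinct volume per potentially concurrent build amounts to deriving a stable volume name from `${JOB_NAME}/${EXECUTOR_NUMBER}`. A minimal illustration of that keying follows; the class and method names are invented, and the sanitization rule reflects Docker's constraint that volume names match `[a-zA-Z0-9][a-zA-Z0-9_.-]*`, so slashes and spaces in the job name must be mapped to safe characters.

```java
/**
 * Illustrative only: derive a Docker volume name for a per-build dependency
 * cache, keyed by job name and executor number as suggested above, so that
 * two concurrent builds never share a cache volume.
 */
class CacheVolumes {
    static String volumeNameFor(String jobName, int executorNumber) {
        // Replace anything outside Docker's allowed volume-name characters.
        String safe = jobName.replaceAll("[^a-zA-Z0-9_.-]", "_");
        return "cache_" + safe + "_" + executorNumber;
    }
}
```

A build step could then pass `-v $(volumeName):/home/user/.gradle` (or the Maven equivalent) when launching the container, letting the Docker daemon own the storage.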
New Job Cacher plugin to cache dependencies of builds on docker based executors
Hi,

We are using CloudBees Private SaaS Edition, which utilizes docker containers as executors. A side effect of this is that each time you run a job, you start with a fresh container without any previously cached dependencies (we use gradle generally). This increases the length of the build and adds network traffic to our Artifactory instance.

I looked around for existing plugins but didn't find any, so I have started a plugin[1] based on SimpleBuildWrapper that stores a configured set of files on the master at the end of the build and then, on the next build, restores them from the master to the original location. I still have more work remaining, but prior to investing more time I wanted to check with this group to see if it makes sense to complete this or if there is a better option. I also had seen a post[2] on the users list a few months ago looking for a similar capability that didn't come up with anything.

Thanks,
Pete

[1] https://github.com/petehayes/jobcacher-plugin
[2] https://groups.google.com/forum/#!topic/jenkinsci-users/n0A1qBLe2Is

--
You received this message because you are subscribed to the Google Groups "Jenkins Developers" group. To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-dev+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/jenkinsci-dev/381ee609-3568-4b4d-9930-978ec2378c7f%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.
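The save-then-restore behaviour described here can be modelled without any Jenkins APIs. The standalone sketch below is a simplified illustration: in the actual plugin this logic would run from SimpleBuildWrapper's setUp and Disposer callbacks, with the copies going over the Remoting channel via FilePath rather than plain NIO, and a real cache would handle whole directory trees rather than individual files.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/**
 * Standalone model of the job-cacher idea: after a build, copy configured
 * files from the workspace into a per-job cache directory (standing in for
 * storage on the master); before the next build, copy them back into a
 * fresh workspace.
 */
class JobCache {
    private final Path cacheRoot;

    JobCache(Path cacheRoot) {
        this.cacheRoot = cacheRoot;
    }

    /** Called at the end of a build: store the configured files. */
    void save(String jobName, Path workspace, String... relativePaths) throws IOException {
        for (String rel : relativePaths) {
            Path src = workspace.resolve(rel);
            Path dst = cacheRoot.resolve(jobName).resolve(rel);
            Files.createDirectories(dst.getParent());
            Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    /** Called at the start of the next build: restore into a fresh workspace. */
    void restore(String jobName, Path workspace, String... relativePaths) throws IOException {
        for (String rel : relativePaths) {
            Path src = cacheRoot.resolve(jobName).resolve(rel);
            if (!Files.exists(src)) {
                continue; // first build: nothing cached yet
            }
            Path dst = workspace.resolve(rel);
            Files.createDirectories(dst.getParent());
            Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

As Jesse notes upthread, the cost of this design is that every cached byte transits the master twice per build, which is the main argument for the workspace-reuse or volume-based alternatives.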