RE: Excessive copying

Gilles Scokart Wed, 10 Jan 2007 03:13:52 -0800

I have a similar story.  The approach I took was different, but the problems
were similar, even if probably on a smaller scale.


We also have quite a lot of modules (+/- 60 developed in house and +/- 20
externals), and quite a few of them are used very often (those with names
like utils or core).
We also have a continuous build that runs frequently (up to 4 times per
hour), and very limited space to store the snapshot builds.

The approach I took was to rebuild all modules at every continuous build
run.  So I have a single CI build number for all modules built in the same
run (I have found that this makes things much easier to manage).

Before the CI build, I also clean the cache.  I started cleaning the cache
because we initially had some problems setting up our repository.  Also, the
cache size was growing too quickly.  In our case, the cost is small because
the cache is quickly repopulated by the 20 external modules, and all the
others are being rebuilt anyway.

During the build, the CI compiles, tests and publishes every module into a
local repository.

If all modules build correctly, the content of the local repository is
exported (using install) to the shared repository (a SNAPSHOT repository),
and all the modules older than 30 minutes are deleted from the share.  I do
that because of the limited size of our share, so that we only keep 1 or 2
build results.
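
The cleanup step can be sketched roughly as follows.  This is only an
illustration in Python, not the Ant script we actually run; the function
name prune_old_modules and the idea of walking the share as a plain
directory tree are assumptions made for the example.

```python
import os
import time


def prune_old_modules(share_dir, max_age_seconds=30 * 60, now=None):
    """Delete files under share_dir last modified more than max_age_seconds ago.

    Returns the sorted list of deleted paths.
    """
    now = time.time() if now is None else now
    deleted = []
    for root, _dirs, files in os.walk(share_dir):
        for name in files:
            path = os.path.join(root, name)
            # Compare the file's modification time with the age limit.
            if now - os.path.getmtime(path) > max_age_seconds:
                os.remove(path)
                deleted.append(path)
    return sorted(deleted)
```

With a build roughly every 15 minutes, a 30 minute limit leaves the last 1
or 2 build results on the share.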

Initially, the intention was to use the share to allow the developers to
retrieve the SNAPSHOT versions of the jars they are not working on.  But
in practice, very few developers do that.  Most of them prefer to have all
modules as projects in their Eclipse workspace.

Finally, you also mentioned in another mail that you were using the
cache to make sure you are always building against the same version.  I had
a similar problem.  If my module B uses version x of module A, I don't want
module Z using version x+1.  I solved that with a complex Ant script.
First, my ivy files don't contain the revision numbers directly, but contain
tokens.  The build of the first module replaces the tokens with default
values (latest.integration).  Then I retrieve the resolved version numbers
using 'artifactproperty' and store the result in a property file.  When the
second module builds, this file is reused to preprocess its ivy file, and is
then completed with the newly resolved revisions.
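
To make the idea concrete, here is a minimal sketch of the preprocessing
logic in Python.  It is not the Ant script itself: the @name@ token syntax,
the function names and the use of a plain dictionary in place of the
property file are all assumptions made for the illustration.

```python
import re

DEFAULT_REVISION = "latest.integration"
# Hypothetical token syntax: @moduleName@
TOKEN = re.compile(r"@([A-Za-z0-9_.-]+)@")


def preprocess_ivy(template, resolved):
    """Replace each token with its pinned revision, or with the default."""
    return TOKEN.sub(
        lambda m: resolved.get(m.group(1), DEFAULT_REVISION), template
    )


def record_resolved(resolved, newly_resolved):
    """Add newly resolved revisions without overriding those already pinned."""
    merged = dict(newly_resolved)
    merged.update(resolved)  # revisions pinned earlier in the run win
    return merged


template = '<dependency name="A" rev="@A@"/>'
pins = {}
first = preprocess_ivy(template, pins)         # A is not pinned yet
pins = record_resolved(pins, {"A": "1.0.7"})   # as reported after the resolve
second = preprocess_ivy(template, pins)        # A is now pinned
```

The first module to depend on A resolves it with latest.integration; every
later module in the same run gets the revision that was recorded then.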

To conclude, I also have the feeling that the build could be quicker.  It
now takes between 12 and 13 minutes, and I think it could take a few minutes
less.  But I don't think that the time is spent on copying the files.  I
think it is spent more on resolving the low-level modules again and again.
For example, our Util module (the lowest-level one) is a dependency of all
other modules, which means that it is transitively resolved many times,
which involves quite a lot of parsing.
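
A toy model shows how quickly this adds up.  The sketch below (Python, with
a made-up three-module graph) counts how often each module is visited if
every module's dependency tree is walked from scratch.  It is a deliberately
naive model and an assumption on my side, not a description of what Ivy
does internally.

```python
def count_visits(graph, roots):
    """Count visits per module when each root's dependency tree is walked
    from scratch, with nothing shared between or within walks."""
    counts = {}

    def walk(module):
        counts[module] = counts.get(module, 0) + 1
        for dep in graph.get(module, ()):
            walk(dep)

    for root in roots:
        walk(root)
    return counts


# Hypothetical modules: web depends on core and util, core depends on util.
graph = {"util": [], "core": ["util"], "web": ["core", "util"]}
visits = count_visits(graph, ["util", "core", "web"])
```

Even with three modules, the lowest-level one is visited more often than
any other, and the gap grows with the number of modules above it.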


Gilles 


> -----Original Message-----
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of John Williams
> Sent: Tuesday, January 09, 2007 6:15 PM
> To: [email protected]
> Subject: Excessive copying
> 
> I've been thinking recently about how much copying Ivy does, 
> and I've filed some feature requests to that effect (Thanks 
> for getting IVY-353 done, Maarten!), but I think chipping 
> away at the problem with feature requests is the wrong way 
> to address this issue, so I'm posting here to share ideas.
> 
> My experience with Ivy has been several iterations of solving 
> one problem only to create another.  I think my situation is 
> kind of a worst-case scenario for Ivy w.r.t. copying because 
> of a combination of factors.  My company has a lot of modules 
> with complicated dependencies between them.  At any given 
> time, many modules  are under active development.  Some of 
> our modules produce very large artifacts, and those tend to 
> be some of the most frequently used modules as well as the 
> most frequently modified.  While I think having such large 
> modules is generally a bad thing, it's not feasible for us to 
> split them up at the moment.
> 
> This situation affects our usage of Ivy in a number of ways, 
> starting with how we number our builds.  Originally I had our 
> continuous integration system publish each build with a 
> unique revision number, but saving every build took up far 
> too much disk space.  As a result, I switched to a system of 
> overwriting old builds with new ones.  We use the same 
> revision number for multiple builds because I think it's 
> probably easier to live with the same number referring to 
> multiple builds than to deal with most revision numbers 
> referring to builds that have been deleted to save space.
> 
> Because we use only file system repositories, I had wanted to 
> bypass the cache to save time and space, but since we 
> overwrite artifacts in our repository so frequently now, I 
> decided it was best to use the cache so that our developers 
> would not have to worry about their dependencies' artifacts 
> changing from minute to minute.
> 
> Because we re-use revision numbers, it is necessary for us to 
> manually clean out the cache to guarantee that the latest 
> version of every module is being used.  This is not a big 
> problem for developers because they don't clean out the cache 
> too often, but the CI system is not smart enough to know when 
> it must clean out the cache, so it does it for every build.  
> As a result, it spends as much time copying artifacts to its 
> cache as it does building.
> 
> I'm very open to solutions that avoid this problem entirely, 
> but I think a good way to fix it might be to support creating 
> hard links in the cache instead of copying artifacts, 
> preferably using a setting in the Ivy config file rather than 
> in ivy.xml files or build.xml files.
> Because I'm proposing yet another use of links, I also think 
> it would be worthwhile to revisit the fix for IVY-353 that 
> allows <ivy:retrieve> to create symlinks.  If both hard and 
> soft links are to be supported, and if links are to be 
> supported in multiple places in Ivy, I think it would be best 
> to change the "symlink" attribute of <ivy:retrieve> to 
> "linkType", with values "copy", "soft", and "hard".
> This same attribute could then be added to file system 
> resolvers to control how artifacts are copied to the cache, 
> and the value "none"
> could be used to implement IVY-360 ("Add property to control 
> useOrigin default").
> 
> Any thoughts?
> 
> Thanks,
> jw
