The idea was to include only the part of the test-data that is necessary for the tests, excluding the integration tests, AND to have a special corpus for the integration tests, which can be downloaded on demand.
The motivation was to keep the releases smaller.

For the second part, it would be nice if we had different collections, e.g.
poi-basic (the additional files, which are currently used for the integration tests),
tika (tika office corpus???), gov-docs, common-crawl, common-crawl-excel,
common-crawl-10gb (only 10gb)

Andi

On 13.11.2016 18:05, Dominik Stadler wrote:
> Hm, we are including the test-data directory in the sources as far as I
> see, so you should be able to run test-integration when you download just
> the source-package, or do I miss something here?
>
> Dominik
>
> On Sun, Nov 13, 2016 at 4:57 PM, Javen O'Neal <[email protected]> wrote:
>
>> +1 for this idea.
>>
>> Possible solutions:
>> 1) Publish the commands for a sparse svn checkout on the website. It looks
>> like Subversion doesn't have a simple "svn checkout
>> https://svn.apache.org/repos/asf/poi/trunk poi --exclude-dir test-data",
>> but we could get the same behavior with a checkout immediates, checkout
>> infinity awk listdir exclude test-data.
>> This could be packaged into a bat/shell script, ant target, or Gradle
>> target.
>> 2) retree test-data to be a sibling of trunk. We would need to have some way
>> of pinning test-data so that old releases could be run against these
>> documents without breaking.
>> 3) Migrate away from asking users to check out the source using a
>> Subversion client, using Gradle to perform this checkout instead (solution
>> 1).
>>
>> On Nov 13, 2016 5:41 AM, "Andreas Beeker" <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> our test corpus is constantly growing and I think this is good, as this
>>> covers the edge-cases in the integration tests.
>>>
>>> But I wonder if we need to include those files in the releases, e.g. we
>>> could make those files downloadable in case
>>> a user executes test-integration. Or maybe we find a way to have a common
>>> corpus with tika ... but it should be
>>> easy to download/test those with/-in the ant/gradle scripts.
>>>
>>> What do you think?
>>>
>>> Andi
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: [email protected]
>>> For additional commands, e-mail: [email protected]
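
For reference, solution 1 from the quoted message (checkout immediates, then infinity for everything except test-data) could look roughly like the sketch below. It only builds the svn command lines and does not run them. The helper name sparse_checkout_commands and the directory names in the example are assumptions for illustration; the trunk URL is the one quoted in the thread, and the --depth / --set-depth options are standard Subversion sparse-directory options.

```python
from typing import List

# Trunk URL as quoted in the thread.
TRUNK_URL = "https://svn.apache.org/repos/asf/poi/trunk"


def sparse_checkout_commands(url: str, target: str, top_level_dirs: List[str],
                             excluded: str = "test-data") -> List[List[str]]:
    """Build svn command lines for a checkout that skips one top-level directory.

    Hypothetical helper: it only assembles the commands, so they can be
    printed, reviewed, or passed to subprocess.run() by the caller.
    """
    # Step 1: fetch top-level files plus empty placeholders for the directories.
    commands = [["svn", "checkout", "--depth", "immediates", url, target]]
    for name in top_level_dirs:
        path = f"{target}/{name}"
        if name == excluded:
            # Drop the placeholder so later "svn update" runs skip it as well.
            commands.append(["svn", "update", "--set-depth", "exclude", path])
        else:
            # Step 2: pull every other directory down to full depth.
            commands.append(["svn", "update", "--set-depth", "infinity", path])
    return commands


if __name__ == "__main__":
    # Illustrative directory names; in practice they could come from "svn list".
    for cmd in sparse_checkout_commands(TRUNK_URL, "poi", ["src", "test-data"]):
        print(" ".join(cmd))
```

The same three kinds of command could just as well be wrapped in a bat/shell script, ant target, or Gradle target, as suggested in the thread.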
