Can all of the test-data files be excluded from the releases and svn checkout? This might be less effort to maintain than defining two sets: integration-tests and files used for unit tests which are required to generate poi-ooxml-schemas.
The test-data resources account for the largest share of an svn checkout, and could be downloaded lazily when a test is run. People who merely want to browse the source code do not need these files.

On Nov 13, 2016 11:12 AM, "Andreas Beeker" <[email protected]> wrote:

> The idea was to include only part of the test-data which is necessary for
> the test excluding the integration test AND have a special corpus for
> integration-tests, which can be downloaded on demand.
>
> The motivation was to keep the releases smaller.
>
> For the second part, it would be nice, if we have different collections,
> e.g. poi-basic (the additional files, which are currently used for
> integration test), tika (tika office corpus???), gov-docs, common-crawl,
> common-crawl-excel, common-crawl-10gb (only 10gb)
>
> Andi
>
> On 13.11.2016 18:05, Dominik Stadler wrote:
> > Hm, we are including the test-data directory in the sources as far as I
> > see, so you should be able to run test-integration when you download just
> > the source-package, or do I miss something here?
> >
> > Dominik
> >
> > On Sun, Nov 13, 2016 at 4:57 PM, Javen O'Neal <[email protected]> wrote:
> >
> >> +1 for this idea.
> >>
> >> Possible solutions:
> >> 1) Publish the commands for a sparse svn checkout on the website. It looks
> >> like Subversion doesn't have a simple "svn checkout
> >> https://svn.apache.org/repos/asf/poi/trunk poi --exclude-dir test-data",
> >> but we could get the same behavior with a checkout immediates, checkout
> >> infinity awk listdir exclude test-data.
> >> This could be packaged into a bat/shell script, ant target, or Gradle
> >> target.
> >> 2) retree test-data to be a sibling of trunk. We would need to have some way
> >> of pinning test-data so that old releases could be run against these
> >> documents without breaking.
> >> 3) Migrate away from asking users to check out the source using a
> >> Subversion client, using Gradle to perform this checkout instead (solution
> >> 1).
> >>
> >> On Nov 13, 2016 5:41 AM, "Andreas Beeker" <[email protected]> wrote:
> >>
> >>> Hi,
> >>>
> >>> our test corpus is constantly growing and I think this is good, as this
> >>> covers the edge-cases in the integration tests.
> >>>
> >>> But I wonder if we need to include those files in the releases, e.g. we
> >>> could make those files downloadable in case a user executes
> >>> test-integration. Or maybe we find a way to have a common corpus with
> >>> tika ... but it should be easy to download/test those with/-in the
> >>> ant/gradle scripts.
> >>>
> >>> What do you think?
> >>>
> >>> Andi
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: [email protected]
> >>> For additional commands, e-mail: [email protected]
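
As a rough illustration of option 1 in the quoted thread (a checkout with depth "immediates", then deepening everything except test-data), the svn command lines might be generated as below. This is only a sketch under assumptions: it assumes test-data is a direct child of trunk, the function name sparse_checkout_commands and the directory names in the example are made up for illustration, and it only builds the commands rather than running them.

```python
from typing import Iterable, List

# URL as quoted in the thread; EXCLUDED is an illustrative default.
TRUNK_URL = "https://svn.apache.org/repos/asf/poi/trunk"
EXCLUDED = frozenset({"test-data"})


def sparse_checkout_commands(
    url: str,
    target: str,
    top_level_dirs: Iterable[str],
    excluded: Iterable[str] = EXCLUDED,
) -> List[List[str]]:
    """Build svn commands for a checkout that skips the excluded directories.

    Hypothetical helper: top_level_dirs must be supplied by the caller,
    for example from the output of `svn list <url>`.
    """
    skip = set(excluded)
    # Step 1: fetch the top level only. Files arrive in full,
    # subdirectories are created but left empty.
    commands = [["svn", "checkout", "--depth", "immediates", url, target]]
    # Step 2: deepen every subdirectory except the excluded ones.
    for name in top_level_dirs:
        if name not in skip:
            commands.append(
                ["svn", "update", "--set-depth", "infinity", f"{target}/{name}"]
            )
    return commands


if __name__ == "__main__":
    # Placeholder directory names, not a verified listing of trunk.
    for cmd in sparse_checkout_commands(TRUNK_URL, "poi", ["src", "test-data", "legal"]):
        print(" ".join(cmd))
```

Each returned list could be passed to subprocess.run(cmd, check=True), or the same two steps could be written directly in a shell script, ant target, or Gradle task as suggested in the thread.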
