Can all of the test-data files be excluded from the releases and svn
checkout? This might be less effort to maintain than defining two sets:
integration-tests and files used for unit tests which are required to
generate poi-ooxml-schemas.
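
For the checkout side, here is a rough sketch of the sparse checkout from
solution 1 quoted below. Python is used only for illustration. The function
names and the directory list are made up; in practice the list would come
from running "svn list" on trunk. It assumes test-data sits directly under
trunk and relies on the standard "--depth immediates" and
"--set-depth infinity" options:

```python
import subprocess


def sparse_checkout_commands(url, target, top_level_dirs, exclude=("test-data",)):
    """Return the svn argv lists for a checkout that skips the excluded directories.

    top_level_dirs is a hypothetical input: the directory names directly
    under the repository root, e.g. parsed from `svn list <url>`.
    """
    # Step 1: fetch the root's files plus empty placeholders for its directories.
    commands = [["svn", "checkout", "--depth", "immediates", url, target]]
    # Step 2: deepen every top-level directory except the excluded ones.
    for name in top_level_dirs:
        if name not in exclude:
            commands.append(
                ["svn", "update", "--set-depth", "infinity", f"{target}/{name}"]
            )
    return commands


def run(commands):
    """Execute the commands in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

Something like run(sparse_checkout_commands(
"https://svn.apache.org/repos/asf/poi/trunk", "poi", dirs)) would then leave
test-data as an empty directory in the working copy.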

The test-data resources take up the largest portion of an svn checkout and
could be lazily downloaded when a test runs. People who merely want to
browse the source code do not need these files.
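
To illustrate what I mean by lazy downloading, here is a minimal sketch,
again in Python only for illustration. The cache directory, base URL and
function names are assumptions, not anything that exists in the POI build
today. A test asks for a file by name, and the file is fetched only if it
is not already cached locally:

```python
import urllib.request
from pathlib import Path


def download(url, dest):
    """Default fetcher: save the content at url to the local path dest."""
    urllib.request.urlretrieve(url, str(dest))


def ensure_test_file(cache_dir, name, base_url, fetch=download):
    """Return the local path of a test file, downloading it on first use."""
    path = Path(cache_dir) / name
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        # Download to a temporary name first so an interrupted transfer
        # does not leave a truncated file that looks like a cache hit.
        partial = path.with_name(path.name + ".part")
        fetch(f"{base_url}/{name}", partial)
        partial.replace(path)
    return path
```

The fetch parameter is only there so the download step can be swapped out,
for example with a stub in an offline test.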

On Nov 13, 2016 11:12 AM, "Andreas Beeker" <[email protected]> wrote:

> The idea was to include only the part of the test-data which is necessary
> for the tests, excluding the integration tests, AND to have a special corpus
> for integration tests, which can be downloaded on demand.
>
> The motivation was to keep the releases smaller.
>
> For the second part, it would be nice if we had different collections,
> e.g. poi-basic (the additional files, which are currently used for
> integration tests), tika (tika office corpus???), gov-docs, common-crawl,
> common-crawl-excel, common-crawl-10gb (only 10gb)
>
> Andi
>
>
>
> On 13.11.2016 18:05, Dominik Stadler wrote:
> > Hm, we are including the test-data directory in the sources as far as I
> > can see, so you should be able to run test-integration when you download
> > just the source-package, or am I missing something here?
> >
> > Dominik
> >
> > On Sun, Nov 13, 2016 at 4:57 PM, Javen O'Neal <[email protected]> wrote:
> >
> >> +1 for this idea.
> >>
> >> Possible solutions:
> >> 1) Publish the commands for a sparse svn checkout on the website. It
> >> looks like Subversion doesn't have a simple "svn checkout
> >> https://svn.apache.org/repos/asf/poi/trunk poi --exclude-dir test-data",
> >> but we could get the same behavior with a "--depth immediates" checkout,
> >> followed by a "--set-depth infinity" update of every listed directory
> >> except test-data (e.g. filtered with awk).
> >> This could be packaged into a bat/shell script, ant target, or Gradle
> >> target.
> >> 2) Retree test-data to be a sibling of trunk. We would need to have some
> >> way of pinning test-data so that old releases could be run against these
> >> documents without breaking.
> >> 3) Migrate away from asking users to check out the source using a
> >> Subversion client, using Gradle to perform this checkout instead
> >> (solution 1).
> >>
> >> On Nov 13, 2016 5:41 AM, "Andreas Beeker" <[email protected]> wrote:
> >>
> >>> Hi,
> >>>
> >>> our test corpus is constantly growing and I think this is good, as this
> >>> covers the edge-cases in the integration tests.
> >>>
> >>> But I wonder if we need to include those files in the releases, e.g. we
> >>> could make those files downloadable in case a user executes
> >>> test-integration. Or maybe we find a way to have a common corpus with
> >>> tika ... but it should be easy to download/test those with/-in the
> >>> ant/gradle scripts.
> >>>
> >>> What do you think?
> >>>
> >>> Andi
> >>>
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: [email protected]
> >>> For additional commands, e-mail: [email protected]
> >>>
> >>>
