Hi,

On 10/19/07, Keith R. Bennett <[EMAIL PROTECTED]> wrote:
> As a short term marginal improvement though, until we can do the full
> solution, would it make sense to consider .csv plain text?  I'm assuming
> that's just a matter of adding to tika-mimetypes.xml a line:
>
>                 <glob pattern="*.csv" />

Sounds OK. File an improvement request for that and feel free to
commit the change.

> Would doing so cause other problems though?  Should we consider non-.txt and
> non-.asc files binary unless byte header detection reports they're plain
> text?  Perhaps we should wait until that's working instead of adding the
> .csv glob pattern?

I don't see how that could cause problems. Anything that's a sequence
of characters should be fine as text/plain unless we have a more
specific type available.
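
As a rough sketch, the .csv change might look something like this in
tika-mimetypes.xml. The enclosing mime-type element and the existing
*.txt and *.asc globs are assumptions about the current text/plain entry,
not copied from the file:

```xml
<!-- Assumed shape of the existing text/plain entry; check the real file -->
<mime-type type="text/plain">
    <glob pattern="*.txt"/>
    <glob pattern="*.asc"/>
    <!-- proposed addition from this thread -->
    <glob pattern="*.csv"/>
</mime-type>
```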

In fact, based on http://www.apache.org/dev/svn-eol-style.txt we could
add the following as text/plain globs:

    <glob pattern="INSTALL"/>
    <glob pattern="KEYS"/>
    <glob pattern="Makefile"/>
    <glob pattern="README"/>
    <glob pattern="abs-linkmap"/>
    <glob pattern="abs-menulinks"/>
    <glob pattern="*.aart"/>
    <glob pattern="*.ac"/>
    <glob pattern="*.am"/>
    <glob pattern="*.bat"/>
    <glob pattern="*.c"/>
    <glob pattern="*.cat"/>
    <glob pattern="*.cgi"/>
    <glob pattern="*.classpath"/>
    <glob pattern="*.cmd"/>
    <glob pattern="*.conf"/>
    <glob pattern="*.config"/>
    <glob pattern="*.cpp"/>
    <glob pattern="*.css"/>
    <glob pattern="*.cwiki"/>
    <glob pattern="*.data"/>
    <glob pattern="*.dcl"/>
    <glob pattern="*.dtd"/>
    <glob pattern="*.egrm"/>
    <glob pattern="*.ent"/>
    <glob pattern="*.ft"/>
    <glob pattern="*.fn"/>
    <glob pattern="*.fv"/>
    <glob pattern="*.grm"/>
    <glob pattern="*.g"/>
    <glob pattern="*.h"/>
    <glob pattern=".htaccess"/>
    <glob pattern="*.ihtml"/>
    <glob pattern="*.in"/>
    <glob pattern="*.java"/>
    <glob pattern="*.jmx"/>
    <glob pattern="*.jsp"/>
    <glob pattern="*.js"/>
    <glob pattern="*.junit"/>
    <glob pattern="*.jx"/>
    <glob pattern="*.manifest"/>
    <glob pattern="*.m4"/>
    <glob pattern="*.mf"/>
    <glob pattern="*.MF"/>
    <glob pattern="*.meta"/>
    <glob pattern="*.mod"/>
    <glob pattern="*.n3"/>
    <glob pattern="*.pen"/>
    <glob pattern="*.pl"/>
    <glob pattern="*.pm"/>
    <glob pattern="*.pod"/>
    <glob pattern="*.pom"/>
    <glob pattern="*.project"/>
    <glob pattern="*.properties"/>
    <glob pattern="*.py"/>
    <glob pattern="*.rb"/>
    <glob pattern="*.rdf"/>
    <glob pattern="*.rnc"/>
    <glob pattern="*.rng"/>
    <glob pattern="*.rnx"/>
    <glob pattern="*.roles"/>
    <glob pattern="*.rss"/>
    <glob pattern="*.sh"/>
    <glob pattern="*.sql"/>
    <glob pattern="*.svg"/>
    <glob pattern="*.tld"/>
    <glob pattern="*.types"/>
    <glob pattern="*.vm"/>
    <glob pattern="*.vsl"/>
    <glob pattern="*.wsdd"/>
    <glob pattern="*.wsdl"/>
    <glob pattern="*.xargs"/>
    <glob pattern="*.xcat"/>
    <glob pattern="*.xconf"/>
    <glob pattern="*.xegrm"/>
    <glob pattern="*.xgrm"/>
    <glob pattern="*.xlex"/>
    <glob pattern="*.xlog"/>
    <glob pattern="*.xmap"/>
    <glob pattern="*.xroles"/>
    <glob pattern="*.xsamples"/>
    <glob pattern="*.xsd"/>
    <glob pattern="*.xsl"/>
    <glob pattern="*.xslt"/>
    <glob pattern="*.xsp"/>
    <glob pattern="*.xul"/>
    <glob pattern="*.xweb"/>
    <glob pattern="*.xwelcome"/>
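
Should a more specific type be defined for any of these later, the glob
would presumably move from text/plain to that type. A hypothetical sketch
using *.svg and image/svg+xml as the example (the mime-type element layout
is assumed here, and the image/svg+xml entry is illustrative only):

```xml
<!-- Generic fallback: plain character data with no better match -->
<mime-type type="text/plain">
    <glob pattern="*.sh"/>
    <glob pattern="*.sql"/>
</mime-type>

<!-- Hypothetical more specific entry; *.svg would live here instead -->
<mime-type type="image/svg+xml">
    <glob pattern="*.svg"/>
</mime-type>
```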

BR,

Jukka Zitting
