[ 
https://issues.apache.org/jira/browse/TIKA-1260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jukka Zitting resolved TIKA-1260.
---------------------------------

       Resolution: Not A Problem
    Fix Version/s:     (was: 1.5)

What you're seeing is the result of Tika using the file name as a hint about 
the file's type. If the name ends in {{.txt}} or a similar suffix, the file 
should probably be treated as text, even if it is empty. 
Only when no such hint is available does Tika fall back to 
{{application/octet-stream}}. See:

{code}
$ touch empty.txt
$ java -jar tika-app-1.5.jar --detect empty.txt
text/plain
$ java -jar tika-app-1.5.jar --detect < empty.txt
application/octet-stream
{code}
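
As a rough illustration of the fallback described above, here is a minimal 
sketch in Java. It is not Tika's actual detector: the class name 
NameHintSketch, the method detectEmpty, and the two-entry suffix table are 
made up for this example, and content sniffing is left out entirely. It only 
models what happens for a zero-byte input with and without a name hint.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical, simplified model of the name-hint fallback for empty input.
// This is NOT Tika's implementation; all names here are illustrative.
class NameHintSketch {
    static final String OCTET_STREAM = "application/octet-stream";

    // Tiny illustrative suffix table (assumption); a real registry is far larger.
    static final Map<String, String> BY_SUFFIX = Map.of(
            ".txt", "text/plain",
            ".xml", "application/xml");

    // Returns a media type for a zero-byte input.
    // name may be null, which models reading from stdin (no hint available).
    static String detectEmpty(String name) {
        if (name != null) {
            String lower = name.toLowerCase(Locale.ROOT);
            for (Map.Entry<String, String> e : BY_SUFFIX.entrySet()) {
                if (lower.endsWith(e.getKey())) {
                    return e.getValue(); // the name hint decides the type
                }
            }
        }
        // No usable hint and no content to inspect: generic fallback.
        return OCTET_STREAM;
    }

    public static void main(String[] args) {
        System.out.println(detectEmpty("empty.txt")); // file name given
        System.out.println(detectEmpty(null));        // no name, as with stdin
    }
}
```

The two calls in main correspond to the two command lines above: the first 
has a file name to go on, the second does not.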

> Detection result for zero-byte files is text/plain
> --------------------------------------------------
>
>                 Key: TIKA-1260
>                 URL: https://issues.apache.org/jira/browse/TIKA-1260
>             Project: Tika
>          Issue Type: Bug
>          Components: detector
>    Affects Versions: 1.5
>         Environment: Linux Mint 16 
>            Reporter: Johan van der Knijff
>            Priority: Minor
>              Labels: empty, zero-length
>
> When running Tika with the -d (detection) option, all zero-byte files are 
> identified as "text/plain". Is this the intended behavior? I know the Unix 
> file tool reports "inode/x-empty" in such cases. Perhaps Tika should do 
> this as well?



--
This message was sent by Atlassian JIRA
(v6.2#6252)
