[ 
https://issues.apache.org/jira/browse/TIKA-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15148629#comment-15148629
 ] 

Nick Burch commented on TIKA-1856:
----------------------------------

Picking one of those files to look at, {{oggz-info}} processes it without 
warnings. {{ogginfo}} warns that the EOS is missing on both streams, but 
otherwise reports no errors.

mplayer, however, reports some issues with the file:
{code}
[vorbis @ 0x7f1470f5cb00]partition out of bounds: type, begin, end, size, 
blocksize: 2, 0, 192, 16, 1024
[vorbis @ 0x7f1470f5cb00] Vorbis setup header packet corrupt (residues). 
[vorbis @ 0x7f1470f5cb00]Setup header corrupt.
Could not open codec.
{code}

Do you know where these files came from? It looks like they have been truncated 
somehow. Could that be the case?

(If so, we'd probably just need to improve the truncation error handling.)
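
As a rough illustration of what more tolerant truncation handling could look like, here is a minimal, self-contained sketch. It is not the vorbis-java or Tika code: the class {{OggTruncationCheck}}, its {{Result}} enum and both methods are hypothetical names made up for this example. It relies only on the standard Ogg page layout (a 27-byte header starting with "OggS", then a segment table whose lacing values sum to the payload length) and reports a short read as a truncated page instead of throwing.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration only; not part of Tika or vorbis-java.
class OggTruncationCheck {
    enum Result { NOT_OGG, COMPLETE_PAGE, TRUNCATED_PAGE }

    // Fixed part of an Ogg page header: "OggS", version, flags, granule
    // position, serial number, sequence number, checksum, segment count.
    private static final int HEADER_SIZE = 27;

    /** Fills buf as far as possible and returns the byte count actually read. */
    static int readUpTo(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n < 0) {
                break; // EOF: report the short read rather than throwing
            }
            total += n;
        }
        return total;
    }

    /** Classifies the first page of the stream without throwing on EOF. */
    static Result checkFirstPage(InputStream in) throws IOException {
        byte[] header = new byte[HEADER_SIZE];
        int got = readUpTo(in, header);
        if (got < 4 || header[0] != 'O' || header[1] != 'g'
                || header[2] != 'g' || header[3] != 'S') {
            return Result.NOT_OGG;
        }
        if (got < HEADER_SIZE) {
            return Result.TRUNCATED_PAGE;
        }
        // Byte 26 holds the number of entries in the segment table.
        int segments = header[26] & 0xFF;
        byte[] lacing = new byte[segments];
        if (readUpTo(in, lacing) < segments) {
            return Result.TRUNCATED_PAGE;
        }
        // The payload length is the sum of the lacing values.
        int payload = 0;
        for (byte b : lacing) {
            payload += b & 0xFF;
        }
        byte[] data = new byte[payload];
        return readUpTo(in, data) < payload
                ? Result.TRUNCATED_PAGE
                : Result.COMPLETE_PAGE;
    }

    /** Convenience overload for in-memory data. */
    static Result checkFirstPage(byte[] bytes) {
        try {
            return checkFirstPage(new ByteArrayInputStream(bytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With something along these lines, a detector could still report the file as Ogg when the capture pattern matches and treat {{TRUNCATED_PAGE}} as a warning rather than a failure. Whether that is the right behaviour for Tika is a separate question.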

> Error while parsing an ogg file
> -------------------------------
>
>                 Key: TIKA-1856
>                 URL: https://issues.apache.org/jira/browse/TIKA-1856
>             Project: Tika
>          Issue Type: Bug
>          Components: detector, parser
>    Affects Versions: 1.12
>         Environment: python
>            Reporter: Yash Tanna
>              Labels: newbie, tika
>         Attachments: 
> 1B7A7AE8FE999D22E2A677EFDA38982C8957CF77BEF33717777E48852F7D67A7, 
> 1DE811ACAB8432D526EFE9D941E5EFE58F3C89F1AAB6CB7152091961DD854431, 
> 4600B9FF184F6AB71AA0CF6873E580FB0A31D75CE1218998057E9A185A5FFBB2, 
> 5E5892EA6C2B4A07BE998403A04127C7924E5539DB3EB0D27B9BD34D11A1575B, 
> CA3065B754E6CE79E4BF128464F4A202B0F2CF0336FBE73FA33F13776CD01CE8, 
> F036789D92EE18032556D9D0ECAC75073CED52226E1833001E379740E23E183D, 
> F33BFE4B1AF562D40E5B9D9F5D4B34EA6734F8F3A06F99535F100F957958D9BA, 
> F47F833BFD4A7E55C128DD76DB3666EEFFD0F5EDA24BF3EEEE1D6F2427BA092D, 
> FA9D1D2B8D0FB50CFE306FA6024EC48BD771562878B9B70D38D106DF4E61147A
>
>
> Unable to detect a malformed ogg file. The error thrown was 
> Exception in thread "main" java.io.IOException: Asked to read 4335 bytes
> from 0 but hit EoF at 780
>         at org.gagravarr.ogg.IOUtils.readFully(IOUtils.java:39)
>         at org.gagravarr.ogg.IOUtils.readFully(IOUtils.java:31)
>         at org.gagravarr.ogg.OggPage.<init>(OggPage.java:82)
>         at
> org.gagravarr.ogg.OggPacketReader.getNextPacket(OggPacketReader.java:116)
>         at org.gagravarr.tika.OggDetector.detect(OggDetector.java:97)
>         at
> org.apache.tika.detect.CompositeDetector.detect(CompositeDetector.java:61)
>         at org.apache.tika.cli.TikaCLI$10.process(TikaCLI.java:291)
>         at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:477)
>         at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:134)
> [xdatadeploy@xdata upload]$



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
