[jira] [Created] (TIKA-851) M4V magic detection invalid

2012-01-27 Thread Alexander Chow (Created) (JIRA)
M4V magic detection invalid --- Key: TIKA-851 URL: https://issues.apache.org/jira/browse/TIKA-851 Project: Tika Issue Type: Bug Components: mime Affects Versions: 1.0 Reporter: Alexander Chow

[jira] [Commented] (TIKA-851) M4V magic detection invalid

2012-01-27 Thread Nick Burch (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194790#comment-13194790 ] Nick Burch commented on TIKA-851: - I'm not sure if we're going to be able to differentiate b

[jira] [Updated] (TIKA-851) M4V and M4A detection invalid

2012-01-27 Thread Alexander Chow (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Chow updated TIKA-851: Description: When the mime type of an M4V file is detected using its name only, it returns video/x-m

[jira] [Resolved] (TIKA-851) M4V and M4A detection invalid

2012-01-27 Thread Nick Burch (Resolved) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-851. - Resolution: Fixed Fix Version/s: 1.1 > M4V and M4A detection invalid > -

[jira] [Commented] (TIKA-851) M4V and M4A detection invalid

2012-01-27 Thread Nick Burch (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194813#comment-13194813 ] Nick Burch commented on TIKA-851: - It looks like most files (not sure if it's all of them th

[jira] [Updated] (TIKA-851) M4V and M4A detection invalid

2012-01-27 Thread Alexander Chow (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Chow updated TIKA-851: Attachment: TIKA-851.patch I've added a patch file that I think should fix the problem for both M4V a

[jira] [Commented] (TIKA-851) M4V and M4A detection invalid

2012-01-27 Thread Alexander Chow (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194829#comment-13194829 ] Alexander Chow commented on TIKA-851: - Sorry Nick, I didn't notice you update the SVN.

[jira] [Commented] (TIKA-851) M4V and M4A detection invalid

2012-01-27 Thread Nick Burch (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194854#comment-13194854 ] Nick Burch commented on TIKA-851: - From http://developer.apple.com/library/mac/#documenta

[jira] [Commented] (TIKA-851) M4V and M4A detection invalid

2012-01-27 Thread Nick Burch (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194891#comment-13194891 ] Nick Burch commented on TIKA-851: - I've added the audio/x-m4a alias in r1236734.

buildbot failure in ASF Buildbot on tika-trunk

2012-01-27 Thread buildbot
The Buildbot has detected a new failure on builder tika-trunk while building ASF Buildbot. Full details are available at: http://ci.apache.org/builders/tika-trunk/builds/724 Buildbot URL: http://ci.apache.org/ Buildslave for this Build: isis_ubuntu Build Reason: scheduler Build Source Stamp: [

[jira] [Commented] (TIKA-842) IPTC Properties Should be Defined Completely and Independently of the Drew Library

2012-01-27 Thread Nick Burch (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194916#comment-13194916 ] Nick Burch commented on TIKA-842: - Following the confirmation from the IPTC that we can use

Build failed in Jenkins: Tika-trunk #785

2012-01-27 Thread Apache Jenkins Server
See Changes: [nick] TIKA-842 IPTC Metadata Properties, including full descriptions of all the properties taken from the Specification, along with appropriate License/Notice information for this. [nick] TIKA-851 Another mp4 audio alias ---

buildbot success in ASF Buildbot on tika-trunk

2012-01-27 Thread buildbot
The Buildbot has detected a restored build on builder tika-trunk while building ASF Buildbot. Full details are available at: http://ci.apache.org/builders/tika-trunk/builds/725 Buildbot URL: http://ci.apache.org/ Buildslave for this Build: isis_ubuntu Build Reason: scheduler Build Source Stamp

Jenkins build is back to normal : Tika-trunk #786

2012-01-27 Thread Apache Jenkins Server
See

[jira] [Commented] (TIKA-851) M4V and M4A detection invalid

2012-01-27 Thread Alexander Chow (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195001#comment-13195001 ] Alexander Chow commented on TIKA-851: - Thanks Nick for adding the alias.

[jira] [Commented] (TIKA-851) M4V and M4A detection invalid

2012-01-27 Thread Alexander Chow (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195027#comment-13195027 ] Alexander Chow commented on TIKA-851: - Nick, although you add the ftyp for M4B (the book

[jira] [Issue Comment Edited] (TIKA-851) M4V and M4A detection invalid

2012-01-27 Thread Alexander Chow (Issue Comment Edited) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195027#comment-13195027 ] Alexander Chow edited comment on TIKA-851 at 1/27/12 7:11 PM: --

% of different content types out there on the web

2012-01-27 Thread Mattmann, Chris A (388J)
(sorry for the cross post) Hey Guys, I'm trying to find a good citation or estimate (if anyone has done one) of the breakout (by % or some other metric) of content types out there on the web (with a whole web crawl or a meaningful representative dataset) that are non-HTML. Anyone