[ 
https://issues.apache.org/jira/browse/TIKA-3408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17348679#comment-17348679
 ] 

Danny McKinney commented on TIKA-3408:
--------------------------------------

ExifTool Version Number : 12.25
File Name : B1.PC.000161970-10-min.mp4
Directory : .
File Size : 22 MiB
File Modification Date/Time : 2021:05:19 16:10:28-05:00
File Access Date/Time : 2021:05:20 13:28:04-05:00
File Creation Date/Time : 2021:05:19 16:10:00-05:00
File Permissions : -rw-rw-rw-
File Type : MP4
File Type Extension : mp4
MIME Type : video/mp4
Major Brand : MP4 Base Media v1 [IS0 14496-12:2003]
Minor Version : 0.2.0
Compatible Brands : isom, iso2, avc1, mp41
Media Data Size : 22682524
Media Data Offset : 48
Movie Header Version : 0
{color:#ffab00}Create Date : 0000:00:00 00:00:00{color}
{color:#ffab00}Modify Date : 0000:00:00 00:00:00{color}
Time Scale : 1000
Duration : 0:00:50
Preferred Rate : 1
Preferred Volume : 100.00%
Preview Time : 0 s
Preview Duration : 0 s
Poster Time : 0 s
Selection Time : 0 s
Selection Duration : 0 s
Current Time : 0 s
Next Track ID : 3
Track Header Version : 0
{color:#ffab00}Track Create Date : 0000:00:00 00:00:00{color}
{color:#ffab00}Track Modify Date : 0000:00:00 00:00:00{color}
Track ID : 1
Track Duration : 0:00:50
Track Layer : 0
Track Volume : 0.00%
Image Width : 1280
Image Height : 720
Graphics Mode : srcCopy
Op Color : 0 0 0
Compressor ID : avc1
Source Image Width : 1280
Source Image Height : 720
X Resolution : 72
Y Resolution : 72
Bit Depth : 24
Buffer Size : 0
Max Bitrate : 3500685
Average Bitrate : 3500685
Video Frame Rate : 30
Matrix Structure : 1 0 0 0 1 0 0 0 1
Media Header Version : 0
{color:#ffab00}Media Create Date : 0000:00:00 00:00:00{color}
{color:#ffab00}Media Modify Date : 0000:00:00 00:00:00{color}
Media Time Scale : 44100
Media Duration : 0:00:50
Media Language Code : und
Handler Description : Core Media Audio
Balance : 0
Audio Format : mp4a
Audio Channels : 2
Audio Bits Per Sample : 16
Audio Sample Rate : 44100
Handler Type : Metadata
Handler Vendor ID : Apple
Encoder : Lavf58.76.100
Image Size : 1280x720
Megapixels : 0.922
Avg Bitrate : 3.63 Mbps
Rotation : 0

 

The date/time values highlighted above are actually set to 0 in the metadata of 
the mp4 file. Tika appears to report these zero values as the classic Mac OS 
epoch date, "1904-01-01T00:00:00Z". Is there any way to change this default 
behavior? The following is the output from a sample project, along with the code 
from that project:

 

May 20, 2021 1:37:00 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: org.xerial's sqlite-jdbc is not loaded.
Please provide the jar on your classpath to parse sqlite files.
See tika-parsers/pom.xml for the correct version.
Lavf58.76.100

date: 1904-01-01T00:00:00Z
X-Parsed-By: org.apache.tika.parser.DefaultParser
xmp:CreatorTool: Lavf58.76.100
{color:#ffab00}meta:creation-date: 1904-01-01T00:00:00Z{color}
{color:#ffab00}Creation-Date: 1904-01-01T00:00:00Z{color}
tiff:ImageLength: 720
{color:#ffab00}dcterms:created: 1904-01-01T00:00:00Z{color}
{color:#ffab00}dcterms:modified: 1904-01-01T00:00:00Z{color}
{color:#ffab00}Last-Modified: 1904-01-01T00:00:00Z{color}
{color:#ffab00}Last-Save-Date: 1904-01-01T00:00:00Z{color}
xmpDM:audioSampleRate: 1000
{color:#ffab00}meta:save-date: 1904-01-01T00:00:00Z{color}
{color:#ffab00}modified: 1904-01-01T00:00:00Z{color}
tiff:ImageWidth: 1280
xmpDM:duration: 50.0
Content-Type: video/mp4

BUILD SUCCESSFUL in 5s
2 actionable tasks: 2 executed
1:37:03 PM: Task execution finished 'Main.main()'.

 

Program: 

===================================================================================

import org.apache.tika.exception.TikaException;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.Parser;
import org.apache.tika.sax.BodyContentHandler;
import org.xml.sax.SAXException;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

public class Main {
    public static void main(String[] args) {
        File file = new File(".\\Data\\B1.PC.000161970-10-min.mp4");

        // Parser method parameters
        Parser parser = new AutoDetectParser();
        BodyContentHandler handler = new BodyContentHandler();
        Metadata metadata = new Metadata();
        ParseContext context = new ParseContext();

        // try-with-resources closes the stream even if parsing fails
        try (FileInputStream is = new FileInputStream(file)) {
            parser.parse(is, handler, metadata, context);

            // Print the extracted text content
            System.out.println(handler);

            // Print every metadata name with its (first) value
            for (String name : metadata.names()) {
                System.out.println(name + ": " + metadata.get(name));
            }
        } catch (TikaException | IOException | SAXException e) {
            e.printStackTrace();
        }
    }
}

 

Version of Tika Core and Parsers used was 1.26.
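
For reference, here is a minimal sketch of why a stored value of 0 shows up as 1904, plus one possible caller-side workaround. MP4/QuickTime headers store timestamps as seconds since 1904-01-01T00:00:00Z, so 0 maps exactly to that instant. The class name Mp4EpochSketch and its helper methods are hypothetical, and the plain Map only stands in for the name/value pairs printed above. This is not a Tika configuration option, just a filter applied after parsing, and it assumes that a value exactly equal to the epoch means "not set".

```java
import java.time.Instant;
import java.time.format.DateTimeParseException;
import java.util.LinkedHashMap;
import java.util.Map;

public class Mp4EpochSketch {
    // MP4/QuickTime timestamps count seconds from this instant.
    static final Instant MP4_EPOCH = Instant.parse("1904-01-01T00:00:00Z");

    // Convert a raw header timestamp (seconds since 1904) to an Instant.
    static Instant fromMp4Seconds(long seconds) {
        return MP4_EPOCH.plusSeconds(seconds);
    }

    // True when the value is an ISO-8601 instant equal to the 1904 epoch,
    // which this sketch treats as "date not set" (an assumption).
    static boolean isUnsetMp4Date(String value) {
        if (value == null) {
            return false;
        }
        try {
            return Instant.parse(value).equals(MP4_EPOCH);
        } catch (DateTimeParseException e) {
            return false; // not a date at all, e.g. "video/mp4" or "50.0"
        }
    }

    // Return a copy of the metadata without the "unset" date entries.
    static Map<String, String> withoutUnsetDates(Map<String, String> metadata) {
        Map<String, String> cleaned = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : metadata.entrySet()) {
            if (!isUnsetMp4Date(entry.getValue())) {
                cleaned.put(entry.getKey(), entry.getValue());
            }
        }
        return cleaned;
    }

    public static void main(String[] args) {
        // A raw value of 0 is exactly the epoch seen in the Tika output.
        System.out.println(fromMp4Seconds(0));

        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("dcterms:created", "1904-01-01T00:00:00Z");
        metadata.put("xmpDM:duration", "50.0");
        metadata.put("Content-Type", "video/mp4");

        // Only the non-date entries remain after filtering.
        System.out.println(withoutUnsetDates(metadata));
    }
}
```

In the sample program above, the same check could be applied inside the loop over metadata.names() to skip any value that equals the 1904 epoch before printing it.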

> Apache Tika 1.26 Metadata for MP4 and MP3.
> ------------------------------------------
>
>                 Key: TIKA-3408
>                 URL: https://issues.apache.org/jira/browse/TIKA-3408
>             Project: Tika
>          Issue Type: Bug
>          Components: core, parser
>    Affects Versions: 1.26
>            Reporter: Danny McKinney
>            Priority: Minor
>
> Currently the parser is returning incorrect date information from mp3 files and 
> mpeg4 files. Our sample is returning date fields with epoch date values, which 
> start at 1904. Also, the mp3 file is not returning a date value although one is 
> part of the header information. I have attached a sample program and data files.
>  
> Files (Upload Did not Work): 
> [https://drive.google.com/file/d/1qQmRcqABkwfrR1uuO_m3scXl2dKzxtv4/view?usp=sharing]



