[jira] [Commented] (PDFBOX-2356) Error Validating PDF Archive Document with half hour timezone

2014-09-18 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139126#comment-14139126
 ] 

Tilman Hausherr commented on PDFBOX-2356:
-

You can get the snapshot version here in a few hours:
https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/preflight/1.8.8-SNAPSHOT/
(look at the date)

> Error Validating PDF Archive Document with half hour timezone
> -
>
> Key: PDFBOX-2356
> URL: https://issues.apache.org/jira/browse/PDFBOX-2356
> Project: PDFBox
> Issue Type: Bug
> Components: Preflight
> Affects Versions: 1.8.4, 1.8.5, 1.8.6, 1.8.7, 1.8.8, 2.0.0
> Reporter: Cetra Free
> Assignee: Tilman Hausherr
> Fix For: 1.8.8, 2.0.0
>
> Attachments: pdfafile.pdf
>
>
> When trying to validate a PDF archive file (attached to this ticket) we get 
> the following error:
> {code}
> 7.2   - Error on MetaData, ModificationDate present in the document catalog 
> dictionary doesn't match with XMP information
> {code}
> This is because the Modification Date in the Dictionary is parsed 
> differently from the XMP Metadata.  The XMP Metadata is correct, but the Date 
> from the Dictionary has an extra 30 minutes added.
> The following is the raw COSObject from the PDF File:
> {code}
> COSString{D:20140917122850+09'30'}
> {code}
> The Long value should be *141092273*
> The *org.apache.pdfbox.util.DateConverter* *parseDate* method returns the 
> Date with Long *141092453* which is 30 minutes ahead.
> XMP Modification Date is parsed differently and returns the correct date.
> This means that validation will fail for PDF Archives.
> My suggestion would be to refactor the parseDate function to use the Standard 
> Java library.
> Here's an example class which will be compatible with the PDF Specification:
> {code}
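> // Needs java.text.ParseException, java.text.SimpleDateFormat and java.util.*
> static class DateParser {
>   // Maps the length of a date pattern to the formatter for that pattern.
>   private Map<Integer, SimpleDateFormat> formats =
>     new HashMap<Integer, SimpleDateFormat>();
>
>   public DateParser() {
>     String expr = "";
>
>     // Build one formatter per prefix: yyyy, yyyyMM, ..., yyyyMMddHHmmssZ.
>     for (String part : Arrays.asList("yyyy", "MM", "dd", "HH", "mm", "ss", "Z")) {
>       expr = expr + part;
>       formats.put(expr.length(), new SimpleDateFormat(expr));
>     }
>   }
>
>   public Calendar parseDate(String expr) {
>     try {
>       // Strip the "D:" prefix and the apostrophes; "Z00'00'" becomes "+0000".
>       expr = expr.replace("D:", "").replace("'", "").replace("Z", "+");
>       // Anything with an offset suffix uses the full 15-character pattern.
>       Date date = formats.get(Math.min(expr.length(), 15)).parse(expr);
>
>       Calendar calendar = Calendar.getInstance();
>       calendar.setTime(date);
>
>       return calendar;
>     } catch (ParseException e) {
>       return null;
>     }
>   }
> }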
> {code}
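
The arithmetic in the description can be checked independently of PDFBox. The following is only a sketch using java.time rather than the DateConverter code under discussion; the class name PdfDateEpochSketch and its parse helper are made up for illustration, and it assumes the full date form with a signed HH'mm' offset as in the attached file. It gives 1410922730 seconds since the epoch, whose leading digits agree with the "should be" figure quoted above, and a value 1800 seconds larger would correspond to the 30 minute error.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;

final class PdfDateEpochSketch {
    // Full form only: yyyyMMddHHmmss followed by a +HHmm / -HHmm offset.
    private static final DateTimeFormatter FULL =
            DateTimeFormatter.ofPattern("uuuuMMddHHmmssxx");

    // Hypothetical helper: drops the "D:" prefix and the apostrophes, then parses.
    static OffsetDateTime parse(String pdfDate) {
        String s = pdfDate.startsWith("D:") ? pdfDate.substring(2) : pdfDate;
        s = s.replace("'", "");
        return OffsetDateTime.parse(s, FULL);
    }

    public static void main(String[] args) {
        OffsetDateTime t = parse("D:20140917122850+09'30'");
        System.out.println(t.toInstant());         // 2014-09-17T02:58:50Z
        System.out.println(t.toEpochSecond());     // 1410922730
        System.out.println(t.toEpochSecond() + 1800); // the value that is 30 minutes ahead
    }
}
```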



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PDFBOX-2356) Error Validating PDF Archive Document with half hour timezone

2014-09-18 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139114#comment-14139114
 ] 

ASF subversion and git services commented on PDFBOX-2356:
-

Commit 1626017 from [~tilman] in branch 'pdfbox/branches/1.8'
[ https://svn.apache.org/r1626017 ]

PDFBOX-2356: add tests for part hour timezones



[jira] [Commented] (PDFBOX-2356) Error Validating PDF Archive Document with half hour timezone

2014-09-18 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139113#comment-14139113
 ] 

ASF subversion and git services commented on PDFBOX-2356:
-

Commit 1626016 from [~tilman] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1626016 ]

PDFBOX-2356: add tests for part hour timezones



[jira] [Commented] (PDFBOX-2356) Error Validating PDF Archive Document with half hour timezone

2014-09-18 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139110#comment-14139110
 ] 

ASF subversion and git services commented on PDFBOX-2356:
-

Commit 1626015 from [~tilman] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1626015 ]

PDFBOX-2356: correct handling of part hour timezones



[jira] [Commented] (PDFBOX-2356) Error Validating PDF Archive Document with half hour timezone

2014-09-18 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139109#comment-14139109
 ] 

ASF subversion and git services commented on PDFBOX-2356:
-

Commit 1626014 from [~tilman] in branch 'pdfbox/branches/1.8'
[ https://svn.apache.org/r1626014 ]

PDFBOX-2356: correct handling of part hour timezones



[jira] [Commented] (PDFBOX-2356) Error Validating PDF Archive Document with half hour timezone

2014-09-18 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139105#comment-14139105
 ] 

Tilman Hausherr commented on PDFBOX-2356:
-

I will also add tests for the timezone "Australia/Adelaide" in your honour :-)



[jira] [Commented] (PDFBOX-2356) Error Validating PDF Archive Document with half hour timezone

2014-09-17 Thread Cetra Free (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138591#comment-14138591
 ] 

Cetra Free commented on PDFBOX-2356:


Well, it's actually Adelaide/Australia!

Which is +9:30 or +10:30 depending on the time of the year 
(http://en.wikipedia.org/wiki/Australian_Time)

There is also a +8:45 time zone just to make things confusing.
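
For reference, these offsets can be read from the JDK's time zone data. This is just an illustrative sketch: the class and method names are invented, and Australia/Eucla is assumed to be the +8:45 zone meant here.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;

final class AustralianOffsetsSketch {
    // Offset from UTC that applies in the given zone at the given instant.
    static ZoneOffset offsetAt(String zoneId, String isoInstant) {
        return ZoneId.of(zoneId).getRules().getOffset(Instant.parse(isoInstant));
    }

    public static void main(String[] args) {
        // Standard time in Adelaide (September, before daylight saving starts).
        System.out.println(offsetAt("Australia/Adelaide", "2014-09-17T00:00:00Z")); // +09:30
        // Daylight saving time in Adelaide (January).
        System.out.println(offsetAt("Australia/Adelaide", "2014-01-15T00:00:00Z")); // +10:30
        // The 45 minute case.
        System.out.println(offsetAt("Australia/Eucla", "2014-09-17T00:00:00Z"));    // +08:45
    }
}
```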



[jira] [Commented] (PDFBOX-2356) Error Validating PDF Archive Document

2014-09-17 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138580#comment-14138580
 ] 

Tilman Hausherr commented on PDFBOX-2356:
-

1) My change does work with your file, but I will expand the tests tonight 
before committing. The changes will then be available in the snapshot, and in 
the next release in a few months.
2) Your issue indicates that PDFBox preflight has never been used in India.
http://geography.about.com/od/culturalgeography/a/offsettimezones.htm
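
To get a feeling for how many zones are affected, one could list every zone known to the JDK whose offset is not a whole number of hours. This is a rough sketch with made-up names, not part of PDFBox or its tests; the result depends on the time zone data shipped with the JDK and on the instant chosen.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.util.SortedSet;
import java.util.TreeSet;

final class PartHourZonesSketch {
    // Collects all zone IDs whose offset at 'when' is not a multiple of one hour.
    static SortedSet<String> partHourZones(Instant when) {
        SortedSet<String> result = new TreeSet<>();
        for (String id : ZoneId.getAvailableZoneIds()) {
            int seconds = ZoneId.of(id).getRules().getOffset(when).getTotalSeconds();
            if (seconds % 3600 != 0) {
                result.add(id);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The instant is arbitrary; it is the date of the file in this issue.
        for (String id : partHourZones(Instant.parse("2014-09-17T00:00:00Z"))) {
            System.out.println(id);
        }
    }
}
```

Asia/Kolkata and Australia/Adelaide should both show up in the output, while whole-hour zones such as Europe/Berlin should not.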



[jira] [Commented] (PDFBOX-2356) Error Validating PDF Archive Document

2014-09-17 Thread Cetra Free (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138197#comment-14138197
 ] 

Cetra Free commented on PDFBOX-2356:


I'm just using the code from here:

http://pdfbox.apache.org/cookbook/pdfavalidation.html

{code}
ValidationResult result = null;

FileDataSource fd = new FileDataSource(args[0]);
PreflightParser parser = new PreflightParser(fd);
try {

  /* Parse the PDF file with PreflightParser that inherits from the
   * NonSequentialParser.
   * Some additional controls are present to check a set of PDF/A requirements.
   * (Stream length consistency, EOL after some Keyword...)
   */
  parser.parse();

  /* Once the syntax validation is done,
   * the parser can provide a PreflightDocument
   * (that inherits from PDDocument).
   * This document processes the end of PDF/A validation.
   */
  PreflightDocument document = parser.getPreflightDocument();
  document.validate();

  // Get validation result
  result = document.getResult();
  document.close();

} catch (SyntaxValidationException e) {
  /* The parse method can throw a SyntaxValidationException
   * if the PDF file can't be parsed.
   * In this case, the exception contains an instance of ValidationResult.
   */
  result = e.getResult();
}

// display validation result
if (result.isValid()) {
  System.out.println("The file " + args[0] + " is a valid PDF/A-1b file");
} else {
  System.out.println("The file " + args[0] + " is not valid, error(s) :");
  for (ValidationError error : result.getErrorsList()) {
    System.out.println(error.getErrorCode() + " : " + error.getDetails());
  }
}
{code}




[jira] [Commented] (PDFBOX-2356) Error Validating PDF Archive Document

2014-09-17 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137490#comment-14137490
 ] 

Tilman Hausherr commented on PDFBOX-2356:
-

Are you building from source? If yes, please try this:
{code}
private static void adjustTimeZoneNicely(GregorianCalendar cal, TimeZone tz)
{
    cal.setTimeZone(tz);
    // Total offset from UTC in minutes, so part hour zones keep their minutes.
    int offset = (cal.get(Calendar.ZONE_OFFSET) + cal.get(Calendar.DST_OFFSET))
            / MILLIS_PER_MINUTE;
    cal.add(Calendar.MINUTE, -offset);
}
{code}
If not, please post minimal code or the command line that you used to check 
your file (I never use preflight) and I'll test it.
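
To try the helper outside of PDFBox, something like the following could be used. It is only a sketch: the class name, the harness method and the local MILLIS_PER_MINUTE constant are made up, and it assumes that the calendar fields were first filled in as if the date were UTC, which is an assumption about the surrounding DateConverter code that is not shown here.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.SimpleTimeZone;
import java.util.TimeZone;

final class AdjustTimeZoneSketch {
    private static final int MILLIS_PER_MINUTE = 60 * 1000;

    // The helper as quoted above.
    static void adjustTimeZoneNicely(GregorianCalendar cal, TimeZone tz) {
        cal.setTimeZone(tz);
        int offset = (cal.get(Calendar.ZONE_OFFSET) + cal.get(Calendar.DST_OFFSET))
                / MILLIS_PER_MINUTE;
        cal.add(Calendar.MINUTE, -offset);
    }

    // Hypothetical harness for D:20140917122850+09'30'.
    static GregorianCalendar adjustedExample() {
        GregorianCalendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        cal.clear();
        cal.set(2014, Calendar.SEPTEMBER, 17, 12, 28, 50);
        cal.getTimeInMillis(); // fix the instant while the zone is still UTC
        // +09:30 is 570 minutes east of UTC.
        adjustTimeZoneNicely(cal, new SimpleTimeZone(570 * MILLIS_PER_MINUTE, "GMT+09:30"));
        return cal;
    }

    public static void main(String[] args) {
        GregorianCalendar cal = adjustedExample();
        System.out.println(cal.getTimeInMillis());           // 1410922730000
        System.out.println(cal.get(Calendar.HOUR_OF_DAY));   // 12
        System.out.println(cal.get(Calendar.MINUTE));        // 28
    }
}
```

Because the offset is computed in minutes, the local fields still read 12:28 after the adjustment and the instant becomes 02:58:50 UTC.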

