[jira] [Created] (TIKA-1057) The document property "Status" is not extracted for *.doc files

2013-01-16 Thread Thomas Stroeter (JIRA)
Thomas Stroeter created TIKA-1057:
-

     Summary: The document property "Status" is not extracted for *.doc files
         Key: TIKA-1057
         URL: https://issues.apache.org/jira/browse/TIKA-1057
     Project: Tika
  Issue Type: Bug
 Environment: java 1.5 / Windows
    Reporter: Thomas Stroeter
    Priority: Minor


I would like to use Tika to extract the document property "Status"
from a Word 97-2003 *.doc file.

Tika dumps the document status property correctly from XML-based *.docx files as
"Content-Status" and "cp:contentStatus", but I cannot extract this metadata
from a *.doc Word document using Tika.

Nevertheless, Word 2010 has no problem setting and reading this document
metadata in a *.doc file. (Attached are two example files.)

Is there a way to extract this information with Tika for *.doc files, too?
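
To make the report easier to reproduce, here is a minimal sketch of the check described above: given a metadata dump as plain key/value pairs, look for the status under either of the two key names that the *.docx output uses. It deliberately does not call Tika; the class name StatusLookup, the helper findStatus, and the sample value "Draft" are all hypothetical, and the maps stand in for whatever the parser actually returns.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

public class StatusLookup {
    // The two key names the issue reports for *.docx output.
    static final String[] STATUS_KEYS = {"Content-Status", "cp:contentStatus"};

    // Returns the first non-empty value stored under one of the known keys.
    static Optional<String> findStatus(Map<String, String> metadata) {
        for (String key : STATUS_KEYS) {
            String value = metadata.get(key);
            if (value != null && !value.isEmpty()) {
                return Optional.of(value);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical dump for a *.docx file: the status key is present.
        Map<String, String> docx = new LinkedHashMap<>();
        docx.put("Content-Type",
                "application/vnd.openxmlformats-officedocument.wordprocessingml.document");
        docx.put("cp:contentStatus", "Draft");

        // Hypothetical dump for a *.doc file: no status key, as reported.
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("Content-Type", "application/msword");

        System.out.println("docx status: " + findStatus(docx).orElse("<missing>"));
        System.out.println("doc status:  " + findStatus(doc).orElse("<missing>"));
    }
}
```

Running the same lookup against the real metadata of both attached files would show whether the key is absent for *.doc or merely stored under a different name.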

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (TIKA-1057) The document property "Status" is not extracted for *.doc files

2013-01-16 Thread Thomas Stroeter (JIRA)

 [ 
https://issues.apache.org/jira/browse/TIKA-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Stroeter updated TIKA-1057:
--

Description: 
I would like to use Tika to extract the document property "Status"
from a Word 97-2003 *.doc file.

Tika dumps the document status property correctly from XML-based *.docx files as
"Content-Status" and "cp:contentStatus", but I cannot extract this metadata
from a *.doc Word document using Tika.

Nevertheless, Word 2010 has no problem setting and reading this document
metadata in a *.doc file.

Is there a way to extract this information with Tika for *.doc files, too?

  was:
I would like to use Tika to extract the document property "Status"
from a Word 97-2003 *.doc file.

Tika dumps the document status property correctly from XML-based *.docx files as
"Content-Status" and "cp:contentStatus", but I cannot extract this metadata
from a *.doc Word document using Tika.

Nevertheless, Word 2010 has no problem setting and reading this document
metadata in a *.doc file. (Attached are two example files.)

Is there a way to extract this information with Tika for *.doc files, too?


> The document property "Status" is not extracted for *.doc files
> ---
>
> Key: TIKA-1057
> URL: https://issues.apache.org/jira/browse/TIKA-1057
> Project: Tika
>  Issue Type: Bug
> Environment: java 1.5 / Windows
>Reporter: Thomas Stroeter
>Priority: Minor
>
> I would like to use Tika to extract the document property "Status"
> from a Word 97-2003 *.doc file.
>
> Tika dumps the document status property correctly from XML-based *.docx files
> as "Content-Status" and "cp:contentStatus", but I cannot extract this
> metadata from a *.doc Word document using Tika.
> Nevertheless, Word 2010 has no problem setting and reading this document
> metadata in a *.doc file.
> Is there a way to extract this information with Tika for *.doc files, too?



[jira] [Updated] (TIKA-1057) document content property "Status" is not extracted for *.doc files

2013-01-16 Thread Thomas Stroeter (JIRA)

 [ 
https://issues.apache.org/jira/browse/TIKA-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Stroeter updated TIKA-1057:
--

Description: 
I would like to use Tika to extract the document property "Status" from a Word
97-2003 *.doc file.

Tika dumps the document status property correctly from XML-based *.docx files as
"Content-Status" and "cp:contentStatus", but I cannot extract this metadata
from a *.doc Word document using Tika.

Nevertheless, Word 2010 has no problem setting and reading this document
metadata in a *.doc file.

Is there a way to extract this information with Tika for *.doc files, too?

  was:
I would like to use Tika to extract the document property "Status"
from a Word 97-2003 *.doc file.

Tika dumps the document status property correctly from XML-based *.docx files as
"Content-Status" and "cp:contentStatus", but I cannot extract this metadata
from a *.doc Word document using Tika.

Nevertheless, Word 2010 has no problem setting and reading this document
metadata in a *.doc file.

Is there a way to extract this information with Tika for *.doc files, too?

Environment: java 1.5/1.6 / Windows 7  (was: java 1.5 / Windows)
Summary: document content property "Status" is not extracted for *.doc 
files  (was: The document property "Status" is not extracted for *.doc files)

> document content property "Status" is not extracted for *.doc files
> ---
>
> Key: TIKA-1057
> URL: https://issues.apache.org/jira/browse/TIKA-1057
> Project: Tika
>  Issue Type: Bug
> Environment: java 1.5/1.6 / Windows 7
>Reporter: Thomas Stroeter
>Priority: Minor
>
> I would like to use Tika to extract the document property "Status" from a
> Word 97-2003 *.doc file.
>
> Tika dumps the document status property correctly from XML-based *.docx files
> as "Content-Status" and "cp:contentStatus", but I cannot extract this
> metadata from a *.doc Word document using Tika.
> Nevertheless, Word 2010 has no problem setting and reading this document
> metadata in a *.doc file.
> Is there a way to extract this information with Tika for *.doc files, too?



[jira] [Assigned] (TIKA-1056) unify ImageMetadataExtractor interface

2013-01-16 Thread Ray Gauss II (JIRA)

 [ 
https://issues.apache.org/jira/browse/TIKA-1056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Gauss II reassigned TIKA-1056:
--

Assignee: Ray Gauss II

> unify ImageMetadataExtractor interface
> --
>
> Key: TIKA-1056
> URL: https://issues.apache.org/jira/browse/TIKA-1056
> Project: Tika
>  Issue Type: Wish
>Reporter: Maciej Lizewski
>Assignee: Ray Gauss II
>Priority: Trivial
>
> There are several methods in this class that target different image types
> but have different visibility:
> public void parseJpeg(File file);
> protected void parseTiff(InputStream stream);
> Both simply extract all available metadata from an image file or stream. It
> would be nice if parseTiff could also be "public", so that it is easier to
> create custom parsers in external jars that use this functionality.
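
For readers unfamiliar with the visibility problem, the following is a small sketch of the workaround an external jar needs while parseTiff stays protected: subclass the extractor and widen the method to public. ExtractorSketch is a hypothetical stand-in, not Tika's ImageMetadataExtractor, and its method returns a String (instead of void, as in the issue) only so the result can be printed.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;

// Hypothetical stand-in for the class described in the issue; not Tika's code.
class ExtractorSketch {
    // Protected: code in another package can reach this only through a subclass.
    protected String parseTiff(InputStream stream) {
        return "tiff metadata";
    }
}

// Workaround for external code: override the method and widen it to public.
class PublicTiffExtractor extends ExtractorSketch {
    @Override
    public String parseTiff(InputStream stream) {
        return super.parseTiff(stream);
    }
}

public class VisibilityDemo {
    public static void main(String[] args) {
        // An empty stream is enough here; the stand-in does not read it.
        InputStream in = new ByteArrayInputStream(new byte[0]);
        System.out.println(new PublicTiffExtractor().parseTiff(in));
    }
}
```

Making the method public in the original class, as the issue requests, would remove the need for this kind of wrapper subclass.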



[jira] [Resolved] (TIKA-1056) unify ImageMetadataExtractor interface

2013-01-16 Thread Ray Gauss II (JIRA)

 [ 
https://issues.apache.org/jira/browse/TIKA-1056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Gauss II resolved TIKA-1056.


   Resolution: Fixed
Fix Version/s: 1.3

Resolved in r1434117.

> unify ImageMetadataExtractor interface
> --
>
> Key: TIKA-1056
> URL: https://issues.apache.org/jira/browse/TIKA-1056
> Project: Tika
>  Issue Type: Wish
>Reporter: Maciej Lizewski
>Assignee: Ray Gauss II
>Priority: Trivial
> Fix For: 1.3
>
>
> There are several methods in this class that target different image types
> but have different visibility:
> public void parseJpeg(File file);
> protected void parseTiff(InputStream stream);
> Both simply extract all available metadata from an image file or stream. It
> would be nice if parseTiff could also be "public", so that it is easier to
> create custom parsers in external jars that use this functionality.



Jenkins build is back to normal : Tika-trunk #966

2013-01-16 Thread Apache Jenkins Server
See