[ https://issues.apache.org/jira/browse/TIKA-1523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283148#comment-14283148 ]
Uwe Schindler commented on TIKA-1523:
-------------------------------------

Hi,

I did some research: this is a bug in Word 2000 (aka Word 9.0) that was fixed in later versions. The page count is indeed wrong when the file is first saved if you don't scroll to the end. People complained about it at the time, too, because it sometimes caused the total page number in footnotes to be incorrect as well.

http://support.microsoft.com/kb/212653/en-us

See also: http://www.ms-office-forum.net/forum/archive/index.php?t-125861.html (German only, 1st comment; translated here):

{quote}
SSD 26.04.2004, 21:07
I import the page count from the properties of a Word file into Access. My problem is that the correct page count is shown in the properties while the file is open in Word. Once the file is closed and I look at the properties from the Open dialog, the page count (it always shows 1 page at first) is only correct after the file has been saved several times. What could be causing this, and how can I change it?
{quote}

And: https://groups.google.com/forum/#!topic/microsoft.public.word.vba.general/daf-sUpPlgs

As you can see, the page count is wrong initially. If you opened a file with Word 2000 / 9.0 and saved it without waiting until the full count had been calculated (computers were slower at that time), it saved 1. :-)

> metadata extractor gets the wrong number of pages of some documents Microsoft Word 9.0
> --------------------------------------------------------------------------------------
>
>                 Key: TIKA-1523
>                 URL: https://issues.apache.org/jira/browse/TIKA-1523
>             Project: Tika
>          Issue Type: Bug
>          Components: metadata
>    Affects Versions: 1.7
>         Environment: Ubuntu
>            Reporter: Yamileydis Veranes
>            Assignee: Konstantin Gribov
>         Attachments: Sigmund Freud.doc, screenshot-1.png, screenshot-2.png
>
>
> When I extract the metadata from a Microsoft Word 9.0 document which has 10
> pages extractor gives me the result that only has 1 page.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)