[ 
https://issues.apache.org/jira/browse/TIKA-1523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283148#comment-14283148
 ] 

Uwe Schindler edited comment on TIKA-1523 at 1/19/15 11:16 PM:
---------------------------------------------------------------

Hi, I did some research:
This is a bug in Word 2000 (aka Word 9.0) that was fixed in later versions. The 
page count stored on saving is indeed wrong initially if you don't scroll to the 
end. People complained about this at the time, too, because it sometimes caused 
the total page number in footnotes to be incorrect as well.

http://support.microsoft.com/kb/212653/en-us

See also: http://www.ms-office-forum.net/forum/archive/index.php?t-125861.html 
(1st comment, translated from German):

{quote}
SSD 26.04.2004, 21:07
I import the page count from the properties of a Word file into Access. 
My problem is this: while the file is open in Word, the properties show the 
correct page count. Once the file is closed and I look at the properties from 
the Open dialog, the page count is correct only after saving the file several 
times (at first it always says 1 page). What could be causing this, and how can 
I change it?
{quote}

And: 
https://groups.google.com/forum/#!topic/microsoft.public.word.vba.general/daf-sUpPlgs

{quote}
Anyone can help me with this? If I take out "Sleep 10000",
myDoc.BuiltinDocumentProperties(wdPropertyPages) doesnt return the correct
number of pages sometimes. For example, if a document has 200 pages, it may
come out to return 140, or sometimes 199, instead of 200. To me, it seems it
takes some time for MS word to think and get the number of pages. After i
put "Sleep 10000", 99% I got the correct number of pages. However, this will
take very long time to process as I need to read 200 to 300 files and the
number of pages from each files. Please let me know if there is another
better solution for this.
{quote}

As you can see, the page count is wrong initially. If you opened a file with 
Word 2000 / 9.0 and saved it before the full count had been calculated 
(computers were slower back then), it saved 1. :-)
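
A consumer of this metadata could guard against the stale value along the 
following lines. This is only a sketch under stated assumptions: the class and 
method names are hypothetical (they are not part of Tika), and it assumes the 
creating application is reported as the exact string "Microsoft Word 9.0". It 
treats a stored count of 1 from that application as a hint rather than a fact.

```java
// Hypothetical helper, not part of Tika: flags page counts that may be stale.
final class StoredPageCount {
    private StoredPageCount() {}

    /**
     * Returns true if the stored page count should not be trusted.
     *
     * @param applicationName name of the creating application as found in the
     *                        document metadata (assumed format: "Microsoft Word 9.0")
     * @param storedPages     page count stored in the document properties
     */
    static boolean isSuspect(String applicationName, int storedPages) {
        if (applicationName == null) {
            return false; // unknown creator: nothing to base a suspicion on
        }
        boolean word2000 = applicationName.trim().equals("Microsoft Word 9.0");
        // Word 2000 could save a count of 1 before repagination had finished.
        return word2000 && storedPages <= 1;
    }

    public static void main(String[] args) {
        System.out.println(isSuspect("Microsoft Word 9.0", 1));  // true
        System.out.println(isSuspect("Microsoft Word 9.0", 10)); // false
    }
}
```

A count of 1 may of course be correct for a genuinely short document, so a 
caller could only use this flag to mark the value as unverified, not to replace 
it.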


> metadata extractor gets the wrong number of pages of some documents Microsoft 
> Word 9.0
> --------------------------------------------------------------------------------------
>
>                 Key: TIKA-1523
>                 URL: https://issues.apache.org/jira/browse/TIKA-1523
>             Project: Tika
>          Issue Type: Bug
>          Components: metadata
>    Affects Versions: 1.7
>         Environment: Ubuntu
>            Reporter: Yamileydis Veranes
>            Assignee: Konstantin Gribov
>         Attachments: Sigmund Freud.doc, screenshot-1.png, screenshot-2.png
>
>
> When I extract the metadata from a Microsoft Word 9.0 document that has 10 
> pages, the extractor reports that it has only 1 page.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
