Thanks Andrea!

D

From: Andrea Schweer [mailto:[email protected]] 
Sent: Thursday, March 07, 2013 3:34 PM
To: Daniel Sifton
Cc: [email protected]
Subject: Re: [Dspace-general] OCR bitstream

 

Hi Daniel,

On 08/03/13 10:57, Daniel Sifton wrote:

        We've uploaded a limited number of OCR'd PDF documents. If we were to
edit the OCR bitstream (.pdf.txt), does anyone have advice on how to
get the bitstream out and then back in? Or perhaps I'm coming at this
from the wrong angle?


There's nothing special about .pdf.txt files other than the name. Just
download the .pdf.txt file, make the edits you want, delete the .pdf.txt
file from the DSpace item and upload the edited one. As long as you
don't change the filename of the .pdf.txt file, all should be well.
You'll have to update your index(es) to include the new text:

    [dspace]/bin/dspace index-update -f
    [dspace]/bin/dspace update-discovery-index -f   (if you're using Discovery)
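
The filename check and re-indexing step described above could be sketched in Python roughly as follows. This is only an illustration and not part of DSpace: the helper names `same_bitstream_name` and `reindex_commands` and the `/dspace` install path are assumptions made for the example, while the two `dspace` sub-commands and the `-f` flag are taken from the message above. The script only builds and prints the commands; it does not run them.

```python
from pathlib import PurePosixPath
from typing import List


def same_bitstream_name(original: str, edited: str) -> bool:
    """Return True if the edited file keeps the exact .pdf.txt filename."""
    original_name = PurePosixPath(original).name
    edited_name = PurePosixPath(edited).name
    return original_name == edited_name and edited_name.endswith(".pdf.txt")


def reindex_commands(dspace_dir: str, use_discovery: bool = False) -> List[List[str]]:
    """Build the index update commands quoted in the message.

    dspace_dir stands in for the [dspace] placeholder (assumed install path).
    """
    dspace = str(PurePosixPath(dspace_dir) / "bin" / "dspace")
    commands = [[dspace, "index-update", "-f"]]
    if use_discovery:
        # Only needed when the Discovery module is in use.
        commands.append([dspace, "update-discovery-index", "-f"])
    return commands


if __name__ == "__main__":
    # Hypothetical file locations, for illustration only.
    if same_bitstream_name("download/report.pdf.txt", "edited/report.pdf.txt"):
        for command in reindex_commands("/dspace", use_discovery=True):
            print(" ".join(command))
```

Printing the commands first, rather than executing them, makes it easy to confirm the paths match your installation before running anything on the server.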

cheers,
Andrea



-- 
Dr Andrea Schweer
IRR Technical Specialist, ITS Information Systems
The University of Waikato, Hamilton, New Zealand
_______________________________________________
Dspace-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-general
