Re: [GENERAL] Indexing MS/Open Office and PDF documents

2012-03-15 Thread Samba
Word documents can be converted by AbiWord into HTML, LaTeX, PostScript, or plain-text formats with very simple commands; I guess it also exposes some API that can be integrated into document parsers/indexers. Spreadsheets can be processed using the *ExcelFormat* library http:/
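
The AbiWord conversion mentioned above can be sketched in a few lines of Python. This is only an illustration, not something from the original message: it assumes the `abiword` binary is installed and accepts its `--to=FORMAT` command-line option, and that the converted file is written next to the input with the new extension. The function names `build_abiword_command` and `extract_text` are hypothetical helpers.

```python
import subprocess
from pathlib import Path


def build_abiword_command(source, target_format="txt"):
    """Build the argument list for a headless AbiWord conversion.

    Assumption: AbiWord's `--to=FORMAT` option converts the given file
    to the named format (for example "txt" or "html").
    """
    return ["abiword", f"--to={target_format}", str(source)]


def extract_text(source):
    """Convert a Word document to plain text and return its contents.

    Assumption: AbiWord writes the result beside the input file,
    using the same base name with a .txt extension.
    """
    source = Path(source)
    # check=True raises CalledProcessError if AbiWord exits with an error.
    subprocess.run(build_abiword_command(source, "txt"), check=True)
    return source.with_suffix(".txt").read_text(encoding="utf-8", errors="replace")
```

The returned text could then be passed to PostgreSQL's `to_tsvector` when the document row is inserted, so that it becomes searchable with full text search.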

Re: [GENERAL] Indexing MS/Open Office and PDF documents

2012-03-15 Thread dennis jenkins
On Thu, Mar 15, 2012 at 4:12 PM, Jeff Davis wrote:
> On Fri, 2012-03-16 at 01:57 +0530, alexander.bager...@cognizant.com wrote:
>> Hi,
>>
>> We are looking to use Postgres 9 for the document storing and would
>> like to take advantage of the full text search capabilities. We have
>> hard time id

Re: [GENERAL] Indexing MS/Open Office and PDF documents

2012-03-15 Thread Richard Huxton
On 15/03/12 21:12, Jeff Davis wrote:
On Fri, 2012-03-16 at 01:57 +0530, alexander.bager...@cognizant.com
We have hard time identifying MS/Open Office and PDF parsers to index stored documents and make them available for text searching.
The first step is to find a library that can parse such

Re: [GENERAL] Indexing MS/Open Office and PDF documents

2012-03-15 Thread Jeff Davis
On Fri, 2012-03-16 at 01:57 +0530, alexander.bager...@cognizant.com wrote:
> Hi,
>
> We are looking to use Postgres 9 for the document storing and would
> like to take advantage of the full text search capabilities. We have
> hard time identifying MS/Open Office and PDF parsers to index stored
> d

[GENERAL] Indexing MS/Open Office and PDF documents

2012-03-15 Thread Alexander.Bagerman
Hi, We are looking to use Postgres 9 for document storage and would like to take advantage of its full text search capabilities. We are having a hard time identifying MS/Open Office and PDF parsers to index stored documents and make them available for text searching. Any advice would be appreciated.