If you use the new Apache Tika functions in GSearch 2.4.1, you may specify a 
character limit (or just use the default of 100000 characters); the part of 
the PDF file beyond that limit is then ignored. This guards you against very 
large PDF files. If you use the original indexing functions in GSearch, there 
is no limit except your system resources. Both sets of functions call PDFBox 
1.6.0.
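
To illustrate what such a character limit does, here is a minimal sketch in 
plain Java. It is not the GSearch or Tika code; the class name 
LimitedTextReader, the method readUpTo and the constant DEFAULT_LIMIT are 
hypothetical names made up for this example. It only shows the idea: read 
extracted text up to a limit and ignore the rest.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical illustration only; not part of GSearch, Tika, or PDFBox.
public class LimitedTextReader {
    // Assumed default, mirroring the 100000-character figure mentioned above.
    public static final int DEFAULT_LIMIT = 100000;

    /** Reads at most `limit` characters from `in`; anything beyond is ignored. */
    public static String readUpTo(Reader in, int limit) throws IOException {
        StringBuilder out = new StringBuilder();
        char[] buf = new char[8192];
        while (out.length() < limit) {
            // Never request more than the remaining allowance.
            int want = Math.min(buf.length, limit - out.length());
            int n = in.read(buf, 0, want);
            if (n < 0) {
                break; // end of input reached before the limit
            }
            out.append(buf, 0, n);
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        String text = readUpTo(new StringReader("abcdefghij"), 4);
        System.out.println(text); // prints "abcd"
    }
}
```

With a limit like this, memory use for the extracted text is bounded by the 
limit rather than by the size of the PDF file.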

Gert


On 01/05/2012, at 20.59, Chalk, Stuart wrote:

> Can anyone tell me the file size limitations on PDF files indexed by FGS 
> (using lucene)?  Also, what version of PDF does it handle?
> 
> Stuart Chalk, Ph.D.
> Associate Professor of Chemistry
> Department of Chemistry, Building 50, Room 3514,
> University of North Florida
> 1 UNF Drive, Jacksonville, FL 32224 USA
> P: 904-620-1938
> F: 904-620-3535
> E: [email protected]
> W: http://www.unf.edu/coas/chemistry/
> 
> 
> ------------------------------------------------------------------------------
> Live Security Virtual Conference
> Exclusive live event will cover all the ways today's security and 
> threat landscape has changed and how IT managers can respond. Discussions 
> will include endpoint security, mobile security and the latest in malware 
> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
> _______________________________________________
> Fedora-commons-users mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/fedora-commons-users


