Re: [Zope] Indexing files

2006-01-25 Thread Dieter Maurer
Sune Christiansen wrote at 2006-1-24 18:56 +0100:
> when you say external PDF converter, do you mean the pdf converter I
> created the pdf file with? I have tried to index a microsoft word file
> also, but the result is the same: an empty index.

You need converters from the media format (e.g. PDF, MS Word, ...)
to text (perhaps better named: text extraction utilities).

The standard PDF converter is pdftotext, which is part of Xpdf.
The standard Word converter is wvWare.
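
A minimal sketch of such a text extraction step in present-day Python,
assuming pdftotext is on the PATH. The helper names converter_available,
pdftotext_command and extract_pdf_text are made up for illustration and
are not part of Zope or TextIndexNG. The first helper is also one way to
check whether a converter is installed at all.

```python
import shutil
import subprocess


def converter_available(program):
    """Return True if the external program can be found on the PATH."""
    return shutil.which(program) is not None


def pdftotext_command(pdf_path):
    """Build the command line; '-' asks pdftotext to write to stdout."""
    return ["pdftotext", pdf_path, "-"]


def extract_pdf_text(pdf_path):
    """Return the text of a PDF, or '' if pdftotext is not installed."""
    if not converter_available("pdftotext"):
        return ""
    result = subprocess.run(
        pdftotext_command(pdf_path),
        capture_output=True,  # collect stdout instead of printing it
        check=True,           # raise if pdftotext exits with an error
    )
    return result.stdout.decode("utf-8", errors="replace")
```

Returning an empty string when the tool is missing is a deliberate choice
here: it reproduces the symptom from this thread (an index that stays
empty), so a real setup would probably want to log a warning instead.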



-- 
Dieter
___
Zope maillist  -  Zope@zope.org
http://mail.zope.org/mailman/listinfo/zope
**   No cross posts or HTML encoding!  **
(Related lists - 
 http://mail.zope.org/mailman/listinfo/zope-announce
 http://mail.zope.org/mailman/listinfo/zope-dev )


Re: [Zope] Indexing files

2006-01-24 Thread Sune Christiansen
Hi again.

I have installed TextIndexNG, indexed my Zope DTML Method objects and
Zope File objects, and enabled the document converters (PDF, Word etc.).
As indexed attributes I use SearchableText, PrincipiaSearchSource and
getFile, but the indexes related to the PDF files are still empty.
Is it correct to upload my PDF document as a Zope File object?

Thanks,

Sune


> On 21 Jan 2006, at 13:02, Sune Christiansen wrote:
>
>> Hi all.
>>
>> I have the following problem:
>> I am building up a ZCatalog and indexing my DTML methods. I use the
>> index type ZCTextIndex and the object function PrincipiaSearchSource.
>> It works fine.
>> But when I try to index my Files (type File) with index type
>> ZCTextIndex and the object function SearchableText it finds no words
>> and the index is empty. Am I using the wrong object function?
>
> Zope File objects do not support indexing their textual content. You
> will need to implement your own text retrieval or use some of the
> other indices out there like Andreas Jung's TextIndexNG which come
> with suitable modules that can pull text out of various file formats.
>
> jens



Re: [Zope] Indexing files

2006-01-24 Thread Andreas Jung



--On 24 January 2006 16:58:52 +0100 Sune Christiansen [EMAIL PROTECTED]
wrote:

> Hi again.
>
> I have installed TextIndexNG and indexed my Zope DTML Methods objects and
> Zope Files objects, and enabled Document converters (PDF, Word etc.)
> As indexed attributes I use SearchableText,PrincipiaSearchSource,getFile,
> but the indexes related to the pdf files are still empty.
> Is it correct to upload my pdf document as a Zope File object?



Is your external PDF converter installed _properly_?

-aj





Re: [Zope] Indexing files

2006-01-24 Thread Sune Christiansen
When you say external PDF converter, do you mean the PDF converter I
created the PDF file with? I have also tried to index a Microsoft Word
file, but the result is the same: an empty index.

- Sune



> --On 24 January 2006 16:58:52 +0100 Sune Christiansen [EMAIL PROTECTED]
> wrote:
>
>> Hi again.
>>
>> I have installed TextIndexNG and indexed my Zope DTML Methods objects
>> and Zope Files objects, and enabled Document converters (PDF, Word etc.)
>> As indexed attributes I use
>> SearchableText,PrincipiaSearchSource,getFile,
>> but the indexes related to the pdf files are still empty.
>> Is it correct to upload my pdf document as a Zope File object?
>
> Is your external PDF converter installed _properly_?
>
> -aj






[Zope] Indexing files

2006-01-21 Thread Sune Christiansen
Hi all.

I have the following problem:
I am building up a ZCatalog and indexing my DTML methods. I use the index
type ZCTextIndex and the object method PrincipiaSearchSource. It works
fine.
But when I try to index my Files (type File) with index type ZCTextIndex
and the object method SearchableText, it finds no words and the index is
empty. Am I using the wrong object method?

Thanks,

Sune



Re: [Zope] Indexing files

2006-01-21 Thread Jens Vagelpohl


On 21 Jan 2006, at 13:02, Sune Christiansen wrote:

> Hi all.
>
> I have the following problem:
> I am building up a ZCatalog and indexing my DTML methods. I use the
> index type ZCTextIndex and the object function PrincipiaSearchSource.
> It works fine.
> But when I try to index my Files (type File) with index type
> ZCTextIndex and the object function SearchableText it finds no words
> and the index is empty. Am I using the wrong object function?


Zope File objects do not support indexing their textual content. You
will need to implement your own text retrieval or use one of the
other indices out there, like Andreas Jung's TextIndexNG, which comes
with suitable modules that can pull text out of various file formats.
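
The "implement your own text retrieval" route could be sketched roughly
as follows, in present-day Python: a lookup from content type to an
extractor function. EXTRACTORS, extract_plain and searchable_text are
hypothetical names for illustration only; they are not Zope or
TextIndexNG APIs, and only a plain-text extractor is filled in.

```python
def extract_plain(data):
    """Decode a text payload; undecodable bytes are replaced."""
    return data.decode("utf-8", errors="replace")


# Hypothetical registry: content type -> function(bytes) -> str.
# A PDF or Word entry would wrap an external converter.
EXTRACTORS = {
    "text/plain": extract_plain,
}


def searchable_text(content_type, data):
    """Return indexable text, or '' when no extractor is registered."""
    extractor = EXTRACTORS.get(content_type)
    if extractor is None:
        return ""
    return extractor(data)
```

With this registry, searchable_text("application/pdf", ...) returns an
empty string, which is the situation described above: nothing reaches
the index until an extractor for that format is available.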


jens



Re: [Zope] Indexing files

2006-01-21 Thread Tino Wildenhain
Jens Vagelpohl wrote:
> On 21 Jan 2006, at 13:02, Sune Christiansen wrote:
>
>> Hi all.
>>
>> I have the following problem:
>> I am building up a ZCatalog and indexing my DTML methods. I use the
>> index type ZCTextIndex and the object function PrincipiaSearchSource.
>> It works fine.
>> But when I try to index my Files (type File) with index type
>> ZCTextIndex and the object function SearchableText it finds no words
>> and the index is empty. Am I using the wrong object function?
>
> Zope File objects do not support indexing their textual content. You
> will need to implement your own text retrieval or use some of the other
> indices out there like Andreas Jung's TextIndexNG which come with
> suitable modules that can pull text out of various file formats.

Newer Zope versions make File objects indexable via PrincipiaSearchSource
if their content type is text/*.

OFS/Image.py, lines 423 ff.:

    def PrincipiaSearchSource(self):
        """ Allow file objects to be searched.
        """
        if self.content_type.startswith('text/'):
            return str(self.data)
        return ''
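
To see the effect without a running Zope, the same logic can be tried on
a stand-in class. FakeFile is a hypothetical illustration and not the
real OFS.Image.File; only the method body follows the excerpt.

```python
class FakeFile:
    """Stand-in for a Zope File object; NOT the real OFS.Image.File."""

    def __init__(self, content_type, data):
        self.content_type = content_type
        self.data = data

    def PrincipiaSearchSource(self):
        """Return the data for text/* content types, '' otherwise."""
        if self.content_type.startswith('text/'):
            return str(self.data)
        return ''


text_file = FakeFile('text/plain', 'hello zope')
pdf_file = FakeFile('application/pdf', '%PDF-1.4 ...')
print(repr(text_file.PrincipiaSearchSource()))
print(repr(pdf_file.PrincipiaSearchSource()))
```

The PDF object yields an empty string, so a text index fed from this
method has nothing to index for it.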


HTH
tino