Hi Abdeslam,

Did you replace /files/yourproject.preview/binaries with the correct
pathname for your project?
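
For example, assuming your project is called "myproject" (a made-up name,
substitute your own), the extractor from your message would become
something like this sketch:

```xml
<!-- "myproject" is a hypothetical project name; use the actual
     path of your binaries folder in the repository -->
<extractor classname="org.apache.slide.extractor.PDFExtractor"
           uri="/files/myproject.preview/binaries"
           content-type="application/pdf"/>
```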

Ard: is removing the Lucene index enough? Shouldn't the files be
touched as well?

Jasha Joachimsthal 

www.onehippo.com
Amsterdam - Hippo B.V. Oosteinde 11 1017 WT Amsterdam +31(0)20-5224466 
San Francisco - Hippo USA Inc. 101 H Street, suite Q Petaluma CA
94952-3329 +1 (707) 773-4646



> -----Original Message-----
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Irahhoten, Abdeslam
> Sent: Tuesday, 15 July 2008 12:10
> To: Hippo CMS development public mailinglist
> Subject: RE: [HippoCMS-dev] search voor some text in pdf or 
> word document
> 
> Hello Ard,
> 
> I have tried the following, but I still can't search for text
> inside the PDF documents.
> 
> Maybe I am still missing some configuration; can you tell me what
> exactly the problem is?
> 
> I have added this extractor:
> <extractor classname="org.apache.slide.extractor.PDFExtractor"
> uri="/files/yourproject.preview/binaries"
> content-type="application/pdf"/>
> 
> and then I have used the following in my DASL query:
> <d:contains>${param.zoekwoorden}</d:contains>
> 
> When I search for ${param.zoekwoorden}, I get no results.
> 
> Thanks in advance
> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Ard Schrijvers
> Sent: Monday, July 14, 2008 4:55 PM
> To: Hippo CMS development public mailinglist
> Subject: RE: [HippoCMS-dev] search voor some text in pdf or 
> word document
> 
> Of course there is, otherwise I wouldn't dare call the system a CMS
> :-)
> 
> If you search with target binaries (or root), you can just
> type the search term in the <d:contains> element.
> 
> In the repository you need to configure an extractor, if that has not
> already been done. See [1] for possible extractors.
> 
> There you can see how to configure extractors for, for example, MS Word,
> Excel, PowerPoint, PDF, etc. If you add them, reindexing has to be done.
> This is simply done by removing the Lucene index, but be careful
> with this in a production environment, obviously.
> 
> Examples:
> 
> <extractor classname="nl.hippo.slide.extractor.ImagePropertyExtractor"
> uri="/files/yourproject.preview/binaries"/>
> 
> <extractor classname="nl.hippo.slide.extractor.OfficeExtractor"
> uri="/files/yourproject.preview/binaries"
> content-type="application/vnd.ms-excel">
>     <configuration>
>       <instruction property="author"
>         namespace="http://hippo.nl/cms/1.0"
>         summary-information="4"/>
>       <instruction property="application"
>         namespace="http://hippo.nl/cms/1.0" summary-information="18"/>
>       <instruction property="date" namespace="http://hippo.nl/cms/1.0"
>         date-format="yyyyMMdd" summary-information="13"/>
>       <instruction property="creationdate"
>         namespace="http://hippo.nl/cms/1.0" date-format="yyyyMMdd"
>         summary-information="12"/>
>       <instruction property="caption"
>         namespace="http://hippo.nl/cms/1.0" summary-information="2"/>
>     </configuration>
> </extractor>
> 
> <extractor classname="org.apache.slide.extractor.PDFExtractor"
> uri="/files/yourproject.preview/binaries"
> content-type="application/pdf"/>
>   
> -Ard
> 
> [1]
> http://www.hippocms.org/display/CMS/4.+Hippo+Repository+Configure+Extractors
> 
> > 
> > Hello,
> > 
> > Is it possible (using a DASL query) to search for some text
> > inside a PDF or Word document?
> > 
> > Thanks in advance
> > 
> > 
> > Disclaimer
> > 
> > This e-mail and any attachments are confidential and are solely 
> > intended for the addressee. If you are not the intended recipient, 
> > please notify the sender and delete and/or destroy this message and 
> > any attachments immediately.
> > It is prohibited to copy, to distribute, to disclose or to use this 
> > e-mail and any attachments in any other way. Ordina N.V. and/or its 
> > group companies do not accept any responsibility nor liability for
> > any damage resulting from the content of and/or the transmission of
> > this message.
> > ********************************************
> > Hippocms-dev: Hippo CMS development public mailinglist
> > 
> > Searchable archives can be found at:
> > MarkMail: http://hippocms-dev.markmail.org
> > Nabble: http://www.nabble.com/Hippo-CMS-f26633.html
> > 
> > 
