Hi,

I have been using Luke to analyze the index.

All Portuguese characters appear replaced by a strange character.

What can I do to avoid this?

Is it not possible to make CLucene work with Portuguese characters?

 

Thanks & Regards,

Rui 
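
For anyone following this thread, here is a minimal C++ sketch of the kind of encoding mismatch discussed in the quoted replies below. It is an illustration under assumptions, not CLucene code and not a confirmed diagnosis of this problem. The helper names decodeLatin1 and decodeUtf8 are hypothetical, and the UTF-8 decoder is simplified (it does not reject overlong forms or surrogates). The point it shows: "ç" is a single byte (0xE7) in Latin-1 and in Windows-1252, the usual Western European "ANSI" code page, but two bytes (0xC3 0xA7) in UTF-8, so decoding with the wrong routine gives either two wrong characters or a replacement character.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: decode ISO-8859-1 (Latin-1) bytes.
// Every byte maps directly to the code point with the same value.
std::u32string decodeLatin1(const std::string& bytes) {
    std::u32string out;
    for (unsigned char b : bytes) {
        out.push_back(static_cast<char32_t>(b));
    }
    return out;
}

// Hypothetical helper: simplified UTF-8 decoder (1- to 4-byte sequences).
// An invalid lead byte or a missing continuation byte yields U+FFFD.
std::u32string decodeUtf8(const std::string& bytes) {
    const char32_t kReplacement = U'\uFFFD';
    std::u32string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        unsigned char b0 = static_cast<unsigned char>(bytes[i]);
        char32_t cp = 0;
        int extra = 0;  // number of continuation bytes expected
        if (b0 < 0x80) { cp = b0; extra = 0; }
        else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; extra = 1; }
        else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; extra = 2; }
        else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; extra = 3; }
        else { out.push_back(kReplacement); ++i; continue; }
        ++i;
        bool ok = true;
        for (int k = 0; k < extra; ++k) {
            if (i >= bytes.size()) { ok = false; break; }
            unsigned char bk = static_cast<unsigned char>(bytes[i]);
            if ((bk & 0xC0) != 0x80) { ok = false; break; }  // not a continuation byte
            cp = (cp << 6) | (bk & 0x3F);
            ++i;
        }
        out.push_back(ok ? cp : kReplacement);
    }
    return out;
}
```

With these helpers, the UTF-8 bytes of "orçamento" decode to the expected code points only through decodeUtf8, while the Latin-1 bytes decode correctly only through decodeLatin1. If the indexing code and the search code (or the viewer) do not agree on which decoding was applied, the stored term and the query term can differ. Whether that is what is happening here has not been established in the thread.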



 
> Date: Fri, 23 Apr 2010 20:43:49 +0200
> From: [email protected]
> To: [email protected]
> Subject: Re: [CLucene-dev] Clucene search - Do not found some words
> 
> I suggest using a program called Luke (google it). You can then look
> into the index and see what is indexed. Let us know if you see all the
> words you would expect to see, and whether you can find the document
> if you search from Luke.
> 
> handy program :)
> 
> cheers
> ben
> 
> On Friday, April 23, 2010, Rui Oliveira <[email protected]> wrote:
> >
> >
> >
> >
> >
> > Itamar,
> >
> > The test results all come from the same file. That file contains both
> > "orçamento" and "administração"; "administração" is found and
> > "orçamento" is not.
> >
> > The results are the same whether the file is ANSI, Unicode, or UTF-8
> > encoded. The problem is not in loading the files, because I debugged the
> > text loaded from the file and it is correct.
> >
> > Rui
> >
> >
> >
> >
> > From: [email protected]
> > To: [email protected]
> > Date: Fri, 23 Apr 2010 17:59:27 +0300
> > Subject: Re: [CLucene-dev] Clucene search - Do not found some words
> >
> > Rui,
> >
> > This file is ANSI encoded. Are the other files, the ones you do succeed
> > in finding, perhaps Unicode / UTF-8 encoded? If that's the case, your
> > routine for loading the files is buggy. You should either have them all
> > encoded using the same encoding, or have more intelligent code to convert
> > incompatible encodings.
> >
> > HTH
> >
> > Itamar.
> >
> >
> > From: Rui Oliveira [mailto:[email protected]]
> > Sent: Friday, April 23, 2010 4:32 PM
> > To: clucene-developers; [email protected]
> > Subject: Re: [CLucene-dev] Clucene search - Do not found some words
> >
> >
> > I have just attached the file.
> >
> > Tks, Rui
> >
> >
> > From: [email protected]
> > Date: Fri, 23 Apr 2010 09:22:05 -0400
> > To: [email protected]
> > Subject: Re: [CLucene-dev] Clucene search - Do not found some words
> >
> > Can you send me the file that has both "orçamento" and "administração"?
> >
> > Or you can do a test: open the file, delete the ç from "orçamento" and
> > "administração", and then type ç again.
> >
> > Index again and try to search both words again.
> >
> > On Fri, Apr 23, 2010 at 9:14 AM, Rui Oliveira <[email protected]> wrote:
> >
> > They are text files (*.txt) and both words are in the same document.
> > When I search for "orçamento" nothing is found, and when I search for
> > "administração" the document is found.
> >
> >
> > Rui
> >
> >
> > From: [email protected]
> > Date: Fri, 23 Apr 2010 09:09:30 -0400
> > To: [email protected]
> > Subject: Re: [CLucene-dev] Clucene search - Do not found some words
> >
> > This seems like an encoding problem with these documents. Are they HTML pages?
> > Are the words "orçamento" and "administração" in the same page, for example?
> >
> > Can you dump one of these files here? (One that has the problem and one
> > that does not.)
> >
> >
> > On Fri, Apr 23, 2010 at 9:05 AM, Rui Oliveira <[email protected]> wrote:
> >
> > I am indexing several separate documents.
> >
> > The document that has these words is a small text document. It is indexed
> > without any visible error, and the same document is found when I search
> > for other words in it.
> >
> >
> > Rui
> >
> >
> > From: [email protected]
> > Date: Fri, 23 Apr 2010 08:58:05 -0400
> > To: [email protected]
> > Subject: Re: [CLucene-dev] Clucene search - Do not found some words
> >
> > What are you indexing?
> >
> > Just one big document?
> > Or a lot of separate documents? (HTML documents?)
> >
> > On Fri, Apr 23, 2010 at 8:54 AM, Rui Oliveira <[email protected]> wrote:
> >
> > Hi Onilton,
> >
> > I have tested with "orcamento" instead of "orçamento" and didn't get
> > anything.
> >
> > I do not know whether Lucene indexes "orçamento" incorrectly: indexing
> > completes without any error, but when I search for it I do not get anything.
> >
> > Thanks & Regards,
> > Rui
> >
> >
> > From:
> >
> 
> ------------------------------------------------------------------------------
> _______________________________________________
> CLucene-developers mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/clucene-developers
                                          