Re: add CJKTokenizer to solr

2007-06-27 Thread Daniel Alheiros
OK Hoss. I agree with you. Regards, Daniel On 26/6/07 19:14, "Chris Hostetter" <[EMAIL PROTECTED]> wrote: > > : conversion tool... Another is to define a "version" on the config and > : depending on the version, the expected structure will be different. > > FYI: schema.xml does have this ... i

Re: add CJKTokenizer to solr

2007-06-26 Thread Chris Hostetter
: conversion tool... Another is to define a "version" on the config and : depending on the version, the expected structure will be different. FYI: schema.xml does have this ... it's one of the <schema> attributes. we've only ever revved it once when multiValue fields were added because we wanted to

Re: add CJKTokenizer to solr

2007-06-26 Thread Daniel Alheiros
Hi Hoss. Yes, it's the tricky part when re-structuring configs... One possible solution is, when you create a new schema, you offer a conversion tool... Another is to define a "version" on the config and depending on the version, the expected structure will be different. I'm sure you know this all

Re: add CJKTokenizer to solr

2007-06-25 Thread Chris Hostetter
: : : I think this way, the config terms are a bit clearer... What do you think? in general, do i think it would be better if the <tokenizer> and <filter> declarations used "factory" as the attribute instead of "class"? ...yes. So i think it makes sense to change this now? ... i don't know. the backward compatib

Re: add CJKTokenizer to solr

2007-06-25 Thread Daniel Alheiros
Or I think this way, the config terms are a bit clearer... What do you think? Regards, Daniel On 22/6/07 20:45, "Chris Hostetter" <[EMAIL PROTECTED]> wrote: > > : What would be the best way to not hide their use? > : > : > > How about just... > > > > > > -Hoss > http://www.bbc.co.

Re: add CJKTokenizer to solr

2007-06-22 Thread Chris Hostetter
: What would be the best way to not hide their use? : : How about just... -Hoss

Re: add CJKTokenizer to solr

2007-06-22 Thread Mike Klaas
On 21-Jun-07, at 10:22 PM, Chris Hostetter wrote: like i said though: i'm in favor of factories like this ... i just don't think we should do anything to hide their use and make referring to Tokenizer or TokenFilter class names directly use reflection magically. What would be the best way to

Re: add CJKTokenizer to solr

2007-06-22 Thread Chris Hostetter
: Sorry I've confused things a bit... Thread safety has to be : considered only for the Tokenizers, not for the factories. So are the : Tokenizers thread safe? nope ... they are constructed using Readers and maintain state about the text they are processing ... the only api is a "next()" met
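[Editor's sketch] The point above, that a tokenizer is bound to one Reader and keeps its position in the text as internal state, can be illustrated with a small stand-in. `WhitespaceSketchTokenizer` and `TokenizerPerDocumentDemo` are hypothetical names invented for this sketch; they are not Lucene or Solr classes, and the real `next()` API returns token objects rather than strings.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a tokenizer: bound to one Reader, advanced only via next().
final class WhitespaceSketchTokenizer {
    // Per-document state: the Reader remembers how far into the text we are.
    private final Reader input;

    WhitespaceSketchTokenizer(Reader input) {
        this.input = input;
    }

    // Returns the next whitespace-delimited token, or null at end of input.
    String next() {
        StringBuilder token = new StringBuilder();
        try {
            int c;
            while ((c = input.read()) != -1) {
                if (Character.isWhitespace(c)) {
                    if (token.length() > 0) {
                        break; // the current token is complete
                    }
                } else {
                    token.append((char) c);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return token.length() == 0 ? null : token.toString();
    }
}

final class TokenizerPerDocumentDemo {
    // A fresh tokenizer is created for each text, so no state is shared between callers.
    static List<String> tokenize(String text) {
        WhitespaceSketchTokenizer tokenizer = new WhitespaceSketchTokenizer(new StringReader(text));
        List<String> tokens = new ArrayList<>();
        for (String t = tokenizer.next(); t != null; t = tokenizer.next()) {
            tokens.add(t);
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("add CJKTokenizer to solr"));
    }
}
```

Because each instance carries the position of a single text, sharing one instance between threads would interleave their reads; creating one instance per document, as `tokenize` does here, avoids that.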

RE: add CJKTokenizer to solr

2007-06-22 Thread Xuesong Luo
Thanks, Otis, I didn't know CJK is only used for Asian languages. I'll try the German Analyzer. -----Original Message----- From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] Sent: Friday, June 22, 2007 3:18 AM To: solr-user@lucene.apache.org Subject: Re: add CJKTokenizer to solr I

Re: add CJKTokenizer to solr

2007-06-22 Thread Otis Gospodnetic
ene.apache.org Sent: Friday, June 22, 2007 12:43:50 PM Subject: Re: add CJKTokenizer to solr Sorry I've confused things a bit... Thread safety has to be considered only for the Tokenizers, not for the factories. So are the Tokenizers thread safe? Regards, Daniel On 22/6/07 11:36, "

Re: add CJKTokenizer to solr

2007-06-22 Thread Daniel Alheiros
Sorry I've confused things a bit... Thread safety has to be considered only for the Tokenizers, not for the factories. So are the Tokenizers thread safe? Regards, Daniel On 22/6/07 11:36, "Daniel Alheiros" <[EMAIL PROTECTED]> wrote: > Hi Hoss. > > I've done a few tests using reflection to

Re: add CJKTokenizer to solr

2007-06-22 Thread Daniel Alheiros
Hi Hoss. I've done a few tests using reflection to instantiate a simple object and the results will vary a lot depending on the JVM. As the JVM optimizes code as it is executed it will vary depending on the usage, but I think we have something to consider: If done 1,000 samples (5 clean X loop of
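[Editor's sketch] A rough way to run the kind of comparison described above is to time direct construction against `Constructor.newInstance` for a trivial class. This is an assumption-laden illustration, not the test that was actually run: `ReflectionCostSketch` and `Simple` are hypothetical names, the iteration count is arbitrary, and no timings are claimed, since (as the message says) results depend heavily on the JVM and on warm-up.

```java
import java.lang.reflect.Constructor;

// Hypothetical micro-benchmark sketch: direct "new" versus reflective instantiation.
final class ReflectionCostSketch {
    // Stand-in for the "simple object" mentioned in the message.
    static final class Simple {
        int value = 1;
    }

    // Nanoseconds spent creating `iterations` objects with a plain constructor call.
    static long timeDirect(int iterations) {
        long sink = 0; // accumulating a field makes it harder for the JIT to drop the loop
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += new Simple().value;
        }
        long elapsed = System.nanoTime() - start;
        if (sink != iterations) {
            throw new IllegalStateException("unexpected sink value");
        }
        return elapsed;
    }

    // Nanoseconds spent creating `iterations` objects through a cached Constructor.
    static long timeReflective(int iterations) {
        try {
            Constructor<Simple> ctor = Simple.class.getDeclaredConstructor();
            long sink = 0;
            long start = System.nanoTime();
            for (int i = 0; i < iterations; i++) {
                sink += ctor.newInstance().value;
            }
            long elapsed = System.nanoTime() - start;
            if (sink != iterations) {
                throw new IllegalStateException("unexpected sink value");
            }
            return elapsed;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("reflective construction failed", e);
        }
    }

    public static void main(String[] args) {
        // Repeating the measurement shows how the numbers drift as the JVM optimizes.
        for (int round = 0; round < 5; round++) {
            System.out.println("direct ns: " + timeDirect(1000)
                    + ", reflective ns: " + timeReflective(1000));
        }
    }
}
```

A hand-rolled loop like this is only indicative; a benchmark harness would be needed before drawing conclusions about the cost inside Solr.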

Re: add CJKTokenizer to solr

2007-06-22 Thread Otis Gospodnetic
Simpy -- http://www.simpy.com/ - Tag - Search - Share - Original Message From: Xuesong Luo <[EMAIL PROTECTED]> To: solr-user@lucene.apache.org Sent: Friday, June 22, 2007 8:54:37 AM Subject: RE: add CJKTokenizer to solr Thanks, Toru and Chris, I tried both the CJKTokenizer and

RE: add CJKTokenizer to solr

2007-06-21 Thread Xuesong Luo
ubject: Re: add CJKTokenizer to solr I'm sorry. I was not able to attach it, so I am sending it again. > > I got the error below after adding CJKTokenizer to schema.xml. I > > checked the constructor of CJKTokenizer, it requires a Reader parameter, > > I guess that's

Re: add CJKTokenizer to solr

2007-06-21 Thread Chris Hostetter
: Regarding reflection - even if reflection is slower, and I'm sure it is, : I just don't know exactly how much slower it is, couldn't we cache the : instantiated instances keyed off by name? Such instances would have to : be thread-safe, but I imagine most/all Tokenizers already are : thread-saf

Re: add CJKTokenizer to solr

2007-06-21 Thread Otis Gospodnetic
to JIRA. Otis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Simpy -- http://www.simpy.com/ - Tag - Search - Share - Original Message From: Chris Hostetter <[EMAIL PROTECTED]> To: solr-user@lucene.apache.org Sent: Thursday, June 21, 2007 9:39:20 PM Subject: Re: add CJKTokenizer to solr

Re: add CJKTokenizer to solr

2007-06-21 Thread Chris Hostetter
: Why don't we instead create an UbberFactory that takes the Tokenizer : class as a parameter and instantiates the proper Tokenizer? The idea has come up before ... and there's really no reason why it wouldn't be okay to include a reflection based factory like this in Solr -- it just hasn

Re: add CJKTokenizer to solr

2007-06-21 Thread Daniel Alheiros
Hi. Well, creating a Factory for each new Tokenizer we want to add means you are replicating the same code again and again just to bind the Factory (Solr interface) to the Tokenizer (Lucene interface). Why don't we instead create an UbberFactory that takes the Tokenizer class as a paramete
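[Editor's sketch] The proposal above, a single factory that is handed a class and instantiates it by reflection, might look roughly like the following. `ReflectiveReaderFactory` is a hypothetical name, not an existing Solr class, and the sketch assumes only what the thread states: that the target class has a constructor taking a `Reader`. `java.io.BufferedReader` stands in for a Tokenizer here purely because it also has a public `(Reader)` constructor, which keeps the example runnable without Lucene on the classpath.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.lang.reflect.Constructor;

// Hypothetical generic factory: builds any class that exposes a public (Reader) constructor.
final class ReflectiveReaderFactory<T> {
    private final Constructor<? extends T> ctor;

    ReflectiveReaderFactory(Class<? extends T> type) {
        try {
            // Resolve the constructor once, so create() only pays for newInstance().
            this.ctor = type.getConstructor(Reader.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(
                    type.getName() + " has no public (Reader) constructor", e);
        }
    }

    // Creates a fresh instance bound to the given input.
    T create(Reader input) {
        try {
            return ctor.newInstance(input);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                    "could not instantiate " + ctor.getDeclaringClass().getName(), e);
        }
    }
}

final class ReflectiveReaderFactoryDemo {
    // Builds a BufferedReader through the factory and reads the first line of the text.
    static String firstLine(String text) {
        ReflectiveReaderFactory<BufferedReader> factory =
                new ReflectiveReaderFactory<>(BufferedReader.class);
        try (BufferedReader reader = factory.create(new StringReader(text))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstLine("hello\nworld"));
    }
}
```

Looking the constructor up once in the factory's own constructor keeps the per-instance cost to a single `newInstance` call, and a class without a matching constructor fails when the factory is built rather than when the first instance is requested.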

Re: add CJKTokenizer to solr

2007-06-19 Thread Mike Klaas
On 18-Jun-07, at 10:28 PM, Toru Matsuzawa wrote: I'm sorry. I was not able to attach it, so I am sending it again. I got the error below after adding CJKTokenizer to schema.xml. I checked the constructor of CJKTokenizer, it requires a Reader parameter, I guess that's why I get this e

Re: add CJKTokenizer to solr

2007-06-18 Thread Toru Matsuzawa
I'm sorry. I was not able to attach it, so I am sending it again. > > I got the error below after adding CJKTokenizer to schema.xml. I > > checked the constructor of CJKTokenizer, it requires a Reader parameter, > > I guess that's why I get this error, I searched the email archive, it > >

Re: add CJKTokenizer to solr

2007-06-18 Thread Toru Matsuzawa
> I got the error below after adding CJKTokenizer to schema.xml. I > checked the constructor of CJKTokenizer, it requires a Reader parameter, > I guess that's why I get this error, I searched the email archive, it > seems working for other users. Does anyone know what is the problem? CJKTokenize

Re: add CJKTokenizer to solr

2007-06-18 Thread Chris Hostetter
: I got the error below after adding CJKTokenizer to schema.xml. I : checked the constructor of CJKTokenizer; it requires a Reader parameter, : which I guess is why I get this error. I searched the email archive, and it : seems to work for other users. Does anyone know what the problem is? You can use

add CJKTokenizer to solr

2007-06-18 Thread Xuesong Luo
Hi, I got the error below after adding CJKTokenizer to schema.xml. I checked the constructor of CJKTokenizer; it requires a Reader parameter, which I guess is why I get this error. I searched the email archive, and it seems to work for other users. Does anyone know what the problem is? Thanks Xue

Re: add CJKTokenizer to solr

2007-01-30 Thread zha jimmy
Thank you all, it works now :). 2007/1/30, James liu <[EMAIL PROTECTED]>: he is ok now. -- regards jl

Re: add CJKTokenizer to solr

2007-01-29 Thread James liu
he is ok now. -- regards jl

Re: add CJKTokenizer to solr

2007-01-29 Thread Erik Hatcher
hoss++ On Jan 29, 2007, at 3:43 PM, Chris Hostetter wrote: : >I realized that Solr does not have the CJK package, but how can I : > add it : > in? : : You need to add the analyzers JAR from Lucene's contrib area to your : Solr application, under WEB-INF/lib. You can get that JAR from the :

Re: add CJKTokenizer to solr

2007-01-29 Thread Chris Hostetter
: >I realized that Solr does not have the CJK package, but how can I : > add it : > in? : : You need to add the analyzers JAR from Lucene's contrib area to your : Solr application, under WEB-INF/lib. You can get that JAR from the : latest Lucene release distribution. it's actually easier than

Re: add CJKTokenizer to solr

2007-01-29 Thread Erik Hatcher
On Jan 29, 2007, at 1:08 AM, zha jimmy wrote: hi, all I am trying to configure Solr to support Chinese tokenization. I saw the tips in schema.xml: Then I modified schema.xml positionIncrementGap="100"> class="org.apache.lucene.analysis.cjk.CJKTokenizer "/>

add CJKTokenizer to solr

2007-01-28 Thread zha jimmy
hi, all I am trying to configure Solr to support Chinese tokenization. I saw the tips in schema.xml: Then I modified schema.xml : When I start Solr there is an error Caused by: java.lang.ClassNotFoundException: org.apache.lucene.analysis.cjk.CJ