RE: SmartChineseAnalyzer and stopwords.txt

2012-01-07 Thread Delbosc, Sylvain
Hello,

Has anyone used SmartChineseAnalyzer to index and search Chinese content?
I would like to discuss a few things.

Best Regards,
Sylvain

From: Delbosc, Sylvain [mailto:sylvain.delb...@capgemini.com]
Sent: Thursday, January 5, 2012 14:02
To: solr-user@lucene.apache.org
Cc: Delance, Quentin
Subject: SmartChineseAnalyzer and stopwords.txt

Hello,

I would like to know how to use stopwords with SmartChineseAnalyzer.
According to the documentation at
http://lucene.apache.org/java/2_9_0/api/contrib-smartcn/org/apache/lucene/analysis/cn/smart/SmartChineseAnalyzer.html
it seems to be possible, but I have not managed to make it work.

At present I define my analyzer like this, but the stopwords.txt file
located in the same directory as schema.xml does not seem to be taken into
account.
  

Has somebody managed to make this work?

NB: I am using Solr 1.4 with several cores.

Best Regards,
Sylvain DELBOSC/ Capgemini Sud / Toulouse
Application Architect Senior / TIC - ADC

Tel.: +33 5 61 31 55 70 / www.capgemini.com<http://www.capgemini.com/>
Fax: +33 5 61 31 53 85

15, avenue du Docteur Grynfogel
BP 53655 - 31036 Toulouse Cedex 1
Ensemble, libérons nos énergies.
Capgemini is a trading name used by the Capgemini Group of companies which 
includes Capgemini Sud, registered in Toulouse, France (RCS 479 766 990) whose 
registered office is 15 avenue du Dr Grynfogel - BP 53655 - 31036 Toulouse 
cedex 1.

This message contains information that may be privileged or confidential and is 
the property of the Capgemini Group. It is
intended only for the person to whom it is addressed. If you are not the 
intended recipient, you are not authorized to
read, print, retain, copy, disseminate, distribute, or use this message or any 
part thereof. If you receive this message
in error, please notify the sender immediately and delete all copies of this 
message.
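
On the stopwords question quoted above: as far as I can tell, an analyzer that is declared in schema.xml by class name alone is created through its no-argument constructor, so a stopwords.txt placed next to schema.xml is never handed to it. If I read the linked 2.9 javadoc correctly, SmartChineseAnalyzer also has a constructor that accepts a Set of stop words. The sketch below only covers the plain-Java half of that idea, turning the lines of a stopwords file into such a set. The class name StopWordFile is hypothetical, and the file format (one word per line, lines starting with "#" treated as comments) is an assumption, not something stated in this thread.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Lucene or Solr.
final class StopWordFile {

    // Turns the raw lines of a stopwords file into a set of stop words.
    // Assumed format: one word per line, surrounding whitespace ignored,
    // blank lines and lines starting with '#' skipped.
    static Set<String> parse(List<String> lines) {
        Set<String> words = new LinkedHashSet<>();
        for (String line : lines) {
            String word = line.trim();
            if (word.isEmpty() || word.startsWith("#")) {
                continue; // nothing to add for blanks and comments
            }
            words.add(word); // duplicates collapse because this is a set
        }
        return words;
    }
}
```

The lines could come from something like Files.readAllLines(Paths.get("conf/stopwords.txt"), StandardCharsets.UTF_8), where the path is only an example. The resulting set would then be passed to the stop-word constructor from a small custom Analyzer subclass that schema.xml references instead of SmartChineseAnalyzer itself. Note that the javadoc warns the set should include punctuation unless punctuation is meant to be indexed.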


SmartChineseAnalyzer and stopwords.txt

2012-01-05 Thread Delbosc, Sylvain
Hello,

I would like to know how to use stopwords with SmartChineseAnalyzer.
According to the documentation at
http://lucene.apache.org/java/2_9_0/api/contrib-smartcn/org/apache/lucene/analysis/cn/smart/SmartChineseAnalyzer.html
it seems to be possible, but I have not managed to make it work.

At present I define my analyzer like this, but the stopwords.txt file
located in the same directory as schema.xml does not seem to be taken into
account.
  

Has somebody managed to make this work?

NB: I am using Solr 1.4 with several cores.

Best Regards,
Sylvain DELBOSC/ Capgemini Sud / Toulouse
Application Architect Senior / TIC - ADC


Re: SmartChineseAnalyzer

2011-12-12 Thread Chris Hostetter

: Subject: SmartChineseAnalyzer
: References:
: 
:  
:  
: In-Reply-To:
: 

http://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists

When starting a new discussion on a mailing list, please do not reply to
an existing message; instead, start a fresh email. Even if you change the
subject line of your email, other mail headers still track which thread
you replied to, so your question is "hidden" in that thread and gets less
attention. It also makes following discussions in the mailing list archives
particularly difficult.



-Hoss


SmartChineseAnalyzer

2011-12-09 Thread waynelam

Hi all,

I checked the documentation for SmartChineseAnalyzer, and it looks like it
supports Simplified Chinese only.
Has anyone tried to make it handle Traditional Chinese characters as well?
Since the analyzer is based on a dictionary from ICTCLAS1.0, my first thought
is that I might get it to work by simply converting the whole dictionary to
Traditional Chinese.

By the way, I checked the official ICTCLAS website, and it seems the newest
version of the Java library supports GB2312, GBK, UTF-8, and BIG5.

So can I expect BIG5 support to appear on the SmartChineseAnalyzer roadmap
later?

Any hints would be much appreciated.



Regards,

Wayne
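
One alternative to converting the dictionary, offered only as a sketch: normalize Traditional text to Simplified before it reaches the analyzer, and apply the same step to queries. Traditional to Simplified is largely many-to-one (for example both 發 and 髮 become 发), whereas the reverse direction is one-to-many, which is what makes a blind dictionary conversion risky. The class name and the four-entry table below are hypothetical and for illustration only; a real table would hold thousands of pairs from a proper conversion resource.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene or ICTCLAS.
final class TraditionalToSimplified {

    // Tiny illustrative table: Traditional code point -> Simplified code point.
    private static final Map<Character, Character> TABLE = new HashMap<>();
    static {
        TABLE.put('\u9AD4', '\u4F53'); // U+9AD4 -> U+4F53 (body)
        TABLE.put('\u982D', '\u5934'); // U+982D -> U+5934 (head)
        TABLE.put('\u9AEE', '\u53D1'); // U+9AEE -> U+53D1 (hair)
        TABLE.put('\u767C', '\u53D1'); // U+767C -> U+53D1 (emit), same target as above
    }

    // Replaces every character found in the table and copies all others unchanged.
    static String normalize(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            Character mapped = TABLE.get(c);
            out.append(mapped != null ? mapped.charValue() : c);
        }
        return out.toString();
    }
}
```

With this table, normalize applied to 頭髮 gives 头发, and text containing no mapped characters passes through untouched. The trade-off is that stored and highlighted text needs care if users expect to see the original Traditional forms.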


RE: Error while indexing using SmartChineseAnalyzer

2009-09-01 Thread Jana, Kumar Raja
Thanks for the reply, Shalin.
I have posted the stack trace on the JIRA issue SOLR-1336.

-Kumar

-----Original Message-----
From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] 
Sent: Tuesday, September 01, 2009 4:56 PM
To: solr-user@lucene.apache.org
Subject: Re: Error while indexing using SmartChineseAnalyzer

On Tue, Sep 1, 2009 at 4:37 PM, Jana, Kumar Raja  wrote:

> Hi,
>
> I tried the patch attached to JIRA issue SOLR-1336 for integrating
> Lucene's SmartChineseAnalyzer with Solr, but I hit an AbstractMethodError
> during both indexing and searching (stack trace below).
>

Questions on patches are best asked on the issue. Please post the stack
trace to SOLR-1336.

-- 
Regards,
Shalin Shekhar Mangar.


Re: Error while indexing using SmartChineseAnalyzer

2009-09-01 Thread Shalin Shekhar Mangar
On Tue, Sep 1, 2009 at 4:37 PM, Jana, Kumar Raja  wrote:

> Hi,
>
> I tried the patch attached to JIRA issue SOLR-1336 for integrating
> Lucene's SmartChineseAnalyzer with Solr, but I hit an AbstractMethodError
> during both indexing and searching (stack trace below).
>

Questions on patches are best asked on the issue. Please post the stack
trace to SOLR-1336.

-- 
Regards,
Shalin Shekhar Mangar.


Error while indexing using SmartChineseAnalyzer

2009-09-01 Thread Jana, Kumar Raja
Hi,

I tried the patch attached to JIRA issue SOLR-1336 for integrating
Lucene's SmartChineseAnalyzer with Solr, but I hit an AbstractMethodError
during both indexing and searching (stack trace below). Something seems
to go wrong during tokenization of the content.

Can someone please tell me what I am doing wrong here?

Stack trace:

SEVERE: java.lang.AbstractMethodError
    at org.apache.solr.analysis.TokenizerChain.tokenStream(TokenizerChain.java:64)
    at org.apache.solr.schema.IndexSchema$SolrIndexAnalyzer.tokenStream(IndexSchema.java:360)
    at org.apache.lucene.analysis.Analyzer.reusableTokenStream(Analyzer.java:44)
    at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:123)
    at org.apache.lucene.index.DocFieldConsumersPerField.processFields(DocFieldConsumersPerField.java:36)
    at org.apache.lucene.index.DocFieldProcessorPerThread.processDocument(DocFieldProcessorPerThread.java:234)
    at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:762)
    at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:745)
    at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:2199)
    at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:2171)
    at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:218)
    at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:60)
    at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:140)
    at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1333)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)

 

Thanks,

Kumar
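
In general, an AbstractMethodError means that a class compiled against one version of a library is running against a different, incompatible version, so a mismatch between the jars the patch was built with and the jars deployed in the webapp is a plausible suspect here. That is a guess, not a diagnosis of SOLR-1336. The sketch below is one way to check which jar each class in the trace is actually loaded from. The class JarLocator and its method names are hypothetical; only standard JDK calls are used.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper, not part of Solr or Lucene.
final class JarLocator {

    // Returns the jar or directory a class was loaded from.
    // JDK classes have no code source, so a placeholder is returned for them.
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    // Looks a class up by name and reports where it came from, or why it could not be loaded.
    static String describe(String className) {
        try {
            return className + " <- " + locationOf(Class.forName(className));
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " <- not loadable: " + e;
        }
    }
}
```

Run from inside the same webapp (so that the same class loader is used), calling describe on names taken from the trace, such as org.apache.solr.analysis.TokenizerChain and org.apache.lucene.analysis.Analyzer, would show whether two different Lucene or Solr builds are being mixed.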