Re: NPE when adding a Document to an IndexWriter

2013-01-09 Thread Igal @ getRailo.org

I think that all I needed to create the components is:


@Override
protected Analyzer.TokenStreamComponents createComponents( String fieldName, Reader reader ) {

    Analyzer.TokenStreamComponents tsc = new Analyzer.TokenStreamComponents(
        getTokenFilterChain( reader, config )
    );

    return tsc;
}


I'll need to test it to know for sure though.

thanks,


Igal


On 1/9/2013 6:54 PM, Igal @ getRailo.org wrote:
hi Hoss -- thank you for your time.  it looks like you're right (and 
it makes sense if the reader is advanced in two places at the same 
time that it will cause a problem).


I'll try to figure out how to create an Analyzer out of the 
Tokenizer.  that's what I was trying to do there and obviously I did 
it wrong.


thanks again,


Igal


On 1/9/2013 6:28 PM, Chris Hostetter wrote:

: thanks for your reply.  please see attached.  I tried to maintain the
: structure of the code that I need to use in the library I'm building.  I think
: it should work for you as long as you remove the package declaration at the
: top.

I can't currently try your code, but skimming through it i'd bet money the
problem is in your Analyzer.  Have you tried simplifying your test down
and just using "StandardAnalyzer" to rule that out?

In particular i see this...

Analyzer.TokenStreamComponents tsc = new Analyzer.TokenStreamComponents(
   getCharTokenizer( reader )
 , getTokenFilterChain( reader, config )
 );

...passing the same Reader to two different methods there is almost certainly
not what you want to do.



-Hoss

-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org









Re: NPE when adding a Document to an IndexWriter

2013-01-09 Thread Igal @ getRailo.org
hi Hoss -- thank you for your time.  it looks like you're right (and it 
makes sense if the reader is advanced in two places at the same time 
that it will cause a problem).


I'll try to figure out how to create an Analyzer out of the Tokenizer.  
that's what I was trying to do there and obviously I did it wrong.


thanks again,


Igal


On 1/9/2013 6:28 PM, Chris Hostetter wrote:

: thanks for your reply.  please see attached.  I tried to maintain the
: structure of the code that I need to use in the library I'm building.  I think
: it should work for you as long as you remove the package declaration at the
: top.

I can't currently try your code, but skimming through it i'd bet money the
problem is in your Analyzer.  Have you tried simplifying your test down
and just using "StandardAnalyzer" to rule that out?

In particular i see this...


Analyzer.TokenStreamComponents tsc = new Analyzer.TokenStreamComponents(
   getCharTokenizer( reader )
 , getTokenFilterChain( reader, config )
 );

...passing the same Reader to two different methods there is almost certainly
not what you want to do.



-Hoss








Re: NPE when adding a Document to an IndexWriter

2013-01-09 Thread Chris Hostetter

: thanks for your reply.  please see attached.  I tried to maintain the
: structure of the code that I need to use in the library I'm building.  I think
: it should work for you as long as you remove the package declaration at the
: top.

I can't currently try your code, but skimming through it i'd bet money the 
problem is in your Analyzer.  Have you tried simplifying your test down 
and just using "StandardAnalyzer" to rule that out?

In particular i see this...

>>> Analyzer.TokenStreamComponents tsc = new Analyzer.TokenStreamComponents( 
>>>   getCharTokenizer( reader )
>>> , getTokenFilterChain( reader, config ) 
>>> );

...passing the same Reader to two different methods there is almost certainly 
not what you want to do.
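
For readers following along, the Reader-sharing problem can be reproduced without Lucene at all. The sketch below is illustrative only: the class name SharedReaderDemo and the drain helper are made up for this example, and plain java.io classes stand in for the tokenizer and the filter chain. It shows that once one consumer has read a Reader to the end, a second consumer handed the same Reader finds nothing left, whereas a single chain wrapped around one source sees all of the input.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class SharedReaderDemo {

    /** Reads whatever is still available from the given Reader. */
    public static String drain(Reader in) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[64];
        int n;
        while ((n = in.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Problem case: two independent consumers are handed the same Reader.
        Reader shared = new StringReader("word1 word2");
        String first = drain(shared);   // consumes all of the input
        String second = drain(shared);  // nothing is left for the second consumer
        System.out.println("first=[" + first + "] second=[" + second + "]");

        // Safer shape: one source, with each later stage wrapping the previous one.
        Reader source = new StringReader("word1 word2");
        Reader chain = new BufferedReader(source);
        System.out.println("chain=[" + drain(chain) + "]");
    }
}
```

The same shape carries over to an analyzer, assuming the filter chain can be built on top of the tokenizer: create one tokenizer from the Reader and let every later stage wrap that tokenizer rather than the Reader itself.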



-Hoss




Re: NPE when adding a Document to an IndexWriter

2013-01-09 Thread Igal @ getRailo.org
thanks for your reply.  please see attached.  I tried to maintain the 
structure of the code that I need to use in the library I'm building.  I 
think it should work for you as long as you remove the package 
declaration at the top.


when I run the attached file I get the following output:

debug:
Exception in thread "main" java.lang.NullPointerException
    at org.apache.lucene.analysis.util.CharacterUtils$Java5CharacterUtils.fill(CharacterUtils.java:191)
    at org.apache.lucene.analysis.util.CharTokenizer.incrementToken(CharTokenizer.java:153)
    at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:102)
    at org.apache.lucene.index.DocFieldProcessor.processDocument(DocFieldProcessor.java:307)
    at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:244)
    at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:373)
    at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1445)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1124)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1105)
    at s21waf.text.lucene4.TestNPE.testIndexWriter(TestNPE.java:47)
    at s21waf.text.lucene4.TestNPE.main(TestNPE.java:111)
Java Result: 1
BUILD SUCCESSFUL (total time: 0 seconds)

thanks,

Igal


On 1/9/2013 5:23 PM, Chris Hostetter wrote:

: I keep getting an NPE when trying to add a Doc to an IndexWriter. I've
: minimized my code to very basic code.  what am I doing wrong? pseudo-code:

can you post a full test that other people can run to try and reproduce?

it doesn't even have to be a junit test -- just some complete java code
people can paste into a main method and compile would be enough (right now we
have no idea what IndexWriterConfig you are using (could easily affect
things) or what directory you are using (less likely, but still))


-Hoss




package s21waf.text.lucene4;

import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.WhitespaceTokenizer;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;


public class TestNPE {
 

public static void testIndexWriter() throws IOException {

Directory dir = FSDirectory.open( new java.io.File( "F:/Test/Lucene/dir1" ) );

Document doc = new Document();

TextField ft;

ft = new TextField( "desc1", "word1", Field.Store.YES );
doc.add( ft );

ft = new TextField( "desc2", "word2", Field.Store.YES );
doc.add( ft );

Analyzer analyzer = createAnalyzer( TokenizerConfig.DEFAULT );

IndexWriterConfig iwc = new IndexWriterConfig( Version.LUCENE_40, analyzer );

IndexWriter iw = new IndexWriter( dir, iwc);

iw.addDocument(doc);

iw.close();
}


/** returns a WhitespaceTokenizerExt Tokenizer that strips html and 
replaces commas with comma-space */
public static Tokenizer getCharTokenizer( Reader input ) {

//Tokenizer result = new WhitespaceTokenizer( Version.LUCENE_40, getCharFilter( input ) );

Tokenizer result = new WhitespaceTokenizer( Version.LUCENE_40, input );

//Tokenizer result = new StandardTokenizer( Version.LUCENE_40, input );

return result;
}


/** return getTokenizer( new StringReader( input ) ); */
public static Tokenizer getCharTokenizer( String input ) {

return getCharTokenizer( new StringReader( input ) );
}


public static Analyzer createAnalyzer( final TokenizerConfig config ) {

Analyzer result = new Analyzer() {

@Override
protected Analyzer.TokenStreamComponents createComponents( String fieldName, Reader reader ) {

Analyzer.TokenStreamComponents tsc = new Analyzer.TokenStreamComponents(
  getCharTokenizer( reader )
, getTokenFilterChain( reader, config )
);


Re: NPE when adding a Document to an IndexWriter

2013-01-09 Thread Chris Hostetter

: I keep getting an NPE when trying to add a Doc to an IndexWriter. I've
: minimized my code to very basic code.  what am I doing wrong? pseudo-code:

can you post a full test that other people can run to try and reproduce?  

it doesn't even have to be a junit test -- just some complete java code 
people can paste into a main method and compile would be enough (right now we 
have no idea what IndexWriterConfig you are using (could easily affect 
things) or what directory you are using (less likely, but still))


-Hoss




NPE when adding a Document to an IndexWriter

2013-01-09 Thread Igal @ getRailo.org
I keep getting an NPE when trying to add a Doc to an IndexWriter. I've 
minimized my code to very basic code.  what am I doing wrong? pseudo-code:


Document doc = new Document();

TextField ft;

ft = new TextField( "desc1", "word1", Field.Store.YES );
doc.add( ft );

ft = new TextField( "desc2", "word2", Field.Store.YES );
doc.add( ft );// if I comment out this line then no NPE

IndexWriter iw = new IndexWriter( luceneDirectory, config );

iw.addDocument( doc );// <== throws NPE


Exception in thread "main" java.lang.NullPointerException
    at org.apache.lucene.analysis.util.CharacterUtils$Java5CharacterUtils.fill(CharacterUtils.java:191)
    at org.apache.lucene.analysis.util.CharTokenizer.incrementToken(CharTokenizer.java:153)
    at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:102)
    at org.apache.lucene.index.DocFieldProcessor.processDocument(DocFieldProcessor.java:307)
    at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:244)
    at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:373)
    at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1445)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1124)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1105)
    at s21waf.text.lucene4.Test1.testIndexWriter(Test1.java:71)
    at s21waf.text.lucene4.Test1.main(Test1.java:141)


NPE with StandardTokenizer:

Exception in thread "main" java.lang.NullPointerException
    at org.apache.lucene.analysis.standard.StandardTokenizerImpl.zzRefill(StandardTokenizerImpl.java:921)
    at org.apache.lucene.analysis.standard.StandardTokenizerImpl.getNextToken(StandardTokenizerImpl.java:1128)
    at org.apache.lucene.analysis.standard.StandardTokenizer.incrementToken(StandardTokenizer.java:179)
    at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:102)
    at org.apache.lucene.index.DocFieldProcessor.processDocument(DocFieldProcessor.java:307)
    at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:244)
    at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:373)
    at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1445)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1124)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1105)
    at s21waf.text.lucene4.Test1.testIndexWriter(Test1.java:75)
    at s21waf.text.lucene4.Test1.main(Test1.java:145)




Re: Upgrade Lucene to latest version (4.0) from 2.4.0

2013-01-09 Thread Steve Rowe
Of course you're free to do as you like - who will stop you? :)

The problem is the lack of a single place to look for detailed guidance on 
handling a long-distance upgrade like that.

But it's difficult to generalize here: the possible range in the level of 
difficulty involved is vast, depending on the amount of code involved, what 
parts of the API are being used, etc.  

On Jan 9, 2013, at 4:58 PM, saisantoshi  wrote:

> Are there any best practices that we can follow? We want to get to the latest
> version and are wondering if we can go directly from 2.4.0 to 4.x (as opposed
> to 2.x - 3.x and 3.x - 4.x), so that we save not only time but also a
> testing cycle at each migration hop.
> 
> Are there any limitations in directly upgrading from 2.x - 4.x? Is this
> allowed?
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Upgrade-Lucene-to-latest-version-4-0-from-2-4-0-tp4031956p4032038.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
> 
> 





Re: Upgrade Lucene to latest version (4.0) from 2.4.0

2013-01-09 Thread Glen Newton
I am in the process of upgrading LuSql from 2.x to 4.x and I am first
going to 3.6 as the jump to 4.x was too big.
 I would suggest this to you. I think it is less work.
Of course I am also able to offer LuSql to 3.6 users, so this is
slightly different from your case.

-Glen

On Wed, Jan 9, 2013 at 4:58 PM, saisantoshi  wrote:
> Are there any best practices that we can follow? We want to get to the latest
> version and are wondering if we can go directly from 2.4.0 to 4.x (as opposed
> to 2.x - 3.x and 3.x - 4.x), so that we save not only time but also a
> testing cycle at each migration hop.
>
> Are there any limitations in directly upgrading from 2.x - 4.x? Is this
> allowed?
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Upgrade-Lucene-to-latest-version-4-0-from-2-4-0-tp4031956p4032038.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>



-- 
-
http://zzzoot.blogspot.com/
-




Re: Upgrade Lucene to latest version (4.0) from 2.4.0

2013-01-09 Thread saisantoshi
Are there any best practices that we can follow? We want to get to the latest
version and are wondering if we can go directly from 2.4.0 to 4.x (as opposed
to 2.x - 3.x and 3.x - 4.x), so that we save not only time but also a
testing cycle at each migration hop.

Are there any limitations in directly upgrading from 2.x - 4.x? Is this
allowed?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Upgrade-Lucene-to-latest-version-4-0-from-2-4-0-tp4031956p4032038.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.




Re: Upgrade Lucene to latest version (4.0) from 2.4.0

2013-01-09 Thread Igal @ getRailo.org
as mentioned before -- I'm not an expert on Lucene (far from it) -- but 
it seems to me like each migration step will take an almost equal amount 
of work, so if I were you I'd rethink this plan and consider migrating to 4.0



Igal


On 1/9/2013 1:08 PM, saisantoshi wrote:

Is there any migration guide from 2.x to 3.x? ( as per the suggestion, i
would like to upgrade first from 2.4.0 to 2.9.0 and from 2.9.0 to 3.6) and
later we decide if we want to upgrade from 3.6 to 4.x version?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Upgrade-Lucene-to-latest-version-4-0-from-2-4-0-tp4031956p4032029.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.








Re: Upgrade Lucene to latest version (4.0) from 2.4.0

2013-01-09 Thread Steve Rowe
I don't think there is a migration guide from 2.X to 3.X, other than the 
specific information in the release notes.

If you start reading CHANGES.txt at version 3.0.0, and then each later 
release's notes after that, especially the sections "Changes in backwards 
compatibility policy", e.g. for 3.0.0: 
,
 and "API Changes", you'll get a good picture of the changes you need to make 
to your code.

On Jan 9, 2013, at 4:08 PM, saisantoshi  wrote:

> Is there any migration guide from 2.x to 3.x? ( as per the suggestion, i
> would like to upgrade first from 2.4.0 to 2.9.0 and from 2.9.0 to 3.6) and
> later we decide if we want to upgrade from 3.6 to 4.x version?
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Upgrade-Lucene-to-latest-version-4-0-from-2-4-0-tp4031956p4032029.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
> 
> 





Re: Upgrade Lucene to latest version (4.0) from 2.4.0

2013-01-09 Thread saisantoshi
Is there any migration guide from 2.x to 3.x? ( as per the suggestion, i
would like to upgrade first from 2.4.0 to 2.9.0 and from 2.9.0 to 3.6) and
later we decide if we want to upgrade from 3.6 to 4.x version?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Upgrade-Lucene-to-latest-version-4-0-from-2-4-0-tp4031956p4032029.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.




Re: Upgrade Lucene to latest version (4.0) from 2.4.0

2013-01-09 Thread Steve Rowe
Sai,

For the transition from 2.X to 3.X, I recommend compiling your code against the 
latest 2.9.X version (2.9.4), looking at the deprecation messages, and making 
changes until these are all addressed and compilation no longer produces 
deprecation messages.  Once that's done, your code should compile against 3.0.X.

Unlike the 2.X->3.X transition, there was no "bridge" version that simplifies 
the upgrade path from 3.X->4.0.  However, MIGRATE.txt contains a detailed list 
of upgrade notes from 3.X to 4.0.  See below for an HTML version of this file, 
called the "Migration Guide".

Check out the 4.0 "Changes" and "Migration Guide" (under "Reference 
Documents"): 

 

(note that the 4.0 Changes page contains release notes for *all* previous 
versions.)

Steve

On Jan 9, 2013, at 2:21 PM, saisantoshi  wrote:

> Thanks. Could you please elaborate on what is needed other than replacing the
> jars? Are the jars listed is the only jars or any additional jars required?
> 
> Is the API not backward compatible? I mean to say whatever the API calls we
> are using in 2.4.0 is not supported by 4.0? Has the signature modified in
> 4.x version?
> 
> Thanks,
> Sai.
> 
> 
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Upgrade-Lucene-to-latest-version-4-0-from-2-4-0-tp4031956p4031974.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
> 
> 





RE: Upgrade Lucene to latest version (4.0) from 2.4.0

2013-01-09 Thread Paul Hill
My guess is that upgrading to 3.6 to cover the _mostly_ upward compatible 
changes to that point (Fieldable vs. Field) might make a worthwhile 
intermediate step.

Then test that to make sure it is working, using whatever tests you have. Then 
work out the "real" changes to 4.0.
That is only a thought, because I haven't upgraded to 4.0 yet.

-Paul

> -Original Message-
> From: Igal Sapir [mailto:i...@getrailo.org]
> Sent: Wednesday, January 09, 2013 11:42 AM
> To: java-user@lucene.apache.org
> Subject: Re: Upgrade Lucene to latest version (4.0) from 2.4.0
> 
> I cannot elaborate much myself as there are many changes and I'm not an 
> expert on Lucene.
> 
> I can tell you though that many signatures have changed as well as package 
> names.
> 
> There were many API changes even between 3.6 and 4.0
> 
> --
> typos, misspels, and other weird words brought to you courtesy of my mobile 
> device.
> On Jan 9, 2013 11:21 AM, "saisantoshi"  wrote:
> 
> > Thanks. Could you please elaborate on what is needed other than
> > replacing the jars? Are the jars listed is the only jars or any
> > additional jars required?
> >
> > Is the API not backward compatible? I mean to say whatever the API
> > calls we are using in 2.4.0 is not supported by 4.0? Has the signature
> > modified in 4.x version?
> >
> > Thanks,
> > Sai.
> >
> >
> >
> >
> >
> > --
> > View this message in context:
> > http://lucene.472066.n3.nabble.com/Upgrade-Lucene-to-latest-version-4-
> > 0-from-2-4-0-tp4031956p4031974.html
> > Sent from the Lucene - Java Users mailing list archive at Nabble.com.
> >
> >
> >




Re: Upgrade Lucene to latest version (4.0) from 2.4.0

2013-01-09 Thread Igal Sapir
I cannot elaborate much myself as there are many changes and I'm not an
expert on Lucene.

I can tell you though that many signatures have changed as well as package
names.

There were many API changes even between 3.6 and 4.0

--
typos, misspels, and other weird words brought to you courtesy of my mobile
device.
On Jan 9, 2013 11:21 AM, "saisantoshi"  wrote:

> Thanks. Could you please elaborate on what is needed other than replacing
> the
> jars? Are the jars listed is the only jars or any additional jars required?
>
> Is the API not backward compatible? I mean to say whatever the API calls we
> are using in 2.4.0 is not supported by 4.0? Has the signature modified in
> 4.x version?
>
> Thanks,
> Sai.
>
>
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Upgrade-Lucene-to-latest-version-4-0-from-2-4-0-tp4031956p4031974.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
>


Re: Upgrade Lucene to latest version (4.0) from 2.4.0

2013-01-09 Thread saisantoshi
Thanks. Could you please elaborate on what is needed other than replacing the
jars? Are the jars listed the only ones needed, or are additional jars required?

Is the API not backward compatible? I mean, are the API calls we
are using in 2.4.0 no longer supported by 4.0? Have the signatures been modified
in the 4.x version?

Thanks,
Sai.





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Upgrade-Lucene-to-latest-version-4-0-from-2-4-0-tp4031956p4031974.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org



Re: Upgrade Lucene to latest version (4.0) from 2.4.0

2013-01-09 Thread Bin Lan
We recently went through the same process. We upgraded our indexing service
from 1.9.1 to 3.6.1. Unfortunately, the process is not as easy as you
thought. Besides replacing the jar files, you also need to change your code
to adapt to the new API. There are many changes; the most important ones are
in the IndexWriter, IndexReader, and IndexSearcher classes.


Regards
--
Bin Lan
Software Developer
Perimeter E-Security
O - (203)541-3412

Follow Us on Twitter: www.twitter.com/PerimeterNews
Read Our Blog: security.perimeterusa.com/blog





On Wed, Jan 9, 2013 at 2:04 PM, saisantoshi  wrote:

> We have an existing application which uses Lucene 2.4.0 version. We are
> thinking of upgrading it to the latest version (4.0). I am not sure the
> process
> involved in upgrading to latest version. Is it just copying of the jars? If
> yes, what are all the jars that we need to copy over. Will it be backward
> compatible? Any additional measures we need to take care of?
>
> With version 2.4.0, we have the following jars:
> lucene-core-2.4.0.jar
> lucene-analyzers-2.4.0.jar
>
> With 4.0:
> lucene-analyzers-common-4.0.0.jar
> lucene-analyzers-icu-4.0.0.jar
> lucene-core-4.0.0.jar
>
> Any advice on this is much appreciated.
>
> Thanks,
> Sai.
>
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Upgrade-Lucene-to-latest-version-4-0-from-2-4-0-tp4031956.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
>






Re: Upgrade Lucene to latest version (4.0) from 2.4.0

2013-01-09 Thread Igal @ getRailo.org
the API has changed much over time so I suspect that it will take more 
than replacing the jars.



On 1/9/2013 11:04 AM, saisantoshi wrote:

We have an existing application which uses Lucene 2.4.0 version. We are
thinking of upgrading it to the latest version (4.0). I am not sure the process
involved in upgrading to latest version. Is it just copying of the jars? If
yes, what are all the jars that we need to copy over. Will it be backward
compatible? Any additional measures we need to take care of?

With version 2.4.0, we have the following jars:
lucene-core-2.4.0.jar
lucene-analyzers-2.4.0.jar

With 4.0:
lucene-analyzers-common-4.0.0.jar
lucene-analyzers-icu-4.0.0.jar
lucene-core-4.0.0.jar

Any advice on this is much appreciated.

Thanks,
Sai.




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Upgrade-Lucene-to-latest-version-4-0-from-2-4-0-tp4031956.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.








Upgrade Lucene to latest version (4.0) from 2.4.0

2013-01-09 Thread saisantoshi
We have an existing application which uses Lucene version 2.4.0. We are
thinking of upgrading it to the latest version (4.0). I am not sure of the process
involved in upgrading to the latest version. Is it just a matter of copying the jars? If
yes, which jars do we need to copy over? Will it be backward
compatible? Are there any additional measures we need to take care of?

With version 2.4.0, we have the following jars:
lucene-core-2.4.0.jar
lucene-analyzers-2.4.0.jar

With 4.0:
lucene-analyzers-common-4.0.0.jar
lucene-analyzers-icu-4.0.0.jar
lucene-core-4.0.0.jar

Any advice on this is much appreciated.

Thanks,
Sai.




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Upgrade-Lucene-to-latest-version-4-0-from-2-4-0-tp4031956.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.




Re: Cannot instantiate SPI class

2013-01-09 Thread Igal @ getRailo.org

hi everybody,

I figured it out.  the problem was that I was using a "custom" jar to 
deploy this along with other libs that I use in my application. so at 
the end of my build.xml I create a jar file with all the required libs.


the problem was that I was adding lucene-core.jar with a filter of 
includes="**/*.class" and I guess there are some required files there 
that I was missing. removing that filter fixed the problem
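
A jar built with only **/*.class would leave out the non-class files under META-INF/services, which is where Java's service-provider (SPI) mechanism looks up implementations; Lucene 4.0 registers its codecs that way. The sketch below is a hedged way to check whether such registration files are visible on the classpath. The class name ServiceFileCheck and the method findServiceFiles are made up for this example, and the default service name assumes the Lucene 4.0 codec interface org.apache.lucene.codecs.Codec.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

public class ServiceFileCheck {

    /** Lists every META-INF/services registration file visible for the given service name. */
    public static List<URL> findServiceFiles(String serviceName) throws IOException {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        return Collections.list(loader.getResources("META-INF/services/" + serviceName));
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the service of interest is Lucene's Codec SPI; pass another name to override.
        String service = args.length > 0 ? args[0] : "org.apache.lucene.codecs.Codec";
        List<URL> files = findServiceFiles(service);
        if (files.isEmpty()) {
            System.out.println("No registration file found for " + service);
        } else {
            for (URL url : files) {
                System.out.println("Found: " + url);
            }
        }
    }
}
```

Run against the repackaged jar, an empty result would suggest that the build filter dropped the registration files.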




thanks everyone,


Igal


On 1/9/2013 2:12 AM, Igal Sapir wrote:


Thanks, I'll do that.

p.s. -- that was http://getrailo.org -- 'auto-correct' messed it up ;-)

--
typos, misspels, and other weird words brought to you courtesy of my 
mobile device.


On Jan 9, 2013 2:08 AM, "Nick Burch" wrote:


On Wed, 9 Jan 2013, Igal Sapir wrote:

The syntax is CFML / CFScript (ColdFusion Script).  Railo is an open
source, high performance, ColdFusion server. http://getrailo.arg/

I will re-download the Lucene jars and try again.  I'll let you know
what I find.


It may be worth double-checking that you don't have any older
lucene jars kicking around your classpath confusing things. I've
not used CF in a while, but when I last did we'd often get caught
out by an old version of a jar we were introducing already being
shipped with CF. You can fairly easily (via the classloader +
getresource) work out which jar a given class file is coming from,
you should use that to verify it's the jar you were expecting!
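
The classloader check described above can be sketched in a few lines of plain Java. The class name WhichJar and the method locate are made up for this example; the only library call relied on is Class.getResource.

```java
import java.net.URL;

public class WhichJar {

    /** Returns the URL the class file was loaded from, or null if it cannot be resolved. */
    public static URL locate(Class<?> type) {
        // Turn e.g. java.lang.String into /java/lang/String.class
        String resource = "/" + type.getName().replace('.', '/') + ".class";
        return type.getResource(resource);
    }

    public static void main(String[] args) {
        // A JDK class is used here so the example runs without extra jars.
        System.out.println(locate(String.class));
    }
}
```

With Lucene on the classpath, passing a Lucene class such as IndexWriter.class instead should print a jar: URL naming the jar file it came from, which can then be compared with the jar you expected.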

Nick







RE: Is StandardAnalyzer good enough for multi languages...

2013-01-09 Thread Paul Hill
There is often the possibility to put another tokenizer in the chain to create 
a variant analyzer.  This is NOT very hard at all in either Lucene or 
ElasticSearch. 
Extra tokenizers can often be used to tweak the overall processing, adding a 
late tokenization to make up for an overlooked one (breaking on colons would 
be a simple example).  Adding a tokenizer before others can change a token that 
seems incorrectly processed into one that is handled the way you like.

Trejkaz, I haven't tried to use ICU yet, but from what I understand, I think you'll 
find that ICU is more in agreement with your views and embraces the idea of 
refining the tokenization etc. as needed, not relying on the curious (and often 
flawed) choices of some design committee somewhere.

 [ICU]
> -Original Message-
> ... no specialisation for straight Roman script, but I guess it could
> always be added.

That would be one of the main points of the whole ICU infrastructure.

-Paul 




Re: Is StandardAnalyzer good enough for multi languages...

2013-01-09 Thread saisantoshi
Thanks for all the responses. From the above, it sounds like there are two
options.

1. Use ICUTokenizer (is it in Lucene 4.0 or 4.1?). If it's in 4.1, then we
cannot use it at this time, as it has not been released yet.

2. Write a custom analyzer by extending StandardAnalyzer and add filters
for additional languages. 

The problem that we are facing currently is described in detail at: 

http://lucene.472066.n3.nabble.com/Lucene-support-for-multi-byte-characters-2-4-0-version-td4031654.html

  
Just to summarize, we are facing some issues tokenizing some Japanese
keyword characters (while uploading documents, we have some keywords
where people can type in any language) and as a result, searching with such
keywords is not working with the StandardAnalyzer (2.4.0
version).

Can you suggest any filter for this that can be integrated into StandardAnalyzer?

Thanks,
Sai.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Is-StandardAnalyzer-good-enough-for-multi-languages-tp4031660p4031942.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.




Re: extensive minor garbage collection when using RAMDirectory on Lucune 3.6.2

2013-01-09 Thread Alon Muchnick
attaching the second screen shot of live recorded objects .

thanks again

Alon


On Wed, Jan 9, 2013 at 7:34 PM, Alon Muchnick  wrote:

> hi,
> after upgrading to Lucene 3.6.2 I noticed extensive minor garbage
> collection activity: once or twice a second, with about 600 MB of
> memory freed each time, under a load of 60 searches per
> second:
>
> 2013-01-09T18:57:24.350+0200: 174200.121: [GC [PSYoungGen:
> 630064K->544K(630336K)] 1405948K->776747K(1660992K), 0.0084250 secs]
> [Times: user=0.03 sys=0.00, real=0.01 secs]
> 2013-01-09T18:57:24.785+0200: 174200.555: [GC [PSYoungGen:
> 629920K->704K(630336K)] 1406123K->777083K(1660992K), 0.0066510 secs]
> [Times: user=0.04 sys=0.00, real=0.01 secs]
>
> after some investigation, the cause seemed to be Lucene related. we are
> using a RAMDirectory (with about 7 documents) on top of Tomcat 7.
>
> so I took the Lucene source code and integrated it into our application, then
> ran JProfiler under load for a minute and checked the "application
> call tree" for "Garbage collected objects" only.
> I found that the 2 Lucene methods that produce the most garbage-collected
> objects are (*screen shot attached*):
>
> (we have several indexes; the below is from one of them)
>
>
>- 23.2% - 215 MB - 48,252 alloc.
>org.apache.lucene.search.ExactPhraseScorer.
>
>
>- 6.3% - 60,481 kB - 728,414 alloc.
>org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer
>
>
>
> looking at the currently live objects, I see the top 2 classes by instance
> count are (a screen shot is also attached):
>
>  org.apache.lucene.search.BooleanScorer$Bucket (instance count) 1311699  (size) 41974368
>
>  org.apache.lucene.search.ScoreDoc (instance count) 45846  (size) 1100304
>
> we have a system with an earlier version (Tomcat 5.5 and Lucene 3.0.3; all
> our other code / jars are the same) which handles the same load and has the
> same indexes, and there the minor GC occurs once every 1-2 seconds and the
> amount of memory being freed is about 60 MB.
>
> is this behavior normal?
> should a search really generate so many short-lived objects (about
> 5 MB per search)?
> should we try some other RAM-based index? (I know
> InstantiatedIndex is deprecated)
>
> thanks in advance .
>
> Alon
>
>
>


Fwd: extensive minor garbage collection when using RAMDirectory on Lucune 3.6.2

2013-01-09 Thread Alon Muchnick
hi,
after upgrading to Lucene 3.6.2 I noticed extensive minor garbage
collection activity: once or twice a second, with about 600 MB of
memory freed each time, under a load of 60 searches per
second:

2013-01-09T18:57:24.350+0200: 174200.121: [GC [PSYoungGen:
630064K->544K(630336K)] 1405948K->776747K(1660992K), 0.0084250 secs]
[Times: user=0.03 sys=0.00, real=0.01 secs]
2013-01-09T18:57:24.785+0200: 174200.555: [GC [PSYoungGen:
629920K->704K(630336K)] 1406123K->777083K(1660992K), 0.0066510 secs]
[Times: user=0.04 sys=0.00, real=0.01 secs]
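
As a rough cross-check of the figures in the log above, the young-generation numbers can be pulled out of each line and subtracted. This is only a sketch: the class name YoungGenFreed and the regular expression are assumptions made for illustration, and they only handle the PSYoungGen format shown here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class YoungGenFreed {
    // Matches "PSYoungGen: <before>K-><after>K(<capacity>K)".
    private static final Pattern YOUNG =
            Pattern.compile("PSYoungGen:\\s*(\\d+)K->(\\d+)K\\((\\d+)K\\)");

    /** Returns how many KB the young generation shrank by in one GC log line. */
    static long freedKb(String gcLine) {
        Matcher m = YOUNG.matcher(gcLine);
        if (!m.find()) {
            throw new IllegalArgumentException("no PSYoungGen entry in: " + gcLine);
        }
        long before = Long.parseLong(m.group(1));
        long after = Long.parseLong(m.group(2));
        return before - after;
    }

    public static void main(String[] args) {
        // The two log entries from the message, each joined onto a single line.
        String[] lines = {
            "[GC [PSYoungGen: 630064K->544K(630336K)] 1405948K->776747K(1660992K), 0.0084250 secs]",
            "[GC [PSYoungGen: 629920K->704K(630336K)] 1406123K->777083K(1660992K), 0.0066510 secs]"
        };
        for (String line : lines) {
            long kb = freedKb(line);
            System.out.printf("young gen freed: %d KB (~%.0f MB)%n", kb, kb / 1024.0);
        }
    }
}
```

For the two lines quoted above this gives 629520 KB and 629216 KB, i.e. roughly 615 MB reclaimed from the young generation per collection.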

after some investigation, the cause seemed to be Lucene related. we are
using a RAMDirectory (with about 7 documents) on top of Tomcat 7.

so I took the Lucene source code and integrated it into our application, then
ran JProfiler under load for a minute and checked the "application
call tree" for "Garbage collected objects" only.
I found that the 2 Lucene methods that produce the most garbage-collected
objects are (*screen shot attached*):

(we have several indexes; the below is from one of them)


   - 23.2% - 215 MB - 48,252 alloc.
   org.apache.lucene.search.ExactPhraseScorer.


   - 6.3% - 60,481 kB - 728,414 alloc.
   org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer



looking at the currently live objects, I see the top 2 classes by instance
count are (a screen shot is also attached):

 org.apache.lucene.search.BooleanScorer$Bucket (instance count) 1311699  (size) 41974368

 org.apache.lucene.search.ScoreDoc (instance count) 45846  (size) 1100304

we have a system with an earlier version (Tomcat 5.5 and Lucene 3.0.3; all
our other code / jars are the same) which handles the same load and has the
same indexes, and there the minor GC occurs once every 1-2 seconds and the
amount of memory being freed is about 60 MB.

is this behavior normal?
should a search really generate so many short-lived objects (about
5 MB per search)?
should we try some other RAM-based index? (I know
InstantiatedIndex is deprecated)

thanks in advance .

Alon


Re: FuzzyQuery in lucene 4.0

2013-01-09 Thread Jack Krupansky
FWIW, new FuzzyQuery(term, 2 ,0) is the same as new FuzzyQuery(term), given 
the current values of defaultMaxEdits (2) and defaultPrefixLength (0).
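
To make the two numbers concrete: the second argument is the maximum number of edits and the third is the length of the prefix that must match exactly. The "0-2" in the code comment most likely refers to the range of maxEdits that FuzzyQuery accepts in 4.0. The plain-Java sketch below only illustrates what the parameters mean; it is not how Lucene implements FuzzyQuery (which, as far as I know, uses automata and by default also counts a transposition as one edit), and the names FuzzyParamsSketch, editDistance and matches are made up for this example.

```java
class FuzzyParamsSketch {
    /** Classic Levenshtein distance: insertions, deletions, substitutions. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty string
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev; // reuse the two rows
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** True if candidate shares the required prefix and is within maxEdits of query. */
    static boolean matches(String query, String candidate, int maxEdits, int prefixLength) {
        int p = Math.min(prefixLength, query.length());
        if (!candidate.startsWith(query.substring(0, p))) {
            return false;
        }
        return editDistance(query, candidate) <= maxEdits;
    }

    public static void main(String[] args) {
        System.out.println(matches("lucene", "lucune", 1, 0));  // one substitution
        System.out.println(matches("lucene", "bucene", 2, 1));  // first letter differs
        System.out.println(matches("kitten", "sitting", 2, 0)); // three edits needed
    }
}
```

With maxEdits = 2 and prefixLength = 0, any indexed term within two edits of the query term is a candidate, regardless of its first letters.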


-- Jack Krupansky

-Original Message- 
From: Ian Lea

Sent: Wednesday, January 09, 2013 9:44 AM
To: java-user@lucene.apache.org
Subject: Re: FuzzyQuery in lucene 4.0

See the javadocs for FuzzyQuery to see what the parameters are.  I
can't tell you what the comment means.  Possible values to try maybe?


--
Ian.


On Wed, Jan 9, 2013 at 2:34 PM, algebra  wrote:

That's true Ian, the code is good.

The only thing that I don't understand is this line:

Query query = new FuzzyQuery(term, 2 ,0); //0-2

What does 0 to 2 mean?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/FuzzyQuery-in-lucene-4-0-tp4031871p4031879.html

Sent from the Lucene - Java Users mailing list archive at Nabble.com.




Re: Lucene for a linguistic corpus

2013-01-09 Thread Wu, Stephen T., Ph.D.
>> For example, in the phrase "A man saw an elephant", "saw" has annotations as
>> follows (we also say that its position in index is 1234):
>> 
>> {lemma: see, pos: verb, tense: past}, {lemma: saw, pos: noun, number:
>> singular}
>> 
>> I think, it would be more effective to insert parse index in each attribute's
>> posting list entry as a payload and use it at the intersection stage. E.g.,
>> we have a posting list for 'pos = Verb' like ...|...|1.1234|...|..., and a
>> posting list for 'number = Singular': ...|...|2.1234|...|... While processing
>> a query like 'pos = Verb AND number = singular' at all stages of posting list
>> processing 'x.1234' will be accepted until the intersection stage at which
>> they will be rejected because of non-corresponding parse indexes.
We're working on something very similar.
Are there really posting lists like this (e.g., separate lists for pos=Verb,
number=Singular) for things in Payloads?  I think some previous discussion
was saying this kind of posting list is not available.  I couldn't find
anything like that in the documentation about the index format. If there
are, this would be really efficient.

> You might be able to insert your parses as payloads on a term and then
> implement a scorer extension (override computePayloadFactor) to handle your
> join cases for a given word.  You may also need to extend PayloadQuery or
> PayloadTermQuery.  Note, I don't know how well this will perform.
We've done it this way before, storing a slightly different set of
information in the Payload.  I thought making use of a Payload, though,
requires you to iterate through all the tokens, whether in the Analyzer
(i.e., in a TokenFilter) or Similarity (in an overridden scorePayload()
function).

If I'm right, then filtering this out at intersection time might not be
quite as efficient as you're talking about, but it's definitely a reasonable
way to do it.

stephen





Re: FuzzyQuery in lucene 4.0

2013-01-09 Thread Ian Lea
See the javadocs for FuzzyQuery to see what the parameters are.  I
can't tell you what the comment means.  Possible values to try maybe?


--
Ian.


On Wed, Jan 9, 2013 at 2:34 PM, algebra  wrote:
> That's true Ian, the code is good.
>
> The only thing that I don't understand is this line:
>
> Query query = new FuzzyQuery(term, 2 ,0); //0-2
>
> What does 0 to 2 mean?
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/FuzzyQuery-in-lucene-4-0-tp4031871p4031879.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>




Re: FuzzyQuery in lucene 4.0

2013-01-09 Thread algebra
That's true Ian, the code is good.

The only thing that I don't understand is this line:

Query query = new FuzzyQuery(term, 2 ,0); //0-2

What does 0 to 2 mean?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/FuzzyQuery-in-lucene-4-0-tp4031871p4031879.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.




Re: FuzzyQuery in lucene 4.0

2013-01-09 Thread Ian Lea
What adjustments did you make?  One of them might be to blame.

But at a glance the code looks fine to me.  In what way is it not
working?  Care to provide any input/output/details of what
does/doesn't work?


--
Ian.


On Wed, Jan 9, 2013 at 2:03 PM, algebra  wrote:
> I was using Lucene 3.6 and my function worked well. Then I changed the
> version of Lucene to 4.0 and made some adjustments, and now my function is not
> working. Can someone tell me what I'm doing wrong?
>
>  public List  fuzzyLuceneList(List list, String s) throws
> CorruptIndexException, LockObtainFailedException, IOException,
> ParseException {
> List listr = new ArrayList();
>
> Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_40);
>
> Directory directory = new RAMDirectory();
>
> IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_40,
> analyzer);
>
> IndexWriter iwriter = new IndexWriter(directory, config);
>
> Document doc;
>
> for (int i = 0; i < list.size(); i++) {
> doc = new Document();
> doc.add(new Field("fieldname", list.get(i), Field.Store.YES,
> Field.Index.ANALYZED));
> iwriter.addDocument(doc);
> }
>
> iwriter.close();
>
> IndexReader reader = IndexReader.open(directory);
> IndexSearcher isearcher = new IndexSearcher(reader);
>
> Term term = new Term("fieldname", s);
>
>
> Query query = new FuzzyQuery(term, 1,0);// 0-2
>
> TopScoreDocCollector collector = TopScoreDocCollector.create(10,
> true);
> isearcher.search(query, collector);
>
> ScoreDoc[] hits = collector.topDocs().scoreDocs;
> for (int i = 0; i < hits.length; i++) {
> Document hitDoc = isearcher.doc(hits[i].doc);
> listr.add(hitDoc.get("fieldname"));
> }
>
> //isearcher.close();
> directory.close();
>
> return listr;
> }
>
> Thanks!
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/FuzzyQuery-in-lucene-4-0-tp4031871.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>




FuzzyQuery in lucene 4.0

2013-01-09 Thread algebra
I was using Lucene 3.6 and my function worked well. Then I changed the
version of Lucene to 4.0 and made some adjustments, and now my function is not
working. Can someone tell me what I'm doing wrong?

public List<String> fuzzyLuceneList(List<String> list, String s) throws
        CorruptIndexException, LockObtainFailedException, IOException,
        ParseException {
    List<String> listr = new ArrayList<String>();

    Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_40);

    Directory directory = new RAMDirectory();

    IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_40,
            analyzer);

    IndexWriter iwriter = new IndexWriter(directory, config);

    Document doc;

    for (int i = 0; i < list.size(); i++) {
        doc = new Document();
        doc.add(new Field("fieldname", list.get(i), Field.Store.YES,
                Field.Index.ANALYZED));
        iwriter.addDocument(doc);
    }

    iwriter.close();

    IndexReader reader = IndexReader.open(directory);
    IndexSearcher isearcher = new IndexSearcher(reader);

    Term term = new Term("fieldname", s);

    Query query = new FuzzyQuery(term, 1, 0); // 0-2

    TopScoreDocCollector collector = TopScoreDocCollector.create(10,
            true);
    isearcher.search(query, collector);

    ScoreDoc[] hits = collector.topDocs().scoreDocs;
    for (int i = 0; i < hits.length; i++) {
        Document hitDoc = isearcher.doc(hits[i].doc);
        listr.add(hitDoc.get("fieldname"));
    }

    //isearcher.close();
    directory.close();

    return listr;
}

Thanks!



--
View this message in context: 
http://lucene.472066.n3.nabble.com/FuzzyQuery-in-lucene-4-0-tp4031871.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.




Re: getting the offset of hits in a search

2013-01-09 Thread Itai Peleg
Great! I'll look into that.

Thanks!


2013/1/9 김한규 

> Try SpanTermQuery's getSpans() function. It returns a Spans object which you
> can iterate through to find the position of every hit in every document.
>
> http://lucene.apache.org/core/4_0_0/core/org/apache/lucene/search/spans/SpanTermQuery.html
>
> 2013/1/9 Itai Peleg 
>
> > Hi,
> >
> > I'm new to Lucene, and I'm having some problems with the search. I've been
> > debugging Lucene in order to find out how I can get, for each document,
> > its "hits" (the terms it contains that were in the query) and their
> > offsets. Can anyone give me some pointers?
> >
> > Thanks in advance,
> > Itai
> >
>


Re: getting the offset of hits in a search

2013-01-09 Thread 김한규
Try SpanTermQuery's getSpans() function. It returns a Spans object which you
can iterate through to find the position of every hit in every document.
http://lucene.apache.org/core/4_0_0/core/org/apache/lucene/search/spans/SpanTermQuery.html

2013/1/9 Itai Peleg 

> Hi,
>
> I'm new to Lucene, and I'm having some problems with the search. I've been
> debugging Lucene in order to find out how I can get, for each document,
> its "hits" (the terms it contains that were in the query) and their
> offsets. Can anyone give me some pointers?
>
> Thanks in advance,
> Itai
>


Re: how much blocksize is set in lucene.

2013-01-09 Thread hujing
The index library must be stored on a hard disk.  
Sent from Huawei Mobile

Ian Lea wrote:

>What do you mean by lucene blocksize?  What version of lucene are you using?
>
>A good general principle is to start with the defaults and only worry
>if there is a problem.
>
>
>--
>Ian.
>
>
>On Wed, Jan 9, 2013 at 8:51 AM, seacathello  wrote:
>> now I am indexing a very large number of email files, about 50m, and each email file is about
>> 4-50k.
>> the index lib size is about 1TB, segment size is only.
>>
>> In this index lib, which blocksize should I choose?
>> 4k or 512k, which choice is better?
>> Thanks very much.
>>
>>
>>
>> --
>> View this message in context: 
>> http://lucene.472066.n3.nabble.com/how-much-blocksize-is-set-in-lucene-tp4031796.html
>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>
>>
>
>




Re: how much blocksize is set in lucene.

2013-01-09 Thread hujing
The index library must be stored on a hard disk.  When a single hard disk cannot 
hold a large index, we will use a disk array, and the disk array must be 
configured with a stripe size.  So I want to know: when the index is stored on a 
disk array, which stripe size should be set?  And when the index is stored in the 
file system, which block size should be set?
Sent from Huawei Mobile

Ian Lea wrote:

>What do you mean by lucene blocksize?  What version of lucene are you using?
>
>A good general principle is to start with the defaults and only worry
>if there is a problem.
>
>
>--
>Ian.
>
>
>On Wed, Jan 9, 2013 at 8:51 AM, seacathello  wrote:
>> now I am indexing a very large number of email files, about 50m, and each email file is about
>> 4-50k.
>> the index lib size is about 1TB, segment size is only.
>>
>> In this index lib, which blocksize should I choose?
>> 4k or 512k, which choice is better?
>> Thanks very much.
>>
>>
>>
>> --
>> View this message in context: 
>> http://lucene.472066.n3.nabble.com/how-much-blocksize-is-set-in-lucene-tp4031796.html
>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>
>>
>
>


Re: Differences in MLT Query Terms Question

2013-01-09 Thread Peter Lavin


Hi Jack, thanks for your ideas, I've added some comments to your 
questions, maybe you can throw some more light on this...




On 01/08/2013 11:34 PM, Jack Krupansky wrote:

> The term "arv" is on the first list, but not the second. Maybe its
> document frequency fell below the setting for minimum document frequency
> on the second run.
>
> Or, maybe the minimum word length was set to 4 or more on the second run.

The same parameters are used (same code) for each run; all that changes 
is that I change the path to a different folder, one containing 16, the 
other 4 files. The smaller folder was made by simply deleting the 
unwanted 12 files.

> Are you using MoreLikeThisQuery or directly using MoreLikeThis?

I'm using MoreLikeThis directly, does this make a difference?

> Or, possibly "arv" appears later in a document on the second run, after
> the number of tokens specified by maxNumTokensParsed.

The files used in the second run are identical, and each file is read 
from disk and indexed individually (as is common I'm sure). I looked at 
this, and when all 16 files are indexed together, the results are 
repeatedly identical, and the same for the 4-file runs. I.e. the 
outcomes for both 16 and 4 files can be reproduced.


The reason for my question (and for doing these runs) is that I'm using 
Lucene in an application where I want to use the similarity measurements 
between documents as a metric in another area. If the similarity score 
changes when the size of the index changes, I need to understand.


thanks again,
Peter




-- Jack Krupansky

-Original Message- From: Peter Lavin
Sent: Tuesday, January 08, 2013 1:46 PM
To: java-user@lucene.apache.org
Subject: Differences in MLT Query Terms Question


Dear Users,

I am running some simple experiments with Lucene and am seeing something
I don't understand.

I have 16 text files on 4 different topics, ranging in size from 50-900
KB. When I index all 16 of these and run an MLT query based on one of
the indexed documents, I get an expected result (i.e. similar topics are
found).

When I reduce the number of text files to 4 and index them (having taken
care to overwrite the previous index files), and then run the same MLT
query (based on the same document from the index), I get slightly
different scores. I'm assuming this is because the IDF is now different
because there are fewer documents.

For each run, I have set the max number of terms as...
mlt.setMaxQueryTerms(100)

However, when I compare the terms which get used for the MLT query on
the 16 document index and the 4 document index, they are slightly
different. I've printed, parsed and sorted them into two columns of a
CSV file. I've pasted a small part of it at the end of this email.

My Question(s)...
1) Can anybody explain why the set of terms used for the MLT query is
different when a file from an index of 16 documents versus 4 documents
is used?

2) Am I right in assuming that the reason for the slightly different scores
is the IDF, or could it be this slight difference in the sets of terms
used (or possibly both)?
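
To illustrate both questions with numbers, the sketch below assumes the classic TF-IDF formula idf = 1 + ln(numDocs / (docFreq + 1)), which is what Lucene's DefaultSimilarity used in the 3.x/4.0 line as far as I know. The class name IdfSketch and the document frequencies are invented for the example; they are not taken from the 16-file or 4-file index.

```java
class IdfSketch {
    /** Assumed classic formula: idf = 1 + ln(numDocs / (docFreq + 1)). */
    static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log(numDocs / (double) (docFreq + 1));
    }

    public static void main(String[] args) {
        // Hypothetical term A: in 2 of 16 documents, and still in 2 of the 4 kept.
        // Hypothetical term B: in 4 of 16 documents, but in only 1 of the 4 kept.
        System.out.printf("term A: idf(16 docs)=%.3f  idf(4 docs)=%.3f%n",
                idf(2, 16), idf(2, 4));
        System.out.printf("term B: idf(16 docs)=%.3f  idf(4 docs)=%.3f%n",
                idf(4, 16), idf(1, 4));
    }
}
```

Under these made-up frequencies, term A outranks term B on idf in the 16-document index and falls behind it in the 4-document index. Because MoreLikeThis keeps only the highest-scoring terms up to the setMaxQueryTerms limit, a reordering like this could change which terms make the cut, and both the changed idf values and the changed term set would then feed into the final scores.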

regards,
Peter




--
with best regards,
Peter Lavin,
PhD Candidate,
CAG - Computer Architecture & Grid Research Group,
Lloyd Institute, 005,
Trinity College Dublin, Ireland.
+353 1 8961536




Re: Help needed Regarding classification of Text Data using Lucene..

2013-01-09 Thread Tommaso Teofili
Hi,

you can have a look at the (early stage) Lucene classification module on
trunk [1], see also a brief introduction given at last ApacheCon EU [2].

Hope this helps,
Tommaso

[1] :
http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/classification/
[2] :
http://www.slideshare.net/teofili/text-categorization-with-lucene-and-solr


2013/1/9 VIGNESH S 

> Hi,
>
> can anyone suggest how I can use Lucene for text classification?
>
> --
> Thanks and Regards
> Vignesh Srinivasan
>
>
>


Re: how much blocksize is set in lucene.

2013-01-09 Thread Ian Lea
What do you mean by lucene blocksize?  What version of lucene are you using?

A good general principle is to start with the defaults and only worry
if there is a problem.


--
Ian.


On Wed, Jan 9, 2013 at 8:51 AM, seacathello  wrote:
> now I am indexing a very large number of email files, about 50m, and each email file is about
> 4-50k.
> the index lib size is about 1TB, segment size is only.
>
> In this index lib, which blocksize should I choose?
> 4k or 512k, which choice is better?
> Thanks very much.
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/how-much-blocksize-is-set-in-lucene-tp4031796.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>




Re: Help needed Regarding classification of Text Data using Lucene..

2013-01-09 Thread Shashi Kant
http://www.slideshare.net/teofili/text-categorization-with-lucene-and-solr


On Wed, Jan 9, 2013 at 5:46 AM, VIGNESH S  wrote:
> Hi,
>
> can anyone suggest how I can use Lucene for text classification?
>
> --
> Thanks and Regards
> Vignesh Srinivasan
>
>




Help needed Regarding classification of Text Data using Lucene..

2013-01-09 Thread VIGNESH S
Hi,

can anyone suggest how I can use Lucene for text classification?

-- 
Thanks and Regards
Vignesh Srinivasan




RE: Cannot instantiate SPI class

2013-01-09 Thread Igal Sapir
Thanks, I'll do that.

p.s. -- that was http://getrailo.org -- 'auto-correct' messed it up ;-)

--
typos, misspels, and other weird words brought to you courtesy of my mobile
device.
On Jan 9, 2013 2:08 AM, "Nick Burch"  wrote:

> On Wed, 9 Jan 2013, Igal Sapir wrote:
>
>> The syntax is CFML / CFScript (ColdFusion Script).  Railo is an open
>> source, high performance, ColdFusion server.  http://getrailo.arg/
>>
>> I will re-download the Lucene jars and try again.  I'll let you know what
>> I find.
>>
>
> It may be worth double-checking that you don't have any older lucene jars
> kicking around your classpath confusing things. I've not used CF in a
> while, but when I last did we'd often get caught out by an old version of a
> jar we were introducing already being shipped with CF. You can fairly
> easily (via the classloader + getresource) work out which jar a given class
> file is coming from, you should use that to verify it's the jar you were
> expecting!
>
> Nick
>
>
>


Re: Is StandardAnalyzer good enough for multi languages...

2013-01-09 Thread Trejkaz
On Wed, Jan 9, 2013 at 5:25 PM, Steve Rowe  wrote:
> Dude.  Go look.  It allows for per-script specialization, with (non-UAX#29) 
> specializations by default for Thai, Lao, Myanmar and Hebrew.  See 
> DefaultICUTokenizerConfig.  It's filled with exactly the opposite of what you 
> were describing.

I guess that's a reasonable start. Still has no specialisation for
straight Roman script, but I guess it could always be added.

TX




RE: Cannot instantiate SPI class

2013-01-09 Thread Nick Burch

On Wed, 9 Jan 2013, Igal Sapir wrote:

The syntax is CFML / CFScript (ColdFusion Script).  Railo is an open
source, high performance, ColdFusion server.  http://getrailo.arg/

I will re-download the Lucene jars and try again.  I'll let you know 
what I find.


It may be worth double-checking that you don't have any older lucene jars 
kicking around your classpath confusing things. I've not used CF in a 
while, but when I last did we'd often get caught out by an old version of 
a jar we were introducing already being shipped with CF. You can fairly 
easily (via the classloader + getresource) work out which jar a given 
class file is coming from, you should use that to verify it's the jar you 
were expecting!
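
The classloader + getResource check described above can be sketched in plain Java roughly as follows. WhichJar and locate are names made up for this illustration; the only library calls used are Class.getClassLoader(), ClassLoader.getResource() and ClassLoader.getSystemResource(). In a real check you would pass a Lucene class such as IndexWriterConfig instead of the placeholders used in main.

```java
import java.net.URL;

class WhichJar {
    /** Returns the URL (jar or directory) a class was loaded from, or "unknown". */
    static String locate(Class<?> cls) {
        // Resource name of the class file, e.g. "java/lang/String.class".
        String resource = cls.getName().replace('.', '/') + ".class";
        ClassLoader loader = cls.getClassLoader();
        // Classes from the bootstrap loader report a null class loader.
        URL url = (loader != null)
                ? loader.getResource(resource)
                : ClassLoader.getSystemResource(resource);
        return url == null ? "unknown" : url.toString();
    }

    public static void main(String[] args) {
        // Placeholder classes; substitute the class whose origin is in doubt.
        System.out.println(locate(String.class));
        System.out.println(locate(WhichJar.class));
    }
}
```

If the printed URL points at a jar bundled with the server rather than the one you added, an older copy is shadowing the new jar on the classpath.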


Nick




RE: Cannot instantiate SPI class

2013-01-09 Thread Igal Sapir
The syntax is CFML / CFScript (ColdFusion Script).  Railo is an open
source, high performance, ColdFusion server.  http://getrailo.arg/

I will re-download the Lucene jars and try again.  I'll let you know what I
find.

Thanks,

Igal

--
typos, misspels, and other weird words brought to you courtesy of my mobile
device.
On Jan 9, 2013 12:28 AM, "Uwe Schindler"  wrote:

> >  indexWriterConfig = createObject( "java",
> > "org.apache.lucene.index.IndexWriterConfig" ).init( Lucene.Version,
> > this.indexAnalyzer );
>
> What syntax is that, I have never seen that before!
>
> > where Lucene.Version is an object of Lucene.VERSION_40 and
> > this.indexAnalyzer is an Analyzer object that I create before.  one
> possible
> > problem is that Railo ships with a very old version of Lucene, so I had
> to
> > disable some of the jars that ship with Railo but I believe that I
> removed all of
> > them.  I also had to disable a jar of apache-commons-codec that ships
> with
> > Railo to avoid version conflicts.
> > stacktrace below:
> >
>
> Which still does not contain the root cause (this comes *after* the stack
> trace), printed like:
>
> Cannot instantiate SPI class:
> org.apache.lucene.codecs.appending.AppendingCodec at
> org.apache.lucene.util.NamedSPILoader.lookup(NamedSPILoader.java:104):104
> at
> ...
> Caused by: other Exception stack trace
>
> I need everything behind Caused By.
>
> I must tell you: The line numbers in this stack trace don't correspond to
> the ones officially released Lucene 4.0, so it looks like you have a
> version mismatch, maybe involving alpha/beta/snapshot versions of Lucene,
> and one of these old versions is causing the bug, that was already
> mentioned by Steven. Line 104 in NamedSPILoader of Lucene 4.0 has different
> code, so I think your lucene-core.jar file is outdated.
>
> 2nd: If you don't have lucene-codecs.jar, then this error *cannot* happen!
> If it happens, you have some lucene-codecs.jar file with a different Lucene
> version on your classpath. What's the error *without* lucene codecs.jar?
> (you don’t need that file!)
>
> Uwe
>
> > Cannot instantiate SPI class: org.apache.lucene.codecs.appending.AppendingCodec
> > at org.apache.lucene.util.NamedSPILoader.lookup(NamedSPILoader.java:104):104
> > at org.apache.lucene.codecs.PostingsFormat.forName(PostingsFormat.java:100):100
> > at org.apache.lucene.codecs.lucene40.Lucene40Codec.(Lucene40Codec.java:114):114
> > at org.apache.lucene.codecs.appending.AppendingCodec.(AppendingCodec.java:34):34
> > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method):-2
> > at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source):-1
> > at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source):-1
> > at java.lang.reflect.Constructor.newInstance(Unknown Source):-1
> > at java.lang.Class.newInstance0(Unknown Source):-1
> > at java.lang.Class.newInstance(Unknown Source):-1
> > at org.apache.lucene.util.NamedSPILoader.reload(NamedSPILoader.java:62):62
> > at org.apache.lucene.util.NamedSPILoader.(NamedSPILoader.java:42):42
> > at org.apache.lucene.util.NamedSPILoader.(NamedSPILoader.java:37):37
> > at org.apache.lucene.codecs.Codec.(Codec.java:41):41
> > at org.apache.lucene.index.LiveIndexWriterConfig.(LiveIndexWriterConfig.java:118):118
> > at org.apache.lucene.index.IndexWriterConfig.(IndexWriterConfig.java:145):145
> > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method):-2
> > at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source):-1
> > at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source):-1
> > at java.lang.reflect.Constructor.newInstance(Unknown Source):-1
> > at railo.runtime.reflection.pairs.ConstructorInstance.invoke(ConstructorInstance.java:34):34
> > at railo.runtime.reflection.Reflector.callConstructor(Reflector.java:653):653
> > at railo.runtime.java.JavaObject.init(JavaObject.java:311):311
> > at railo.runtime.java.JavaObject.call(JavaObject.java:235):235
> > at railo.runtime.java.JavaObject.call(JavaObject.java:272):272
> > at railo.runtime.util.VariableUtilImpl.callFunctionWithoutNamedValues(VariableUtilImpl.java:723):723
> > at railo.runtime.PageContextImpl.getFunction(PageContextImpl.java:1506):1506
> > at s21.search.lucene4search_cfc$cf._1(E:\Websites\_S21WAF\CFC\s21\search\Lucene4Search.cfc:92):92
> > at s21.search.lucene4search_cfc$cf.udfCall(E:\Websites\_S21WAF\CFC\s21\search\Lucene4Search.cfc):-1
> > at railo.runtime.type.UDFImpl.implementation(UDFImpl.java:103):103
> > at railo.runtime.type.UDFImpl._call(UDFImpl.java:371):371
> > at railo.runtime.type.UDFImpl.call(UDFImpl.java:284):284
> > at railo.runtime.type.scope.UndefinedImpl.call(UndefinedImpl.java:775):775
> > at railo.runtime.util.VariableUtilImpl.callFunctionWithoutNamedValues(VariableUtilImpl.java:723):723
> > at railo.r

Re: how much blocksize is set in lucene.

2013-01-09 Thread seacathello
The index size is about 1 TB, and it has only one segment.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-much-blocksize-is-set-in-lucene-tp4031796p4031797.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.




how much blocksize is set in lucene.

2013-01-09 Thread seacathello
I am now indexing a very large number of email files, about 50m, and each
email file is about 4-50k in size.
The index size is about 1 TB, and it has only one segment.

For this index, which block size should I choose?
4k or 512k: which choice is better?
Thanks very much.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-much-blocksize-is-set-in-lucene-tp4031796.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.




RE: Cannot instantiate SPI class

2013-01-09 Thread Uwe Schindler
>  indexWriterConfig = createObject( "java",
> "org.apache.lucene.index.IndexWriterConfig" ).init( Lucene.Version,
> this.indexAnalyzer );

What syntax is that? I have never seen it before!

> where Lucene.Version is an object of Lucene.VERSION_40 and
> this.indexAnalyzer is an Analyzer object that I create before.  one possible
> problem is that Railo ships with a very old version of Lucene, so I had to
> disable some of the jars that ship with Railo but I believe that I removed
> all of them.  I also had to disable a jar of apache-commons-codec that
> ships with Railo to avoid version conflicts.
> stacktrace below:
> 

Which still does not contain the root cause (this comes *after* the stack 
trace), printed like:

Cannot instantiate SPI class: 
org.apache.lucene.codecs.appending.AppendingCodec at 
org.apache.lucene.util.NamedSPILoader.lookup(NamedSPILoader.java:104):104 at
...
Caused by: other Exception stack trace

I need everything after "Caused by:".
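
If the calling environment only prints the outermost exception, the nested
causes can be collected by walking Throwable.getCause(). The following is
only a rough sketch: the class name CauseChain and the exceptions built in
main() are hypothetical and are not taken from Lucene or Railo.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Lucene or Railo.
final class CauseChain {

    /** Returns one line per throwable in the chain, outermost first. */
    static List<String> describe(Throwable t) {
        List<String> lines = new ArrayList<String>();
        // An identity-based set guards against a cyclic cause chain.
        Set<Throwable> seen =
            Collections.newSetFromMap(new IdentityHashMap<Throwable, Boolean>());
        for (Throwable cur = t; cur != null && seen.add(cur); cur = cur.getCause()) {
            String prefix = lines.isEmpty() ? "" : "Caused by: ";
            lines.add(prefix + cur.getClass().getName() + ": " + cur.getMessage());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Made-up exceptions, only to show the shape of the output.
        Throwable root = new ClassNotFoundException("hypothetical.MissingClass");
        Throwable outer = new RuntimeException("Cannot instantiate SPI class", root);
        for (String line : describe(outer)) {
            System.out.println(line);
        }
    }
}
```

In a real session one would catch the exception thrown when the
IndexWriterConfig is created and pass it to describe(), or simply call
printStackTrace() on it, which also prints the "Caused by:" sections.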

I must tell you: The line numbers in this stack trace don't correspond to the
ones in the officially released Lucene 4.0, so it looks like you have a
version mismatch, maybe involving alpha/beta/snapshot versions of Lucene, and
one of these old versions is causing the bug that Steven already mentioned.
Line 104 in NamedSPILoader of Lucene 4.0 has different code, so I think your
lucene-core.jar file is outdated.
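
One way to check which jar a class is really loaded from is to ask the JVM
for the class's code source. This is a minimal sketch; the class name
JarLocator is hypothetical, and the implementation version is only printed
if the jar's manifest declares one.

```java
import java.security.CodeSource;

// Hypothetical helper, not part of Lucene or Railo.
final class JarLocator {

    /** Describes where a class was loaded from and, if known, its version. */
    static String describe(Class<?> c) {
        CodeSource src = c.getProtectionDomain().getCodeSource();
        // Classes loaded by the JDK's bootstrap loader usually have no code source.
        String location = (src == null || src.getLocation() == null)
            ? "(no code source, e.g. a JDK class)"
            : src.getLocation().toString();
        Package p = c.getPackage();
        // Null when the manifest has no Implementation-Version entry.
        String version = (p == null) ? null : p.getImplementationVersion();
        return c.getName() + " <- " + location
            + " (implementation version: " + version + ")";
    }

    public static void main(String[] args) throws Exception {
        // Pass a class name such as org.apache.lucene.util.NamedSPILoader;
        // without an argument a JDK class is used so the sketch runs anywhere.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(describe(Class.forName(name)));
    }
}
```

Run inside the same class loader as the failing code (here, inside Railo),
the printed location should show whether an old jar is still being picked up.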

2nd: If you don't have lucene-codecs.jar, then this error *cannot* happen! If
it happens, you have some lucene-codecs.jar file with a different Lucene
version on your classpath. What's the error *without* lucene-codecs.jar? (You
don’t need that file!)
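
To see whether more than one jar on the classpath registers codecs, one can
list every copy of the SPI registration file that the class loader finds.
This is a sketch under assumptions: the class name SpiResourceScan is
hypothetical, and the resource name follows the usual META-INF/services
convention for org.apache.lucene.codecs.Codec.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Lucene or Railo.
final class SpiResourceScan {

    /** Returns every location on the loader's search path that holds the resource. */
    static List<URL> find(ClassLoader loader, String resourceName) throws IOException {
        return Collections.list(loader.getResources(resourceName));
    }

    public static void main(String[] args) throws IOException {
        // Assumed name of the registration file read by the SPI lookup.
        String name = "META-INF/services/org.apache.lucene.codecs.Codec";
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        for (URL url : find(loader, name)) {
            // Each URL names the jar (or directory) that contributes codecs.
            System.out.println(url);
        }
    }
}
```

If this prints a lucene-codecs jar, or two different lucene-core jars, that
would be consistent with the version mismatch described above.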

Uwe

> Cannot instantiate SPI class:
> org.apache.lucene.codecs.appending.AppendingCodec at
> org.apache.lucene.util.NamedSPILoader.lookup(NamedSPILoader.java:104):
> 104 at
> org.apache.lucene.codecs.PostingsFormat.forName(PostingsFormat.java:10
> 0):100
> at
> org.apache.lucene.codecs.lucene40.Lucene40Codec.(Lucene40Codec.j
> ava:114):114
> at
> org.apache.lucene.codecs.appending.AppendingCodec.(AppendingCo
> dec.java:34):34
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method):-2 at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source):-
> 1 at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown
> Source):-1 at java.lang.reflect.Constructor.newInstance(Unknown
> Source):-1 at java.lang.Class.newInstance0(Unknown Source):-1 at
> java.lang.Class.newInstance(Unknown Source):-1 at
> org.apache.lucene.util.NamedSPILoader.reload(NamedSPILoader.java:62):62
> at
> org.apache.lucene.util.NamedSPILoader.(NamedSPILoader.java:42):42
> at
> org.apache.lucene.util.NamedSPILoader.(NamedSPILoader.java:37):37
> at org.apache.lucene.codecs.Codec.(Codec.java:41):41 at
> org.apache.lucene.index.LiveIndexWriterConfig.(LiveIndexWriterConfi
> g.java:118):118
> at
> org.apache.lucene.index.IndexWriterConfig.(IndexWriterConfig.java:1
> 45):145
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method):-2 at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source):-
> 1 at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown
> Source):-1 at java.lang.reflect.Constructor.newInstance(Unknown
> Source):-1 at
> railo.runtime.reflection.pairs.ConstructorInstance.invoke(ConstructorInstanc
> e.java:34):34
> at
> railo.runtime.reflection.Reflector.callConstructor(Reflector.java:653):653
> at railo.runtime.java.JavaObject.init(JavaObject.java:311):311 at
> railo.runtime.java.JavaObject.call(JavaObject.java:235):235 at
> railo.runtime.java.JavaObject.call(JavaObject.java:272):272 at
> railo.runtime.util.VariableUtilImpl.callFunctionWithoutNamedValues(Variabl
> eUtilImpl.java:723):723
> at
> railo.runtime.PageContextImpl.getFunction(PageContextImpl.java:1506):150
> 6 at
> s21.search.lucene4search_cfc$cf._1(E:\Websites\_S21WAF\CFC\s21\search\
> Lucene4Search.cfc:92):92
> at
> s21.search.lucene4search_cfc$cf.udfCall(E:\Websites\_S21WAF\CFC\s21\sea
> rch\Lucene4Search.cfc):-1
> at railo.runtime.type.UDFImpl.implementation(UDFImpl.java:103):103 at
> railo.runtime.type.UDFImpl._call(UDFImpl.java:371):371 at
> railo.runtime.type.UDFImpl.call(UDFImpl.java:284):284 at
> railo.runtime.type.scope.UndefinedImpl.call(UndefinedImpl.java:775):775
> at
> railo.runtime.util.VariableUtilImpl.callFunctionWithoutNamedValues(Variabl
> eUtilImpl.java:723):723
> at
> railo.runtime.PageContextImpl.getFunction(PageContextImpl.java:1506):150
> 6 at
> s21.search.lucene4search_cfc$cf._1(E:\Websites\_S21WAF\CFC\s21\search\
> Lucene4Search.cfc:142):142
> at
> s21.search.lucene4search_cfc$cf.udfCall(E:\Websites\_S21WAF\CFC\s21\sea
> rch\Lucene4Search.cfc):-1
> at railo.runtime.type.UDFImpl.implementation(UDFImpl.java:103):103 at
> railo.runtime.type.UDFImpl._call(UDFImpl.java:371):371 at
> railo.runtime.type.UDFImpl.call(UDFImpl.java:284):284 at
> railo.runtime.ComponentImpl._call(ComponentImpl.java:572):572 at
> railo.runtime.ComponentImpl.

Re: Cannot instantiate SPI class

2013-01-09 Thread Igal @ getRailo.org

hi Uwe,

thank you for answering.  I believe that this is the complete stack 
trace, no (pasted again below)?


I'm actually not trying to do anything fancy with codecs etc.  I'm 
trying to do something very basic:  create an object of type 
IndexWriterConfig.  the CFML (Railo) code is as follows:


indexWriterConfig = createObject( "java", 
"org.apache.lucene.index.IndexWriterConfig" ).init( Lucene.Version, 
this.indexAnalyzer );


where Lucene.Version is an object of Lucene.VERSION_40 and 
this.indexAnalyzer is an Analyzer object that I create before.  one 
possible problem is that Railo ships with a very old version of Lucene, 
so I had to disable some of the jars that ship with Railo but I believe 
that I removed all of them.  I also had to disable a jar of 
apache-commons-codec that ships with Railo to avoid version conflicts.


stacktrace below:

Cannot instantiate SPI class: 
org.apache.lucene.codecs.appending.AppendingCodec at 
org.apache.lucene.util.NamedSPILoader.lookup(NamedSPILoader.java:104):104 at 
org.apache.lucene.codecs.PostingsFormat.forName(PostingsFormat.java:100):100 
at 
org.apache.lucene.codecs.lucene40.Lucene40Codec.(Lucene40Codec.java:114):114 
at 
org.apache.lucene.codecs.appending.AppendingCodec.(AppendingCodec.java:34):34 
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
Method):-2 at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source):-1 
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown 
Source):-1 at java.lang.reflect.Constructor.newInstance(Unknown 
Source):-1 at java.lang.Class.newInstance0(Unknown Source):-1 at 
java.lang.Class.newInstance(Unknown Source):-1 at 
org.apache.lucene.util.NamedSPILoader.reload(NamedSPILoader.java:62):62 
at 
org.apache.lucene.util.NamedSPILoader.(NamedSPILoader.java:42):42 
at 
org.apache.lucene.util.NamedSPILoader.(NamedSPILoader.java:37):37 
at org.apache.lucene.codecs.Codec.(Codec.java:41):41 at 
org.apache.lucene.index.LiveIndexWriterConfig.(LiveIndexWriterConfig.java:118):118 
at 
org.apache.lucene.index.IndexWriterConfig.(IndexWriterConfig.java:145):145 
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
Method):-2 at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source):-1 
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown 
Source):-1 at java.lang.reflect.Constructor.newInstance(Unknown 
Source):-1 at 
railo.runtime.reflection.pairs.ConstructorInstance.invoke(ConstructorInstance.java:34):34 
at 
railo.runtime.reflection.Reflector.callConstructor(Reflector.java:653):653 
at railo.runtime.java.JavaObject.init(JavaObject.java:311):311 at 
railo.runtime.java.JavaObject.call(JavaObject.java:235):235 at 
railo.runtime.java.JavaObject.call(JavaObject.java:272):272 at 
railo.runtime.util.VariableUtilImpl.callFunctionWithoutNamedValues(VariableUtilImpl.java:723):723 
at 
railo.runtime.PageContextImpl.getFunction(PageContextImpl.java:1506):1506 at 
s21.search.lucene4search_cfc$cf._1(E:\Websites\_S21WAF\CFC\s21\search\Lucene4Search.cfc:92):92 
at 
s21.search.lucene4search_cfc$cf.udfCall(E:\Websites\_S21WAF\CFC\s21\search\Lucene4Search.cfc):-1 
at railo.runtime.type.UDFImpl.implementation(UDFImpl.java:103):103 at 
railo.runtime.type.UDFImpl._call(UDFImpl.java:371):371 at 
railo.runtime.type.UDFImpl.call(UDFImpl.java:284):284 at 
railo.runtime.type.scope.UndefinedImpl.call(UndefinedImpl.java:775):775 
at 
railo.runtime.util.VariableUtilImpl.callFunctionWithoutNamedValues(VariableUtilImpl.java:723):723 
at 
railo.runtime.PageContextImpl.getFunction(PageContextImpl.java:1506):1506 at 
s21.search.lucene4search_cfc$cf._1(E:\Websites\_S21WAF\CFC\s21\search\Lucene4Search.cfc:142):142 
at 
s21.search.lucene4search_cfc$cf.udfCall(E:\Websites\_S21WAF\CFC\s21\search\Lucene4Search.cfc):-1 
at railo.runtime.type.UDFImpl.implementation(UDFImpl.java:103):103 at 
railo.runtime.type.UDFImpl._call(UDFImpl.java:371):371 at 
railo.runtime.type.UDFImpl.call(UDFImpl.java:284):284 at 
railo.runtime.ComponentImpl._call(ComponentImpl.java:572):572 at 
railo.runtime.ComponentImpl._call(ComponentImpl.java:490):490 at 
railo.runtime.ComponentImpl.call(ComponentImpl.java:1781):1781 at 
railo.runtime.util.VariableUtilImpl.callFunctionWithoutNamedValues(VariableUtilImpl.java:723):723 
at 
railo.runtime.PageContextImpl.getFunction(PageContextImpl.java:1506):1506 at 
_test.lucene4_cfm$cf.call(E:\Websites\21solutions\_test\lucene4.cfm:14):14 
at railo.runtime.PageContextImpl.doInclude(PageContextImpl.java:772):772 
at railo.runtime.PageContextImpl.doInclude(PageContextImpl.java:753):753 
at 
railo.runtime.listener.ModernAppListener._onRequest(ModernAppListener.java:183):183 
at 
railo.runtime.listener.MixedAppListener.onRequest(MixedAppListener.java:18):18 
at railo.runtime.PageContextImpl.execute(PageContextImpl.java:2255):2255 
at railo.runtime.PageContextImpl.execute(PageContextImpl.java:): 
at 
railo.runtime.engine.CFMLEngineImpl.serviceCFML(CFMLEngineImpl.java