Re: Searching index problems with tomcat

N. Hira Wed, 27 May 2009 07:41:01 -0700

Not sure if this applies here, but that tends to happen when the analyzer you use for indexing is different from the one used in Luke, or when you're running into character set issues. Are you using the StandardAnalyzer in both cases?
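To make the analyzer point concrete, here is a toy sketch in plain Java (Java 8+). It is not Lucene code: `lowercasingAnalyze` and `whitespaceAnalyze` are hypothetical stand-ins for two different analyzers, and the "index" is just a map from term to document ids. It only illustrates why a query analyzed differently from the indexed text can miss documents that are really there.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class AnalyzerMismatchDemo {
    // Hypothetical analyzer 1: lowercases and splits on anything that is not a letter or digit.
    static List<String> lowercasingAnalyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!t.isEmpty()) terms.add(t);
        }
        return terms;
    }

    // Hypothetical analyzer 2: splits on whitespace only and keeps the original case.
    static List<String> whitespaceAnalyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) terms.add(t);
        }
        return terms;
    }

    // Builds a term -> document ids map, always using the lowercasing analyzer.
    static Map<String, Set<Integer>> index(List<String> docs) {
        Map<String, Set<Integer>> postings = new HashMap<>();
        for (int id = 0; id < docs.size(); id++) {
            for (String term : lowercasingAnalyze(docs.get(id))) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
            }
        }
        return postings;
    }

    // Returns the ids of all documents containing at least one of the query terms.
    static Set<Integer> search(Map<String, Set<Integer>> postings, List<String> queryTerms) {
        Set<Integer> hits = new TreeSet<>();
        for (String term : queryTerms) {
            hits.addAll(postings.getOrDefault(term, Collections.<Integer>emptySet()));
        }
        return hits;
    }

    public static void main(String[] args) {
        Map<String, Set<Integer>> postings =
                index(Arrays.asList("Lucene Index", "Tomcat webapp", "lucene search"));
        // Same analyzer at index and query time: the capitalized query still matches.
        System.out.println("same analyzer:      " + search(postings, lowercasingAnalyze("Lucene")));
        // Different analyzer at query time: "Lucene" was never stored in that form.
        System.out.println("different analyzer: " + search(postings, whitespaceAnalyze("Lucene")));
    }
}
```

With the same analyzer on both sides the query finds both documents that mention Lucene; with the mismatched one it finds none, even though nothing is wrong with the index itself.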

Also, could you post an example of the query you are trying? There are some very smart people who check this list, and they may be able to help you if they had a "sample" of your index. Could you create a 10-document index and make it available for download so people can look at it for you?


-h

On 27-May-2009, at 2:02 AM, Marco Lazzara wrote:

*I see that you have reported the creation of 3 files, but does Luke recognize those files as an index and do you see the Documents you expect to see in this index?*

Luke recognizes those files and I see those documents in this index, but I observed that when I run the query Luke finds (for example) only 3 files of 5.
Any ideas???
Marco Lazzara


2009/5/27 N Hira <[email protected]>

Sorry for the confusion -- I checked the archive and I could not find a message where you have been able to open the index using Luke.

Have you been able to do that? I see that you have reported the creation of 3 files, but does Luke recognize those files as an index and do you see the Documents you expect to see in this index?

This is the official site for Luke:
http://www.getopt.org/luke/

-h


----- Original Message ----
From: Marco Lazzara <[email protected]>
To: [email protected]
Sent: Tuesday, May 26, 2009 4:59:14 PM
Subject: Re: Searching index problems with tomcat

*Does the part of the web app that is responsible for searching have
permissions to read "/home/marco/testIndex"?*

Yes, it does. It can read everywhere.

*Could you add some code to your searching app to print out the directory listing to confirm?*

I've already posted them. See May 19.

*Also, I may have missed this posting, but could you provide the answer from Step 3. of mhall's suggestion on 22-May, i.e., did you find the data that you expected in your index using Luke?*

Yes. There are 3 files in the index. See May 24.

 -rw-r--r--  1 marco marco 4043 2009-05-24 12:00 _5.cfs
 -rw-r--r--  1 marco marco   58 2009-05-24 12:00 segments_c
 -rw-r--r--  1 marco marco   20 2009-05-24 12:00 segments.gen


2009/5/26 N Hira <[email protected]>

Marco,

Does the part of the web app that is responsible for searching have
permissions to read "/home/marco/testIndex"?
Could you add some code to your searching app to print out the directory listing to confirm?
Also, I may have missed this posting, but could you provide the answer from Step 3. of mhall's suggestion on 22-May, i.e., did you find the data that you expected in your index using Luke?
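A minimal sketch of what such a directory-listing check could look like, using only java.io.File (no Lucene calls). The class name `IndexDirReport` and the `describe` helper are made up for illustration; the default path is the one discussed in this thread. Called from inside the web app, it reports what the Tomcat process itself can see and read.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class IndexDirReport {
    // Describes the directory as seen by the current process: resolved path,
    // existence and read permission, then one line per file.
    static List<String> describe(File dir) {
        List<String> lines = new ArrayList<>();
        lines.add("path=" + dir.getAbsolutePath());
        lines.add("exists=" + dir.exists()
                + " isDirectory=" + dir.isDirectory()
                + " canRead=" + dir.canRead());
        File[] files = dir.listFiles(); // null if not a directory or not listable
        if (files == null) {
            lines.add("listing unavailable");
            return lines;
        }
        Arrays.sort(files); // stable, name-ordered output
        for (File f : files) {
            lines.add(f.getName() + " " + f.length() + " bytes canRead=" + f.canRead());
        }
        return lines;
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : "/home/marco/testIndex");
        for (String line : describe(dir)) {
            System.out.println(line);
        }
    }
}
```

Running the same code standalone and from the web app, and comparing the two outputs, should show whether both are looking at the same files.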

Good luck.

-h



----- Original Message ----
From: Marco Lazzara <[email protected]>
To: [email protected]
Sent: Tuesday, May 26, 2009 3:45:38 PM
Subject: Re: Searching index problems with tomcat
I tried different things. I tried to create the index without the web application, and I tried to create the index with a webapp; the index was created without any problem. But the search always returns no results.
For example, if the folder I'm searching in is empty, the webapp catches an exception: "no segments* file found in org.apache.lucene.store.ramdirect...@home/marco/testIndex...."
It means that Lucene tries to search in that index but it fails. Maybe the index is incorrect for a webapp???

MARCO LAZZARA


2009/5/26 Matthew Hall <[email protected]>

Right.. so perhaps I'm a bit confused here.

The webapp.. is consuming an index.. yes?

Or, are you trying to create an index via a webapp?

I was assuming that you had some sort of indexing software that you were using to first build your indexes, which the webapp then consumes.

Is that your intent?
Sorry I didn't get back to you before this, but it was a holiday over here.





Marco Lazzara wrote:

OK, I solved the problem I posted before. I ran the web app. It creates the index in folder /home/marco/testIndex with 3 files:

-rw-r--r--  1 marco marco 4043 2009-05-24 12:00 _5.cfs
-rw-r--r--  1 marco marco   58 2009-05-24 12:00 segments_c
-rw-r--r--  1 marco marco   20 2009-05-24 12:00 segments.gen

but when I run the query I obtain no results!!!!

Why are there only 3 files in my folder???

Marco Lazzara


2009/5/24 Marco Lazzara <[email protected]>

Hi. At step 2 I have only 3 files in the folder, but I think that is not a problem. I've tried to create the index in the web app and not only in the standalone application, but something fails. Tomcat reports this error:

 java.io.FileNotFoundException: no segments* file found in org.apache.lucene.store.ramdirect...@1c2ec05: files:
   at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:604)
   at org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:111)
   at org.apache.lucene.index.IndexReader.open(IndexReader.java:316)
   at org.apache.lucene.index.IndexReader.open(IndexReader.java:227)
   at org.apache.lucene.search.IndexSearcher.<init>(IndexSearcher.java:55)
   at org.utils.synonym.WordNetSynonymEngine.<init>(Unknown Source)
   at org.indexing.AlternativeRDFIndexing.<init>(Unknown Source)
   at org.gui.CreazioneIndici.run2(Unknown Source)
   at org.gui.Query.main(Unknown Source)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at com.sun.javaws.Launcher.executeApplication(Launcher.java:1321)
   at com.sun.javaws.Launcher.executeMainClass(Launcher.java:1267)
   at com.sun.javaws.Launcher.doLaunchApp(Launcher.java:1066)
   at com.sun.javaws.Launcher.run(Launcher.java:116)
   at java.lang.Thread.run(Thread.java:619)

This changes every time. One time it is: no segments* file found in org.apache.lucene.store.ramdirect...@*1c2ec05*
The second time it is: no segments* file found in org.apache.lucene.store.ramdirect...@*170b819*

On the standalone it works perfectly.

Marco Lazzara

2009/5/22 Matthew Hall <[email protected]>

humor me.

Open up your indexing software package.

Step 1: In all places where you reference your index, replace whatever the heck you have there with the following EXACT STRING:

/home/marco/testIndex

Do not leave off the leading slash.

After you have made these changes to the indexing software, recompile and create your indexes.

Step 2: After your indexing process completes do the following:

cd /home/marco/testIndex/index

You should see files in there, they will look something like this:

drwxrwxr-x   3 mhall    progs       4.0K May 18 11:19 ..
-rw-rw-r--   1 mhall    progs         80 May 21 16:47 _9j7.fnm
-rw-rw-r--   1 mhall    progs       4.1G May 21 16:50 _9j7.fdt
-rw-rw-r--   1 mhall    progs       434M May 21 16:50 _9j7.fdx
-rw-rw-r--   1 mhall    progs       280M May 21 16:52 _9j7.frq
-rw-rw-r--   1 mhall    progs       108M May 21 16:52 _9j7.prx
-rw-rw-r--   1 mhall    progs       329M May 21 16:52 _9j7.tis
-rw-rw-r--   1 mhall    progs       4.7M May 21 16:52 _9j7.tii
-rw-rw-r--   1 mhall    progs       108M May 21 16:52 _9j7.nrm
-rw-rw-r--   1 mhall    progs         47 May 21 16:52 segments_9je
-rw-rw-r--   1 mhall    progs         20 May 21 16:52 segments.gen

You have now confirmed that you are actually creating indexes, and the indexes you are creating exist at EXACTLY the place you have asked them to.

Step 3: Then.. go download luke, and open these indexes. Perform a query on them, confirm that the data you want is actually IN the indexes.

Step 4: Now, open up your standalone application, and replace whatever you are using to open the index with the SAME string I have listed above. Perform a search, verify that the indexes are there, and actually return values.

Step 5: Lastly, go into your web application and again replace the path with the one I have above, recompile, and perform a search. Verify that the indexes are actually THERE and searchable.
This.. damn well SHOULD work. If it doesn't, it is likely pointing to some other issue in what you have set up. For example, your tomcat instance could perhaps not have permission to read the lucene indexes directory. You should be able to tell this in the tomcat logs, BUT don't do this yet.

Carefully and fully follow the steps I have outlined for you, and then you will have chased down the full debugging path for this.

If this yields nothing for you, I'd be happy to take a closer look at your source code, but until then give this a shot.

Oh.. if it fails, please post back EXACTLY which steps in the above outlined process failed for you, as that will be really really helpful.
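The "are the index files really there" checks in Steps 2, 4 and 5 can be partly automated. The sketch below is plain Java with no Lucene calls; the class `SegmentsFileCheck` is a made-up name, and the only assumption taken from this thread is that an index directory contains files whose names start with "segments" (segments_c, segments.gen in the listings posted here).

```java
import java.io.File;
import java.io.FilenameFilter;
import java.util.Arrays;

final class SegmentsFileCheck {
    // Returns the sorted names of files starting with "segments" in indexDir,
    // or an empty array if the directory is missing or cannot be listed.
    static String[] segmentsFiles(File indexDir) {
        String[] names = indexDir.list(new FilenameFilter() {
            @Override
            public boolean accept(File dir, String name) {
                return name.startsWith("segments");
            }
        });
        if (names == null) {
            return new String[0];
        }
        Arrays.sort(names);
        return names;
    }

    public static void main(String[] args) {
        File indexDir = new File(args.length > 0 ? args[0] : "/home/marco/testIndex");
        String[] found = segmentsFiles(indexDir);
        // The working directory matters whenever the index path is relative.
        System.out.println("working directory: " + System.getProperty("user.dir"));
        System.out.println("index directory:   " + indexDir.getAbsolutePath());
        System.out.println("segments* files:   " + found.length);
        for (String name : found) {
            System.out.println("  " + name);
        }
    }
}
```

If this prints zero segments files from inside the web app but a non-zero count standalone, the two are almost certainly not opening the same directory.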

Matt



Marco Lazzara wrote:
I don't know how to solve the problem. I've tried all the rational things. Maybe the last thing is to try to index not with FSDirectory but with something else. I have to peruse the API documentation.
But..... WHAT IF IT IS A LUCENE BUG???

2009/5/22 Matthew Hall <[email protected]>

because that's the default index write behavior.

It will create any directory that you ask it to.

Matt


Marco Lazzara wrote:

OK. I understand what you really mean, but it doesn't work.
I noticed one thing. For example, when I try to open an index in the following location: "RDFIndexLucene/" but the folder doesn't exist, *Lucene creates an empty folder named "RDFIndexLucene"* in my home folder... WHY???

MARCO LAZZARA

2009/5/22 Matthew Hall <[email protected]>

For writing indexes?

Well I guess it depends on what you want.. but I personally use this:

(2.3.2 API)

File INDEX_DIR = new File("/data/searchtool/thisismyindexdirectory");
Analyzer analyzer = new WhateverConcreteAnalyzerYouWant();

IndexWriter writer = new IndexWriter(INDEX_DIR, analyzer, true);

Your best bet would be to peruse the API docs of whatever lucene version you are using.

However, I'm still pretty sure this ISN'T your actual issue here.

Looking at your "full path" example, those still seem to be by reference to me. Let me be more specific and tell you EXACTLY what I mean by that.

Let's say you are running your program in the following directory:

/home/test/app/

Trying to open an index like you have below will effectively be trying to open an index in the following location:

/home/test/app/home/marco/RdfIndexLucene

What I think you MEAN to be doing is:

/home/marco/RdfIndexLucene

That leading slash is VERY VERY important, as it's the entire difference between a relative path and an absolute one.
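The relative-versus-absolute point can be reproduced with java.io.File alone. This is a small sketch, assuming Unix-style paths; `PathResolutionDemo` and `resolve` are hypothetical names, and `resolve` simply mimics how a relative path ends up under whatever directory the process was started from.

```java
import java.io.File;

final class PathResolutionDemo {
    // Mimics path resolution: an absolute path is used as is,
    // a relative one is appended to the given working directory.
    static String resolve(String workingDir, String indexPath) {
        File f = new File(indexPath);
        if (f.isAbsolute()) {
            return f.getPath();
        }
        return new File(workingDir, indexPath).getPath();
    }

    public static void main(String[] args) {
        String workingDir = "/home/test/app";
        // No leading slash: the path lands underneath the working directory.
        System.out.println(resolve(workingDir, "home/marco/RdfIndexLucene"));
        // Leading slash: the path is taken literally.
        System.out.println(resolve(workingDir, "/home/marco/RdfIndexLucene"));
        // What the JVM really does for this process depends on user.dir,
        // which under Tomcat is usually not the directory you develop in.
        System.out.println("user.dir = " + System.getProperty("user.dir"));
        System.out.println(new File("RDFIndexLucene/").getAbsolutePath());
    }
}
```

Printing `new File(yourIndexPath).getAbsolutePath()` from both the standalone app and the web app is a quick way to see whether they resolve to the same place.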

Matt


Marco Lazzara wrote:

I was talking with my teacher.
Is it correct to use FSDirectory? Could you please look again at the code I've posted here??
Should I choose a different way of indexing??

Marco Lazzara




2009/5/22 Ian Lea <[email protected]>

OK.  I'd still like to see some evidence, but never mind.
Next suggestion is the old standby - cut the code down to the absolute minimum to demonstrate the problem and post it here. I know you've already posted some code, but maybe not all of it, and definitely not cut down to the absolute minimum.


--
Ian.


On Thu, May 21, 2009 at 10:48 PM, Marco Lazzara <[email protected]> wrote:

_I strongly suggest that you use a full path name and/or provide some evidence that your readers and writers are using the same directory and thus lucene index._

I tried a full path like home/marco/RdfIndexLucene, even media/disk/users/fratelli/RDFIndexLucene. But nothing changed.

MARCO LAZZARA

It's been a few days, and we haven't heard back about this issue. Can we assume that you fixed it via using fully qualified paths then?

Matt

Ian Lea wrote:
Marco

You haven't answered Matt's question about where you are running it from. Tomcat's default directory may well not be the same as yours.

I strongly suggest that you use a full path name and/or provide some evidence that your readers and writers are using the same directory and thus lucene index.


--
Ian.


On Wed, May 20, 2009 at 9:59 AM, Marco Lazzara
<[email protected]> wrote:

I've posted the indexing part, but I don't use this in my app. After I create the index, I put that in a folder like /home/marco/RDFIndexLucece and when I run the query I'm only searching (and not indexing).

String[] fieldsearch = new String[] {"name", "synonyms", "propIn"};

//RDFinder rdfind = new RDFinder("RDFIndexLucene/", fieldsearch);

TreeMap<Integer, ArrayList<String>> paths;
try {
    this.paths = this.rdfind.Search(text, "path");
} catch (ParseException e1) {
    e1.printStackTrace();
} catch (IOException e1) {
    e1.printStackTrace();
}

Marco Lazzara

Sorry, anyhow looking over this quickly here's a summarization of what I see:

You have documents in your index that look like the following:

name, which is indexed and stored
synonyms, which are indexed and stored
path, which is stored but not indexed
propin, which is stored and indexed
propinnum, which is stored but not indexed
and ... vicinity I guess, which is stored but not indexed

For an analyzer you are using Standard analyzer (which considering all the Italian? is an interesting choice.)

And you are opening your index using FSDirectory, in what appears to be a by reference fashion (You don't have a fully qualified path to where your index is, you are ASSUMING that it's in the same directory as this code, unless FSDirectory is not implemented as I think it is.)

Now can I see the consumer code?  Specifically the part where you are opening the index/constructing your queries?

I'm betting what's going on here is you are deploying this as a war file into tomcat, and it's just not really finding the index as a result of how the war file is getting deployed, but looking more closely at the source code should reveal if my suspicion is correct here.

Also runtime wise, when you run your standalone app, where specifically in your directory structure are you running it from? Cause if you are opening your index reader/searcher in the same way as you are creating your writer here, I'm pretty darn certain that will cause you problems.

 Matt

Marco Lazzara wrote:

_Could you further post your Analyzer Setup/Query Building code from BOTH apps._

There is only one code base. It is the same for web and for standalone.
And that is exactly the real problem!! The code is the same, the libraries are the same, the query, index etc. are the same.

This is the class that creates the index:


// Imports assumed for the Lucene 2.4-era API used below; WordNetSynonymEngine and
// AlternativeResourceAnalysis are the poster's own classes.
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedList;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class AlternativeRDFIndexing {
    private Analyzer analyzer;
    private Directory directory;
    private IndexWriter iwriter;
    private WordNetSynonymEngine wns;
    private AlternativeResourceAnalysis rs;
    public ArrayList<String> commonnodes;
    //private RDFinder rdfind = new RDFinder("RDFIndexLucene/", new String[] {"name"});

    //public boolean Exists(String node) throws ParseException, IOException {
    //    return rdfind.Exists(node);
    //}

    public AlternativeRDFIndexing(String inputfilename) throws IOException, ParseException {
        commonnodes = new ArrayList<String>();
        // An object is needed to analyze the RDF document
        rs = new AlternativeResourceAnalysis(inputfilename);

        ArrayList<String> nodelist = rs.getResources();
        int nodesize = nodelist.size();
        ArrayList<String> sourcelist = rs.getsource();
        int sourcesize = sourcelist.size();
        // Synonyms
        wns = new WordNetSynonymEngine("sinonimi/");
        // Create a standard analyzer
        analyzer = new StandardAnalyzer();

        // Store the index in RAM:
        //Directory directory = new RAMDirectory();

        // Store the index on disk (note: relative path)
        directory = FSDirectory.getDirectory("RDFIndexLucene/");
        // Create the index writer. It is given the analyzer, a boolean saying
        // whether to recreate the structure from scratch, and a maximum field
        // length (or unlimited: IndexWriter.MaxFieldLength.UNLIMITED)
        iwriter = new IndexWriter(directory, analyzer, true, new IndexWriter.MaxFieldLength(25000));

        // Build an index with exactly n documents: one document per node
        for (int i = 0; i < nodesize; i++) {
            Document doc = new Document();

            // A "name" field: the node name, stored (Store.YES) and
            // indexed with the analyzer (ANALYZED)
            String node = nodelist.get(i);
            //if (sourcelist.contains(node)) break;
            //if (rdfind.Exists(node)) commonnodes.add(node);
            Field field = new Field("name", node, Field.Store.YES, Field.Index.ANALYZED);
            doc.add(field);

            // Add the synonyms
            String[] nodesynonyms = wns.getSynonyms(node);
            for (int is = 0; is < nodesynonyms.length; is++) {
                field = new Field("synonyms", nodesynonyms[is], Field.Store.YES, Field.Index.ANALYZED);
                doc.add(field);
            }

            // One or more "path" fields: minimal paths from the sources to
            // the node, not indexed
            for (int j = 0; j < sourcesize; j++) {
                String source = sourcelist.get(j);
                ArrayList<LinkedList<String>> path = new ArrayList<LinkedList<String>>();
                try {
                    if ((source.equals(node)) || (sourcelist.contains(node))) {
                        field = new Field("path", "null", Field.Store.YES, Field.Index.NO);
                        doc.add(field);
                    } else {
                        path = rs.getPaths(source, node);
                        for (int ii = 0; ii < path.size(); ii++) {
                            String pp = rs.getPath(path.get(ii));
                            field = new Field("path", pp, Field.Store.YES, Field.Index.NO);
                            doc.add(field);
                        }
                    }
                } catch (IllegalArgumentException e) {
                    System.out.println("source: " + source + " node: " + node);
                    field = new Field("path", "null", Field.Store.YES, Field.Index.NO);
                    doc.add(field);
                }
            }

            // Incoming properties: indexed, together with their synonyms
            ArrayList<String> y = rs.getInProperty(node);
            if (y != null) {
                for (int j = 0; j < y.size(); j++) {
                    String propin = y.get(j);
                    field = new Field("propIn", propin, Field.Store.YES, Field.Index.ANALYZED);
                    doc.add(field);
                    String[] propinsynonyms = wns.getSynonyms(propin);
                    for (int is = 0; is < propinsynonyms.length; is++) {
                        field = new Field("propIn", propinsynonyms[is], Field.Store.YES, Field.Index.ANALYZED);
                        doc.add(field);
                    }
                }
                // A "num_propIn" field: number of incoming properties, not indexed
                String num_propIN = String.valueOf(y.size());
                field = new Field("num_propIn", num_propIN, Field.Store.YES, Field.Index.NO);
                doc.add(field);
            } else {
                String num_propIN = String.valueOf(0);
                field = new Field("num_propIn", num_propIN, Field.Store.YES, Field.Index.NO);
                doc.add(field);
            }

            // The node's neighbours
            ArrayList<String> v = rs.getVicini(node);
            if (v != null) {
                for (int j = 0; j < v.size(); j++) {
                    String vicino = v.get(j);
                    field = new Field("vicini", vicino, Field.Store.YES, Field.Index.ANALYZED);
                    doc.add(field);
                }
            }

            // Add the document to the index
            iwriter.addDocument(doc);
        }

        iwriter.close();
        directory.close();
    }

    public int getNR() {
        return rs.NumResource();
    }
}

MARCO LAZZARA

Things that could help us immensely here.

Can you post your indexReader/Searcher initialization code from your standalone app, as well as your webapp.

Could you further post your Analyzer Setup/Query Building code from both apps.

Could you further post the document creation code used at indexing time? (Which analyzer, and which fields are indexed/stored)

Give us this, and I'm pretty darn sure we can nail down your issue.

Matt

Ian Lea wrote:
...
There are no exceptions. When I run the query a new shell is displayed but with no result.

New shell?

_*Are you sure the index is the same - what do IndexReader.maxDoc(), numDocs() and getVersion() say, standalone and in tomcat?*_

What do you mean with this question??

IndexReader ir = ...

System.out.printf("maxDoc=%s, ...", ir.maxDoc(), ...);

and run in tomcat and standalone.  To absolutely confirm you're looking at the same index, and it has documents, etc.
--
Ian.
---------------------------------------------------------------------
To unsubscribe, e-mail:
[email protected]
For additional commands, e-mail:
 [email protected]
__________ Information from ESET NOD32 Antivirus, version of virus signature database 4087 (20090519) __________

The message was checked by ESET NOD32 Antivirus.

http://www.eset.com

--
Matthew Hall
Software Engineer
Mouse Genome Informatics
[email protected]
(207) 288-6012