Re: Searching index problems with tomcat

N. Hira Wed, 27 May 2009 10:02:46 -0700

Okay -- if the problem is not the number of results, then let's clarify the problem:


1.  You create an index in something like:
        /home/marco/testIndex


2.  You copy over the directory to something like:
        /home/marco/RDFIndexLucene

3. When you run Tomcat, your "searcher" tries to open the index at 2. above (using the full path, including the leading slash) and fails with:

        no segments* file found in org.apache.lucene.store

Could you please confirm that this is the problem you are trying to resolve? If not, then please correct what I have stated above.


-h

On 27-May-2009, at 11:22 AM, Marco Lazzara wrote:

In my app I obtain 3 results. But I think that is not a problem.

Marco Lazzara

2009/5/27 Erick Erickson <[email protected]>
StandardAnalyzer is fine. I loaded your index into Luke and there is
exactly
one document with philipcimiano in the name field.
There is only one document that has researcher in the name field.
Both of these documents (using StandardAnalyzer) return one
document (doc 12 for PHILIPCIMIANO and doc 4 for RESEARCHER)
as I would expect.

So what is the behavior you expect?

Best
Erick
On Wed, May 27, 2009 at 11:47 AM, Marco Lazzara<[email protected]
wrote:
I attach the file testIndex.zip. Run the query with: PHILIPCIMIANO, or
RESEARCHER.

I use StandardAnalyzer. Is it a problem?

Marco Lazzara

2009/5/27 N. Hira <[email protected]>
Not sure if this applies here, but that tends to happen when the analyzer
you use for indexing is different from the one used in Luke or you're
running into character set issues. Are you using the StandardAnalyzer in
both cases?
Also, could you post an example of the query you are trying? There are
some very smart people who check this list and they may be able to help you
if they had a "sample" of your index, i.e., create a 10-document index and
make it available for download so people can look at it for you?

-h

On 27-May-2009, at 2:02 AM, Marco Lazzara wrote:
*I see that you have reported the creation of 3 files, but does Luke
recognize those files as an index and do you see the Documents you expect
to see in this index?*
Luke recognizes those files and I see those documents in this index, but I
observed that when I run the query Luke finds (for example) only 3 files
of 5.
Any ideas???
Marco Lazzara


2009/5/27 N Hira <[email protected]>
Sorry for the confusion -- I checked the archive and I could not find a
message where you have been able to open the index using Luke.

Have you been able to do that?  I see that you have reported the creation
of 3 files, but does Luke recognize those files as an index and do you see
the Documents you expect to see in this index?

This is the official site for Luke:
http://www.getopt.org/luke/

-h


----- Original Message ----
From: Marco Lazzara <[email protected]>
To: [email protected]
Sent: Tuesday, May 26, 2009 4:59:14 PM
Subject: Re: Searching index problems with tomcat
*Does the part of the web app that is responsible for searching have
permissions to read "/home/marco/testIndex"?*

Yes, it does. It can read everywhere.

*Could you add some code to your searching app to print out the directory
listing to confirm?*

I've already posted them. See May 19.

*Also, I may have missed this posting, but could you provide the answer from
Step 3. of mhall's suggestion on 22-May, i.e., did you find the data that
you expected in your index using Luke?*


Yes. There are 3 files in the index. See May 24.

 -rw-r--r--  1 marco marco 4043 2009-05-24 12:00 _5.cfs
 -rw-r--r--  1 marco marco   58 2009-05-24 12:00 segments_c
 -rw-r--r--  1 marco marco   20 2009-05-24 12:00 segments.gen


2009/5/26 N Hira <[email protected]>
Marco,
Does the part of the web app that is responsible for searching have
permissions to read "/home/marco/testIndex"?

Could you add some code to your searching app to print out the
directory
listing to confirm?
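
A minimal sketch of such a check, using only java.io.File. The class name IndexDirCheck and the describe helper are hypothetical, not code from this thread, and the default path is simply the one discussed above. It reports whether the directory exists, whether the running process can read it, and whether it contains any file whose name starts with "segments":

```java
import java.io.File;

public class IndexDirCheck {
    /** Builds a short report about the given index directory (hypothetical helper). */
    static String describe(File dir) {
        StringBuilder sb = new StringBuilder();
        sb.append("path=").append(dir.getAbsolutePath()).append('\n');
        sb.append("exists=").append(dir.exists())
          .append(" isDirectory=").append(dir.isDirectory())
          .append(" canRead=").append(dir.canRead()).append('\n');

        // list() returns null when the path is not a directory or cannot be read
        String[] names = dir.list();
        boolean hasSegments = false;
        if (names != null) {
            for (String name : names) {
                sb.append("  ").append(name).append('\n');
                if (name.startsWith("segments")) {
                    hasSegments = true;
                }
            }
        }
        sb.append("hasSegmentsFile=").append(hasSegments).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed default: the absolute path discussed in this thread
        String path = args.length > 0 ? args[0] : "/home/marco/testIndex";
        System.out.print(describe(new File(path)));
    }
}
```

Calling describe from inside the web app, and logging the result just before the searcher is opened, would show which directory the Tomcat process actually sees.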

Also, I may have missed this posting, but could you provide the answer from
Step 3. of mhall's suggestion on 22-May, i.e., did you find the data that
you expected in your index using Luke?

Good luck.

-h



----- Original Message ----
From: Marco Lazzara <[email protected]>
To: [email protected]
Sent: Tuesday, May 26, 2009 3:45:38 PM
Subject: Re: Searching index problems with tomcat
I tried different things. I tried to create the index without the web
application, and I tried to create the index with a webapp; the index was
created without any problem. But the search always has no results.
For example, if the folder I'm searching in is empty, the webapp catches an
exception: "no segments* file found in
org.apache.lucene.store.ramdirect...@home/marco/testIndex...."
It means that Lucene tries to search in that index but it fails... maybe the
index is incorrect for a webapp???

MARCO LAZZARA


2009/5/26 Matthew Hall <[email protected]>

Right.. so perhaps I'm a bit confused here.
The webapp.. is consuming an index.. yes?

Or, are you trying to create an index via a webapp?
I was assuming that you had some sort of indexing software that you were
using to first build your indexes, which the webapp then consumes.
Is that your intent?
Sorry I didn't get back to you before this, but it was a holiday over here.
Marco Lazzara wrote:
OK, I solved the problem I posted before. I run the webapp. It creates the
index in folder /home/marco/testIndex with 3 files

-rw-r--r--  1 marco marco 4043 2009-05-24 12:00 _5.cfs
-rw-r--r--  1 marco marco   58 2009-05-24 12:00 segments_c
-rw-r--r--  1 marco marco   20 2009-05-24 12:00 segments.gen

but when I run the query I obtain no results!!!!

Why are there only 3 files in my folder???

Marco Lazzara


2009/5/24 Marco Lazzara <[email protected]>
Hi. At step 2 I have only 3 files in the folder, but I think that is not a
problem. I've tried to create the index in the web app and not only in the
standalone application, but something fails. Tomcat reports this error:
java.io.FileNotFoundException: no segments* file found in
org.apache.lucene.store.ramdirect...@1c2ec05: files:
  at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:604)
  at org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:111)
  at org.apache.lucene.index.IndexReader.open(IndexReader.java:316)
  at org.apache.lucene.index.IndexReader.open(IndexReader.java:227)
  at org.apache.lucene.search.IndexSearcher.<init>(IndexSearcher.java:55)
  at org.utils.synonym.WordNetSynonymEngine.<init>(Unknown Source)
  at org.indexing.AlternativeRDFIndexing.<init>(Unknown Source)
  at org.gui.CreazioneIndici.run2(Unknown Source)
  at org.gui.Query.main(Unknown Source)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at com.sun.javaws.Launcher.executeApplication(Launcher.java:1321)
  at com.sun.javaws.Launcher.executeMainClass(Launcher.java:1267)
  at com.sun.javaws.Launcher.doLaunchApp(Launcher.java:1066)
  at com.sun.javaws.Launcher.run(Launcher.java:116)
  at java.lang.Thread.run(Thread.java:619)
This changes every time. The first time it is: no segments* file found in
org.apache.lucene.store.ramdirect...@*1c2ec05*
The second time it is: no segments* file found in
org.apache.lucene.store.ramdirect...@*170b819*
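
The part after the @ looks like the default Object.toString() suffix, i.e. the object's hash code in hex. If that is the case here (an assumption, since the class name in the message is elided), the value would be expected to differ between runs without indicating a different error. A small sketch, where Dir is a hypothetical stand-in class and not a Lucene type:

```java
public class IdentityHashDemo {
    // Hypothetical stand-in for any class that does not override toString()
    static class Dir {}

    /** Reproduces the default Object.toString() format: class name, '@', hex hash code. */
    static String defaultString(Object o) {
        return o.getClass().getName() + "@" + Integer.toHexString(o.hashCode());
    }

    public static void main(String[] args) {
        Object first = new Dir();
        Object second = new Dir();
        // Each line has the form ClassName@hex; the hex part is tied to the
        // individual object, not to the directory path or its contents.
        System.out.println(first);
        System.out.println(second);
        System.out.println(first.toString().equals(defaultString(first)));
    }
}
```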

On the standalone it  works perfectly.

Marco Lazzara

2009/5/22 Matthew Hall <[email protected]>



Humor me.
Open up your indexing software package.
Step 1: In all places where you reference your index, replace whatever the
heck you have there with the following EXACT STRING:

/home/marco/testIndex

Do not leave off the leading slash.

After you have made these changes to the indexing software, recompile and
create your indexes.
Step 2: After your indexing process completes do the following:
cd /home/marco/testIndex/index

You should see files in there, they will look something like this:
drwxrwxr-x   3 mhall    progs       4.0K May 18 11:19 ..
-rw-rw-r--   1 mhall    progs         80 May 21 16:47 _9j7.fnm
-rw-rw-r--   1 mhall    progs       4.1G May 21 16:50 _9j7.fdt
-rw-rw-r--   1 mhall    progs       434M May 21 16:50 _9j7.fdx
-rw-rw-r--   1 mhall    progs       280M May 21 16:52 _9j7.frq
-rw-rw-r--   1 mhall    progs       108M May 21 16:52 _9j7.prx
-rw-rw-r--   1 mhall    progs       329M May 21 16:52 _9j7.tis
-rw-rw-r--   1 mhall    progs       4.7M May 21 16:52 _9j7.tii
-rw-rw-r--   1 mhall    progs       108M May 21 16:52 _9j7.nrm
-rw-rw-r--   1 mhall    progs         47 May 21 16:52 segments_9je
-rw-rw-r--   1 mhall    progs         20 May 21 16:52 segments.gen
You have now confirmed that you are actually creating indexes, and the
indexes you are creating exist at EXACTLY the place you have asked them to.
Step 3: Then.. go download luke, and open these indexes. Perform a query
on them, confirm that the data you want is actually IN the indexes.
Step 4: Now, open up your standalone application, and replace whatever you
are using to open the index with the SAME string I have listed above.
Perform a search, verify that the indexes are there, and actually return
values.
Step 5: Lastly, go into your web application and again replace the path
with the one I have above, recompile, and perform a search. Verify that the
indexes are actually THERE and searchable.
This.. damn well SHOULD work, if it doesn't it is likely pointing to some
other issues in what you have set up. For example your tomcat instance could
perhaps not have permission to read the lucene indexes directory. You
should be able to tell this in the tomcat logs, BUT don't do this yet.
Carefully and fully follow the steps I have outlined for you, and then you
will have chased down the full debugging path for this.
If this yields nothing for you, I'd be happy to take a closer look at your
source code, but until then give this a shot.
Oh.. if it fails, please post back EXACTLY which steps in the above
outlined process failed for you, as that will be really really helpful.
Matt



Marco Lazzara wrote:
I don't know how to solve the problem.. I've tried all the rational
things. Maybe the last thing is to try to index not with FSDirectory but with
something else. I have to peruse the API documentation.
But..... WHAT IF IT IS A LUCENE BUG???

2009/5/22 Matthew Hall <[email protected]>





 because that's the default index write behavior.
It will create any directory that you ask it to.

Matt


Marco Lazzara wrote:





OK. I understand what you really mean but it doesn't work.
I understand one thing. For example, when I try to open an index in the
following location: "RDFIndexLucene/" but the folder doesn't exist, *Lucene
creates an empty folder named "RDFIndexLucene"* in my home folder... WHY???

MARCO LAZZARA

2009/5/22 Matthew Hall <[email protected]>







For writing indexes?
Well I guess it depends on what you want.. but I personally use this:

(2.3.2 API)
File INDEX_DIR = new File("/data/searchtool/thisismyindexdirectory");
Analyzer analyzer = new WhateverConcreteAnalyzerYouWant();
writer = new IndexWriter(INDEX_DIR, analyzer, true);
Your best bet would be to peruse the API docs of whatever lucene version
you are using.
However, I'm still pretty sure this ISN'T your actual issue here.
Looking at your "full path" example those still seem to be by reference to
me. Let me be more specific and tell you EXACTLY what I mean by that,

Let's say you are running your program in the following directory:
/home/test/app/
Trying to open an index like you have below will effectively be trying to
open an index in the following location:

/home/test/app/home/marco/RdfIndexLucene

What I think you MEAN to be doing is:

/home/marco/RdfIndexLucene
That leading slash is VERY VERY important, as it's the entire difference
between a relative path and an absolute one.
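
A small sketch of how the JVM resolves these strings, using only java.io.File. The class name PathResolution and the resolve helper are hypothetical; the paths are the ones from this thread. A path without the leading slash is resolved against the process working directory (the user.dir system property), which for a webapp is whatever directory Tomcat was started from:

```java
import java.io.File;

public class PathResolution {
    /** Returns the absolute location a path string resolves to in this JVM. */
    static String resolve(String path) {
        return new File(path).getAbsolutePath();
    }

    public static void main(String[] args) {
        // The directory that relative paths are resolved against
        System.out.println("working dir = " + System.getProperty("user.dir"));

        String[] candidates = {
            "home/marco/RdfIndexLucene",   // relative: no leading slash
            "/home/marco/RdfIndexLucene",  // absolute on Unix-like systems
            "RDFIndexLucene/"              // relative
        };
        for (String p : candidates) {
            System.out.println(p + " -> " + resolve(p)
                    + " (absolute: " + new File(p).isAbsolute() + ")");
        }
    }
}
```

Running this once from the standalone app's directory and once inside the webapp would show whether the two processes end up pointing at the same location.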

Matt


Marco Lazzara wrote:







I was talking with my teacher.
Is it correct to use FSDirectory? Could you please look again at the code
I've posted here??
Should I choose a different way of indexing??

Marco Lazzara




2009/5/22 Ian Lea <[email protected]>
OK. I'd still like to see some evidence, but never mind.
Next suggestion is the old standby - cut the code down to the absolute
minimum to demonstrate the problem and post it here. I know you've
already posted some code, but maybe not all of it, and definitely not
cut down to the absolute minimum.


--
Ian.


On Thu, May 21, 2009 at 10:48 PM, Marco Lazzara <
[email protected]
    wrote:
_I strongly suggest that you use a full path name and/or provide some
evidence that your readers and writers are using the same directory
and thus lucene index._
I tried a full path like home/marco/RdfIndexLucene, even
media/disk/users/fratelli/RDFIndexLucene. But nothing changed.
MARCO LAZZARA

It's been a few days, and we haven't heard back about this issue, can
we assume that you fixed it via using fully qualified paths then?

Matt

Ian Lea wrote:








Marco
You haven't answered Matt's question about where you are running it
from. Tomcat's default directory may well not be the same as yours.
I strongly suggest that you use a full path name and/or provide some
evidence that your readers and writers are using the same directory
and thus lucene index.


--
Ian.


On Wed, May 20, 2009 at 9:59 AM, Marco Lazzara
<[email protected]> wrote:
I've posted the indexing part, but I don't use this in my app. After I
create the index, I put that in a folder like
/home/marco/RDFIndexLucece
and when I run the query I'm only searching (and not indexing).

String[] fieldsearch = new String[] {"name", "synonyms", "propIn"};
//RDFinder rdfind = new RDFinder("RDFIndexLucene/", fieldsearch);
TreeMap<Integer, ArrayList<String>> paths;
try {
    this.paths = this.rdfind.Search(text, "path");
} catch (ParseException e1) {
    e1.printStackTrace();
} catch (IOException e1) {
    e1.printStackTrace();
}

Marco Lazzara









Sorry, anyhow looking over this quickly here's a summarization of what
I see:
You have documents in your index that look like the following:
name, which is indexed and stored
synonyms, which are indexed and stored
path, which is stored but not indexed
propin, which is stored and indexed
propinnum, which is stored but not indexed
and ... vicinity I guess, which is stored but not indexed
For an analyzer you are using StandardAnalyzer (which, considering all
the Italian?, is an interesting choice.)
And you are opening your index using FSDirectory, in what appears to
be a by-reference fashion (you don't have a fully qualified path to
where your index is, you are ASSUMING that it's in the same directory
as this code, unless FSDirectory is not implemented as I think it is.)
Now can I see the consumer code? Specifically the part where you are
opening the index/constructing your queries?
I'm betting what's going on here is you are deploying this as a war
file into tomcat, and it's just not really finding the index as a
result of how the war file is getting deployed, but looking more
closely at the source code should reveal if my suspicion is correct
here.
Also runtime wise, when you run your standalone app, where
specifically in your directory structure are you running it from?
Cause if you are opening your index reader/searcher in the same way as
you are creating your writer here, I'm pretty darn certain that will
cause you problems.
  Matt
 Marco Lazzara wrote:
_Could you further post your Analyzer Setup/Query Building code from
BOTH apps._
There is only one code. It is the same for the web app and for the standalone.
And that is exactly the real problem!! The code is the same, the libraries are
the same, the query, index, etc. are the same.

This is the class that creates the index:


public class AlternativeRDFIndexing {
    private Analyzer analyzer;
    private Directory directory;
    private IndexWriter iwriter;
    private WordNetSynonymEngine wns;
    private AlternativeResourceAnalysis rs;
    public ArrayList<String> commonnodes;

    //private RDFinder rdfind = new RDFinder("RDFIndexLucene/", new String[] {"name"});

    //    public boolean Exists(String node) throws ParseException, IOException{
    //        return rdfind.Exists(node);
    //    }

    public AlternativeRDFIndexing(String inputfilename) throws IOException, ParseException{
        commonnodes = new ArrayList<String>();
        // we need to instantiate an object to analyze the RDF document
        rs = new AlternativeResourceAnalysis(inputfilename);
        ArrayList<String> nodelist = rs.getResources();
        int nodesize = nodelist.size();
        ArrayList<String> sourcelist = rs.getsource();
        int sourcesize = sourcelist.size();
        // synonyms
        wns = new WordNetSynonymEngine("sinonimi/");
        // create a standard analyzer
        analyzer = new StandardAnalyzer();

        // Store the index in RAM:
        //Directory directory = new RAMDirectory();
        // Store the index on disk
        directory = FSDirectory.getDirectory("RDFIndexLucene/");
        // Create the instance that writes the index.
        // It is given an analyzer, a boolean indicating whether to recreate
        // the structure from scratch, and a maximum field length
        // (or unlimited: IndexWriter.MaxFieldLength.UNLIMITED)
        iwriter = new IndexWriter(directory, analyzer, true,
                new IndexWriter.MaxFieldLength(25000));
        // build an index with only n documents: one document per node
        for (int i = 0; i < nodesize; i++){
            Document doc = new Document();
            // create the various fields
            // each document will have
            // a name field: the node name,
            // stored (Store.YES) and indexed with the analyzer (ANALYZED)
            String node = nodelist.get(i);
            //if (sourcelist.contains(node)) break;
            //if (rdfind.Exists(node)) commonnodes.add(node);
            Field field = new Field("name", node, Field.Store.YES, Field.Index.ANALYZED);
            // add the field to the document
            doc.add(field);
            // add the synonyms
            String[] nodesynonyms = wns.getSynonyms(node);
            for (int is = 0; is < nodesynonyms.length; is++) {
                field = new Field("synonyms", nodesynonyms[is], Field.Store.YES, Field.Index.ANALYZED);
                // add the field to the document
                doc.add(field);
            }
            // one or more path_i fields: minimal paths from the sources to the node
            // not indexed
            for (int j = 0; j < sourcesize; j++) {
                String source = sourcelist.get(j);
                ArrayList<LinkedList<String>> path = new ArrayList<LinkedList<String>>();
                try{
                    if ((source.equals(node)) || (sourcelist.contains(node))){
                        field = new Field("path", "null", Field.Store.YES, Field.Index.NO);
                        doc.add(field);
                    }
                    else{
                        path = rs.getPaths(source, node);
                        for (int ii = 0; ii < path.size(); ii++) {
                            String pp = rs.getPath(path.get(ii));
                            field = new Field("path", pp, Field.Store.YES, Field.Index.NO);
                            doc.add(field);
                        }
  ...
[Message truncated]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]


----------------------------------------------------------------------
Hira, N.R.
Cognocys, Inc.
(773) 251-7453

Catch up on the news.  http://www.cognocys.com/prospector/news.html





