Hi Ian,
Please find below a sample program that better illustrates the scenario:
import java.io.File;
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.WildcardQuery;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class TestWriter {

    public static void main(String[] args) throws IOException {
        createIndex();
        searchIndex();
    }

    // Index two documents whose FILE_PATH values use "\" as the separator.
    public static void createIndex() throws IOException {
        Directory directory = FSDirectory.open(new File("C:\\temp"));
        IndexWriterConfig config = new IndexWriterConfig(
                Version.LUCENE_44, new StandardAnalyzer(Version.LUCENE_44));
        IndexWriter iWriter = new IndexWriter(directory, config);

        Document document1 = new Document();
        document1.add(new StringField("FILE_PATH",
                "\\Samples\\Batching\\runner.p", Store.YES));
        document1.add(new StringField("contents", "runnerfile", Store.YES));
        iWriter.addDocument(document1);

        Document document2 = new Document();
        document2.add(new StringField("FILE_PATH",
                "\\Samples\\Business\\stopper.p", Store.YES));
        document2.add(new StringField("contents", "stopperfile", Store.YES));
        iWriter.addDocument(document2);

        iWriter.commit();
        iWriter.close();
    }

    public static void searchIndex() throws IOException {
        Directory directory = FSDirectory.open(new File("C:\\temp"));
        IndexReader indexReader = DirectoryReader.open(directory);
        IndexSearcher indexSearcher = new IndexSearcher(indexReader);

        // Create a wildcard query to get all file paths.
        // This query works fine and returns all the docs in the index.
        Query query1 = new WildcardQuery(new Term("FILE_PATH", "*"));
        TopDocs topDocs = indexSearcher.search(query1, 100);
        System.out.println("total no of docs " + topDocs.totalHits);

        // Create a wildcard query to search for paths starting with /Samples.
        // This query doesn't work and returns zero docs.
        // It doesn't work with "*Samples//*" either,
        // but it works with "*Samples*".
        Query query2 = new WildcardQuery(new Term("FILE_PATH", "*Samples/*"));
        TopDocs topDocs2 = indexSearcher.search(query2, 100);
        System.out.println("total no of docs " + topDocs2.totalHits);

        // Create a wildcard query to search for paths ending with runner.p.
        // This query works and returns 1 doc.
        Query query3 = new WildcardQuery(new Term("FILE_PATH", "*runner.p"));
        TopDocs topDocs3 = indexSearcher.search(query3, 100);
        System.out.println("total no of docs " + topDocs3.totalHits);

        // Queries to search in the "contents" field.
        // Create a wildcard query to search for contents starting with runner.
        // This query works and returns one doc.
        Query query4 = new WildcardQuery(new Term("contents", "runner*"));
        TopDocs topDocs4 = indexSearcher.search(query4, 100);
        System.out.println("total no of docs " + topDocs4.totalHits);

        // Create a wildcard query to search for contents ending with file.
        // This query works and returns two docs.
        Query query5 = new WildcardQuery(new Term("contents", "*file"));
        TopDocs topDocs5 = indexSearcher.search(query5, 100);
        System.out.println("total no of docs " + topDocs5.totalHits);

        indexReader.close();
    }
}
I observed that the file path separator I am using in the field and the
Lucene escape character appear to be the same. Whenever I use an escape
character in the query the search fails; if I don't use the escape
sequence, it returns the results properly.
Although I am escaping "\" by writing "\\", the query still fails.
One way to solve this problem is to replace every "\" with "/" while
indexing, and then use "/" as the file path separator while searching.
But I would prefer not to meddle with the file path. Is there any
alternative that solves this problem without replacing the file path?
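
One possible alternative, as a minimal sketch: it assumes WildcardQuery treats the backslash as its own escape character (WildcardQuery.WILDCARD_ESCAPE in Lucene 4.x). Under that assumption, a literal backslash has to be doubled in the wildcard pattern itself, on top of the doubling the Java string literal already needs. The class and method names below are hypothetical and not part of Lucene; the code only builds the pattern string and does not touch an index.

```java
/** Hypothetical helper for building wildcard patterns; not part of Lucene. */
final class WildcardPathEscaper {

    private WildcardPathEscaper() {}

    /**
     * Escapes '\', '*' and '?' in a literal fragment so that a wildcard
     * pattern treats them as ordinary characters.
     * Assumption: '\' is the escape character of the wildcard syntax.
     */
    static String escapeLiteral(String literal) {
        StringBuilder sb = new StringBuilder(literal.length() + 8);
        for (int i = 0; i < literal.length(); i++) {
            char c = literal.charAt(i);
            if (c == '\\' || c == '*' || c == '?') {
                sb.append('\\'); // prefix the special character with the escape
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** Builds a "contains" pattern: '*' + escaped literal + '*'. */
    static String containsPattern(String literal) {
        return "*" + escapeLiteral(literal) + "*";
    }

    public static void main(String[] args) {
        // The indexed value is \Samples\Batching\runner.p,
        // so the fragment to look for is \Samples\ (a Java literal with doubled backslashes).
        String pattern = containsPattern("\\Samples\\");
        System.out.println(pattern); // prints *\\Samples\\*
    }
}
```

If the assumption holds, the resulting string could be passed as the term text, for example new WildcardQuery(new Term("FILE_PATH", pattern)), leaving the indexed file paths untouched. Written directly as a Java literal, the same pattern would be "*\\\\Samples\\\\*".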
TIA,
Nischal Y
On Mon, Oct 14, 2013 at 10:31 PM, Ian Lea <[email protected]> wrote:
> Seems to me that it should work. I suggest you show us a complete
> self-contained example program that demonstrates the problem.
>
>
> --
> Ian.
>
>
> On Mon, Oct 14, 2013 at 12:42 PM, nischal reddy
> <[email protected]> wrote:
> > Hi Ian,
> >
> > Actually I am able to do wildcard searches on all the fields except the
> > "filePath" field. Both leading and trailing wildcard searches work on
> > all the other fields, but a wildcard search on the filePath field is
> > somehow not working. An example file path would look something like this:
> > "\Samples\F1.cls"
> > I think it is failing because of the "\" present in the field. When I do
> > a wildcard search with the query "filePath : *" it does return all the
> > docs in the index. But any other wildcard search (leading or trailing)
> > does not work. Any clues why it works on other fields but not on the
> > "filePath" field?
> >
> > TIA,
> > Nischal Y
> >
> >
> > On Mon, Oct 14, 2013 at 4:55 PM, Ian Lea <[email protected]> wrote:
> >
> >> Do some googling on leading wildcards and read things like
> >> http://www.gossamer-threads.com/lists/lucene/java-user/175732 and pick
> >> an option you like.
> >>
> >>
> >> --
> >> Ian.
> >>
> >>
> >> On Mon, Oct 14, 2013 at 9:12 AM, nischal reddy
> >> <[email protected]> wrote:
> >> > Hi,
> >> >
> >> > I have a problem doing wildcard searches on file path fields.
> >> >
> >> > I have a field "filePath" where I store the complete path of each file.
> >> >
> >> > I have used StringField to store the field (I assume StringField is
> >> > not tokenized by default).
> >> >
> >> > doc.add(new StringField(FIELD_FILE_PATH,resourcePath, Store.YES));
> >> >
> >> > I am using StandardAnalyzer for the IndexWriter, but since I am using
> >> > a StringField the field values are not analyzed.
> >> >
> >> > After the files are indexed I checked the index with Luke and the path
> >> > seems fine. When I do wildcard searches with Luke I get the desired
> >> > results.
> >> >
> >> > But when I do the same search in my code with IndexSearcher I get
> >> > zero docs.
> >> >
> >> > My searching code looks something like this:
> >> >
> >> > indexSearcher.search(new WildcardQuery(new
> >> > Term("filePath","*SuperClass.cls")),100);
> >> >
> >> > This returns zero documents.
> >> >
> >> > But when I use just "*" in the query it returns all the documents:
> >> >
> >> > indexSearcher.search(new WildcardQuery(new Term("filePath","*")),100);
> >> >
> >> > The search fails only when I use queries such as a prefix wildcard.
> >> >
> >> > What could be going wrong?
> >> >
> >> > Thanks,
> >> > Nischal Y
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: [email protected]
> >> For additional commands, e-mail: [email protected]
> >>
> >>
>