Re: Changing the default FSLockFactory implementation

2017-05-31 Thread Xiaolong Zheng
Yes, that’s what I need! Thank you! On 5/31/17, 1:58 PM, "Chris Hostetter" wrote: : We are experiencing some “Lock obtain timed out: NativeFSLock@” issues : on our NFS file system, could someone please show me, what’s the right : way to switch the Lucene default NativeFSLo

Changing the default FSLockFactory implementation

2017-05-31 Thread Xiaolong Zheng
Hello, We are experiencing some “Lock obtain timed out: NativeFSLock@” issues on our NFS file system. Could someone please show me the right way to switch the Lucene default NativeFSLockFactory to SimpleFSLockFactory? Thanks, Xiaolong

Re: Collecting all stemming token

2017-02-03 Thread Xiaolong Zheng
, --Xiaolong On Fri, Feb 3, 2017 at 1:16 PM, Xiaolong Zheng wrote: > Hello, > > I am trying to collect stemming changes in my search index at > indexing time, so that I can collect a list of stemmed word -> [original > word variants] (e.g. plot -> [plots, plotting, plotted])

Collecting all stemming token

2017-02-03 Thread Xiaolong Zheng
Hello, I am trying to collect stemming changes in my search index at indexing time, so that I can collect a list of stemmed word -> [original word variants] (e.g. plot -> [plots, plotting, plotted]) for later use. I am using k-stem filter + KeywordRepeatFilter + RemoveDuplicatesTokenFilter to
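
The stem -> [original words] mapping described in this message can be sketched in plain Java without any Lucene dependency. This is only an illustration under stated assumptions: `StemCollector` and `toyStem` are hypothetical names, and the toy suffix stripper merely stands in for a real stemmer such as the k-stem filter mentioned above.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.function.Function;

// Hypothetical helper (not part of Lucene): groups original words by their stem.
final class StemCollector {
    private final Function<String, String> stemmer;
    // Sorted maps/sets keep the collected output stable and easy to read.
    private final Map<String, Set<String>> variants = new TreeMap<>();

    StemCollector(Function<String, String> stemmer) {
        this.stemmer = stemmer;
    }

    // Records the word under its stem, but only if stemming actually changed it.
    void add(String original) {
        String stem = stemmer.apply(original);
        if (!stem.equals(original)) {
            variants.computeIfAbsent(stem, k -> new TreeSet<>()).add(original);
        }
    }

    Map<String, Set<String>> variants() {
        return variants;
    }

    // Toy stand-in for a real stemmer (assumption): strips a few English suffixes.
    static String toyStem(String word) {
        for (String suffix : new String[] {"ting", "ted", "s"}) {
            if (word.endsWith(suffix) && word.length() > suffix.length() + 2) {
                return word.substring(0, word.length() - suffix.length());
            }
        }
        return word;
    }

    public static void main(String[] args) {
        StemCollector collector = new StemCollector(StemCollector::toyStem);
        for (String word : Arrays.asList("plots", "plotting", "plotted", "plot")) {
            collector.add(word);
        }
        System.out.println(collector.variants());
    }
}
```

In a real analysis chain the same bookkeeping would presumably sit where both the original and the stemmed token are visible; the collector itself does not depend on how the stem is produced.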

Re: Non-index files under the search directory

2016-11-22 Thread Xiaolong Zheng
be retrieved with IndexWriter#getCommitData() later. > > This may serve as good storage for metadata; as an example, > Elasticsearch stores attributes related to its transaction log there > (UUID and generation identifier). > > Regards, > András > > On Tue, Nov 22, 2016 at

Re: Non-index files under the search directory

2016-11-22 Thread Xiaolong Zheng
dd a StoredField to each document to hold your information? > > Mike McCandless > > http://blog.mikemccandless.com > > > On Mon, Nov 21, 2016 at 11:38 PM, Xiaolong Zheng > wrote: > > Hello, > > > > I am trying to add some metadata to the search data bas

Non-index files under the search directory

2016-11-21 Thread Xiaolong Zheng
Hello, I am trying to add some metadata to the search database. Instead of adding a new search field or adding a phony document, I am looking at the method org.apache.lucene.store.Directory#createOutput, which creates a new file in the search directory. I am wondering whether IndexWriter can

Re: Query the doc frequency across multiple search field.

2016-07-20 Thread Xiaolong Zheng
qs as the aggregated doc freq. > > Otherwise, you can also compute this number by running a BooleanQuery with > one SHOULD clause per field. > > On Tue, Jul 19, 2016 at 19:08, Xiaolong Zheng > wrote: > > > Hi, > > > > I want to know is there any way that q

Query the doc frequency across multiple search field.

2016-07-19 Thread Xiaolong Zheng
Hi, I want to know whether there is any way to query the doc frequency across multiple search fields. The existing API seems to provide the query only for a single search field: indexReader.docFreq(new Term(field, word)). Any suggestions on how I could get the doc frequency from multiple fields? Thanks
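
To make the question concrete: the doc frequency "across fields" is the number of distinct documents that contain the term in at least one of the fields, which is what the reply's suggestion of one SHOULD clause per field would count. The sketch below uses a toy in-memory postings map rather than a Lucene index; `MultiFieldDocFreq` and its methods are hypothetical names used only for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy in-memory postings (assumption: stands in for a real index).
final class MultiFieldDocFreq {
    // field -> term -> ids of documents containing the term in that field
    private final Map<String, Map<String, Set<Integer>>> postings = new HashMap<>();

    void index(int docId, String field, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (term.isEmpty()) {
                continue;
            }
            postings.computeIfAbsent(field, f -> new HashMap<>())
                    .computeIfAbsent(term, t -> new TreeSet<>())
                    .add(docId);
        }
    }

    private Set<Integer> docs(String field, String term) {
        Map<String, Set<Integer>> byTerm = postings.get(field);
        if (byTerm == null) {
            return Collections.emptySet();
        }
        Set<Integer> ids = byTerm.get(term);
        return ids == null ? Collections.<Integer>emptySet() : ids;
    }

    // Per-field doc frequency, analogous to a single-field docFreq lookup.
    int docFreq(String field, String term) {
        return docs(field, term).size();
    }

    // Aggregated doc frequency: size of the union of the per-field doc id sets.
    int docFreqAcross(String term, String... fields) {
        Set<Integer> union = new TreeSet<>();
        for (String field : fields) {
            union.addAll(docs(field, term));
        }
        return union.size();
    }

    public static void main(String[] args) {
        MultiFieldDocFreq idx = new MultiFieldDocFreq();
        idx.index(0, "title", "Lucene search");
        idx.index(0, "body", "lucene index");
        idx.index(1, "body", "lucene");
        System.out.println(idx.docFreq("title", "lucene"));
        System.out.println(idx.docFreq("body", "lucene"));
        System.out.println(idx.docFreqAcross("lucene", "title", "body"));
    }
}
```

Note that simply summing the per-field values would count document 0 twice here, which is why the union (or an equivalent OR query) is needed when an exact aggregated count matters.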

How to prevent WordDelimiterFilter tokenize the string with underscore?

2016-06-15 Thread Xiaolong Zheng
Hi, How can I prevent WordDelimiterFilter from tokenizing a string with underscores, e.g. word_with_underscore? I am using WordDelimiterFilter to create my own camel-case analyzer, with the configuration flags: flags |= GENERATE_WORD_PARTS; flags |= SPLIT_ON_CASE_CHANGE; flags |= PRESERVE_ORIGIN

Searching for "iso surface", and looking for "isosurface"

2015-12-17 Thread Xiaolong Zheng
Hi All, I want to know the common way to implement search with whitespace removal. For example, if I search for "iso surface" in Google, it searches not only for "iso" or "surface" but also for "isosurface". Is that done simply by adding another search clause by removing the w
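
One simple reading of the idea in this message is to expand the query with the concatenation of each pair of adjacent words, so that "iso surface" also yields "isosurface" as an extra optional term. The sketch below shows only that expansion step in plain Java; `CompoundVariants` and `expandTerms` are hypothetical names, and how the extra terms are weighted or turned into query clauses is left open.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: adds whitespace-free variants of adjacent word pairs.
final class CompoundVariants {
    static List<String> expandTerms(String query) {
        // LinkedHashSet keeps insertion order and drops duplicate terms.
        Set<String> terms = new LinkedHashSet<>();
        String trimmed = query.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>(terms);
        }
        String[] words = trimmed.split("\\s+");
        for (String word : words) {
            terms.add(word);
        }
        // Join each adjacent pair: "iso" + "surface" -> "isosurface".
        for (int i = 0; i + 1 < words.length; i++) {
            terms.add(words[i] + words[i + 1]);
        }
        return new ArrayList<>(terms);
    }

    public static void main(String[] args) {
        System.out.println(expandTerms("iso surface"));
    }
}
```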

Re: [EXTERNAL] Re: ignore a match in a query

2015-07-23 Thread Xiaolong Zheng
sion-by-more-precise-linguistic-analysis/ Thanks, Xiaolong On 7/23/15, 1:56 PM, "Fielder, Todd Patrick" wrote: >Unfortunately, that removes all records since all records have the term >"Record type" > >-Original Message- >From: Xiaolong Zheng [mailto:xiaol

Re: ignore a match in a query

2015-07-23 Thread Xiaolong Zheng
Maybe you can use the phrase search like: NOT "\"Record type\"" On 7/23/15, 12:53 PM, "Fielder, Todd Patrick" wrote: >Hi, >I'm wondering if there is a way to ignore a match in a query? For >example, I have two strings > >1) "Record type: record" > >2) "Record type: cd" > >I

Does Lucene 4.6.1 compatible with Java 8?

2015-07-23 Thread Xiaolong Zheng
that they have not been tested? None of the bug fixes associated with the 4.8 release seem to be related to Java 8 compatibility. Any advice would be appreciated. Thanks, Xiaolong Zheng

Re: Lucene Query

2014-08-19 Thread Jin Guang Zheng
ted" > > > >I would suggest 1) if you are going to learn more about Lucene, and 2) > >if you just want to get something out. > > > >Hope this helps, > >Tri > > > >On Aug 19, 2014, at 12:17 PM, Jin Guang Zheng wrote: > > > >Than

Re: Lucene Query

2014-08-19 Thread Jin Guang Zheng
lucene.apache.org/core/4_6_0/core/org/apache/lucene/search/BooleanQuery.html > > However, if your corpus is more sophisticated you'll find that relevance > ranking is not always that trivial :) > > On Aug 19, 2014, at 11:00 AM, Jin Guang Zheng wrote: > > Hi, > > I a

Lucene Query

2014-08-19 Thread Jin Guang Zheng
Hi, I am wondering if someone can help me with this: I have an index: doc 1 -- label: United States of America doc 2 -- label: United doc 2 -- label: America doc 2 -- label: States I am wondering how to generate a query with the terms: states united america so that only doc 1 is returned. I was thinking Spa
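
The requirement here is that every query term must be present in the label, regardless of order, which corresponds to a boolean query where each term is a required clause (the replies in this thread point to BooleanQuery). The plain-Java sketch below illustrates that matching rule on the four labels from the message; `AllTermsMatcher` is a hypothetical name and no scoring is attempted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical matcher: a label matches only if it contains every query term.
final class AllTermsMatcher {
    static Set<String> tokens(String text) {
        Set<String> result = new HashSet<>();
        for (String token : text.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (!token.isEmpty()) {
                result.add(token);
            }
        }
        return result;
    }

    // Returns the labels that contain all query terms, in their original order.
    static List<String> matchAll(List<String> labels, String query) {
        Set<String> required = tokens(query);
        List<String> hits = new ArrayList<>();
        for (String label : labels) {
            if (tokens(label).containsAll(required)) {
                hits.add(label);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> labels = Arrays.asList(
                "United States of America", "United", "America", "States");
        System.out.println(matchAll(labels, "states united america"));
    }
}
```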

Re: Lucene Upgrade from 2.9.x to 4.7.x

2014-05-29 Thread Xiaolong Zheng
> 6. Apache Lucene Migration Guide <http://lucene.apache.org/core/4_7_2/MIGRATE.html> I would like to say number 5 is also very helpful. Thanks, Xiaolong Zheng On 5/29/14 9:56 AM, "Buddhavarapu, Suresh" wrote: >Hello, > >I'm looking for some documents/inf

Re: about RAMDirectory based B/S plantform problem

2010-08-17 Thread xiaoyan Zheng
to index using indexwriter [DO NOT CLOSE > THE INDEXWRITER HERE] > accept data from thread2 and write to index using indexwriter [DO NOT CLOSE > THE INDEXWRITER HERE] > . > . > > Close IndexWriter > -- > > Would this work/be implementable for your application? >

Re: about RAMDirectory based B/S plantform problem

2010-08-16 Thread xiaoyan Zheng
throw new AlreadyClosedException("this IndexWriter is closed"); } } How can I avoid this kind of error? Could Lucene check for this kind of situation by itself? 2010/8/17 xiaoyan Zheng > > about RAMDirectory based B/S plantform problem > > hello, I just start to use

Re: how to post a question or a message?

2010-08-16 Thread xiaoyan Zheng
Hey, Anshum thanks again~ 2010/8/17 anshum.gu...@naukri.com > Hi Hilly, > So this is exactly what you need to do, mail to > java-user@lucene.apache.org and it'd get to all group members. > > > --Original Message-- > From: xiaoyan Zheng > To

about RAMDirectory based B/S plantform problem

2010-08-16 Thread xiaoyan Zheng
About the RAMDirectory-based B/S platform problem: Hello, I have just started using Lucene and am confused about building a RAMDirectory-based Lucene index. When a single user builds an index in this RAMDirectory, everything is fine, but with multiple users the results are not correct. When I use synchro

how to post a question or a message?

2010-08-16 Thread xiaoyan Zheng
Hey, I am new to this mailing list thing and wonder how to post a question or a message. I just sent a question to the FAQ mail address, but I received a reply saying "none available". Have I sent it to the wrong address? Regards, Hilly

Help wanted with Indexing PDF Documents

2010-03-02 Thread Ching Zheng
Hi, I have about 50 PDF documents, each around 10 MB in size. I am using PDFBox for parsing and am wondering how I can index bookmarks with their corresponding page information. I use PDDocumentOutline to get each bookmark's title, but I only have PDNamedDestination, which offers no page number info. C

RE: Lucene search formula

2006-07-07 Thread zheng
Hi, Can somebody explain lengthNorm, queryNorm and coord in Lucene? Is lengthNorm (term freq)/(total number of terms), or (term freq)/(max term freq), or something else? Is queryNorm (term squared weight)/(sumOfSquaredWeights)? Why do we still need queryNorm when it will not affect the score for
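
For orientation, the three factors asked about can be written out as small functions. The formulas below follow the default TF-IDF similarity as documented for Lucene releases of that period (lengthNorm = 1/sqrt(numTerms), queryNorm = 1/sqrt(sumOfSquaredWeights), coord = overlap/maxOverlap); treat this as a sketch to be checked against the Similarity Javadoc of the version in use, since `ClassicSimilarityFactors` is a hypothetical class and not Lucene code.

```java
// Hypothetical standalone versions of the default similarity factors.
final class ClassicSimilarityFactors {
    // Shorter fields get a larger norm; numTerms is the number of terms in the field.
    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    // The same value for every document of one query, so it does not change ranking.
    static float queryNorm(float sumOfSquaredWeights) {
        return (float) (1.0 / Math.sqrt(sumOfSquaredWeights));
    }

    // Fraction of the query terms that the document actually matched.
    static float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    public static void main(String[] args) {
        System.out.println(lengthNorm(4));
        System.out.println(queryNorm(4f));
        System.out.println(coord(2, 3));
    }
}
```

Under these definitions lengthNorm is not a term-frequency ratio at all, and queryNorm only rescales scores so that they are more comparable between different queries.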

batch indexing using RAMDirectory

2006-06-28 Thread zheng
I am a novice in Lucene. I wrote some code to do batch indexing using RAMDirectory, following the code provided in Lucene in Action, which is something like FSDirectory fsDir = FSDirectory.getDirectory("/tmp/index", true); RAMDirectory ramDir = new RAMDirectory(); IndexWriter fsWriter = IndexW