Re: how do i improve Indexing and Searching performance of 2 billion documents over SolrCloud

2017-02-14 Thread Duke DAI
SSD or in-memory index. Best regards, Duke If not now, when? If not me, who? On Wed, Feb 15, 2017 at 12:32 AM, Adrien Grand wrote: > This list is for users of the Lucene Java API, maybe try solr-user instead? > > On Mon, Feb 13, 2017 at 21:24, yeshwanth kumar

Re: Hardcoded checksum mechanism in BlockTreeTermsReader

2016-12-25 Thread Duke DAI
Bremen > http://www.thetaphi.de > eMail: u...@thetaphi.de > > > -Original Message- > > From: Michael McCandless [mailto:luc...@mikemccandless.com] > > Sent: Tuesday, December 6, 2016 12:30 PM > > To: Duke DAI <duke.dai@gmail.com> > > Cc: L

Re: Hardcoded checksum mechanism in BlockTreeTermsReader

2016-12-06 Thread Duke DAI
blog.mikemccandless.com > > > On Tue, Dec 6, 2016 at 5:25 AM, Duke DAI <duke.dai@gmail.com> wrote: > > Hi all, > > > > I'm customizing Lucene Directory, which extends o.a.l.store.Directory > based > > on database files. I do not need checksum again on I

Hardcoded checksum mechanism in BlockTreeTermsReader

2016-12-06 Thread Duke DAI
Hi all, I'm customizing a Lucene Directory, which extends o.a.l.store.Directory and is based on database files. I do not need a second checksum on IndexInput and IndexOutput. But in the BlockTreeTermsReader constructor, the following code opens a hard-coded BufferedChecksumIndexInput to checksum the raw IndexInput. I

Re: Having some trouble running tests with custom codec

2015-10-12 Thread Duke DAI
How about adding a line feed to the single line? My impression is that a line feed is required. Best regards, Duke If not now, when? If not me, who? On Fri, Oct 9, 2015 at 10:07 PM, Sigbjørn Lund Olsen < sigbjorn.lund.ol...@gmail.com> wrote: > As part of my master's thesis I am planning

Re: Re: memory cost in forceMerge(1)

2015-08-11 Thread Duke DAI
In my experience, you are probably hitting a system issue. Check disk performance first (disk queue length on Windows), or enable verbose GC logging to see the GC activity in detail. I designed an auto-upgrade mechanism in our application by calling forceMerge(1), to eradicate hybrid index

Re: bug of highlighter/SimpleSpanFragmenter, returned longer fragment than expected?

2015-08-11 Thread Duke DAI
Greetings! Does anybody have input on this? Best regards, Duke If not now, when? If not me, who? On Fri, Aug 7, 2015 at 10:58 AM, Duke DAI duke.dai@gmail.com wrote: Hi experts, I'm trying to reproduce a bug from Lucene side, and found something. In latest codeline, 5.2.1, I modified test

Re: Standard highlighter returns whole document as a fragment

2015-08-11 Thread Duke DAI
It seems we are encountering the same problem (thread: bug of highlighter/SimpleSpanFragmenter, returned longer fragment than expected?). When debugging, is your fragmenter SimpleSpanFragmenter? Does isNewFragment() return true because of the logic below? boolean isNewFrag = offsetAtt.endOffset() >= (fragmentSize *

bug of highlighter/SimpleSpanFragmenter, returned longer fragment than expected?

2015-08-06 Thread Duke DAI
Hi experts, I'm trying to reproduce a bug from Lucene side, and found something. In latest codeline, 5.2.1, I modified test case HighlighterTest.testSimpleQueryTermScorerHighlighter a little to below, mainly to use SimpleSpanFragmenter to get only one fragment with length 64. public void

Inconsistency of LogMergePolicy and IWC.useCompoundFile

2014-06-19 Thread Duke DAI
Hi Simon, guys, I see that in LUCENE-5038 the useCompoundFile handling was refactored. Now I think there are some problems with LogMergePolicy. Example: 1. Set useCompoundFile to false and leave NOCFSRatio unchanged (1.0 by default). 2. Start indexing; a new segment will not use a compound file even if it's small

Re: Retrieving values for a NumericDocValuesField [SEC=UNOFFICIAL]

2013-10-23 Thread Duke DAI
Hi Stephen, I have the same scenario as you. I verified with a simple pure-Lucene test, in the same way Mike mentioned, that reading NumericDocValues is 10x faster than retrieving a stored field. I hope you get a similar performance measurement. Best regards, Duke If not now, when? If not me, who?

Re: problem found with DiskDocValuesFormat

2013-10-22 Thread Duke DAI
the issue you are seeing into a small test case? Mike McCandless http://blog.mikemccandless.com On Mon, Oct 21, 2013 at 10:35 AM, Duke DAI duke.dai@gmail.com wrote: Hi Mike, My scenario, query thread from a ThreadPool will be used to execute query. So thread must have

Re: problem found with DiskDocValuesFormat

2013-10-21 Thread Duke DAI
cases. Do you have any idea about this? Information is enough? Thanks, Duke Best regards, Duke If not now, when? If not me, who? On Tue, Aug 13, 2013 at 4:54 PM, Duke DAI duke.dai@gmail.com wrote: Hi experts, I'm upgrading Lucene 4.4 and trying to use DocValues instead of store

Re: problem found with DiskDocValuesFormat

2013-10-21 Thread Duke DAI
://blog.mikemccandless.com On Mon, Oct 21, 2013 at 6:28 AM, Duke DAI duke.dai@gmail.com wrote: Hi guys, Seems I have the same problem with Lucene45DocValuesFormat, no problem with MemoryDocValuesFormat. The problem I encountered with Lucene4.4 is with DiskDocValuesFormat

Re: Question on wildcard queries, filters, scoring and TooManyClauses exception

2013-08-21 Thread Duke DAI
Something to share on this topic. QueryParser queryParser = new QueryParser(Version.LUCENE_30, "my_field", new StandardAnalyzer(Version.LUCENE_30)); Query prefixQuery = queryParser.parse("t*"); indexSearcher.search(prefixQuery, collector); The default MultiTermQuery rewriter (I forgot the name) will be used, if

Re: SPI class of type org.apache.lucene.codecs.Codec error

2013-08-21 Thread Duke DAI
/servo/pom2.xml. It reuses a project which uses Lucene, and the POM of this project is http://lesimisped.free.fr/servo/pom.xml. With a similar project which uses Lucene 2.9 we didn't experience such an issue. Hope that may help. Best regards, GD On 20/08/2013 16:10, Duke DAI wrote

Re: SPI class of type org.apache.lucene.codecs.Codec error

2013-08-20 Thread Duke DAI
The link http://maven.apache.org/plugins/maven-shade-plugin/examples/resource-transformers.html#ServicesResourceTransformer will help. Best regards, Duke If not now, when? If not me, who? On Mon, Aug 19, 2013 at 8:48 PM, Amal Kammoun kammoun.ama...@gmail.com wrote: Dear All, Please do you

problem found with DiskDocValuesFormat

2013-08-13 Thread Duke DAI
Hi experts, I'm upgrading to Lucene 4.4 and trying to use DocValues instead of stored fields for performance reasons. Because the index size is unknown (it depends on the customer), I will use DiskDocValuesFormat, especially for some binary fields. So I wrote my customized Codec: final Codec codec =

Re: problem found with DiskDocValuesFormat

2013-08-13 Thread Duke DAI
that's what you are seeing? So, you must fully re-index after any DiskDVFormat field after upgrading ... Only the default formats support index back compatibility between releases. Mike McCandless http://blog.mikemccandless.com On Tue, Aug 13, 2013 at 4:54 AM, Duke DAI duke.dai@gmail.com

Re: PayloadFunctions don't work the same since 4.1

2013-03-22 Thread Duke DAI
Most likely, the cause is what I described. I guess that when you convert the bytes to a number, you don't use payload.offset to locate the correct start of the bytes. Before 4.1, the payload started where you expected. But since 4.1, you must use the offset and length to get the correct bytes you

Re: PayloadFunctions don't work the same since 4.1

2013-03-21 Thread Duke DAI
I'm not sure whether your problem relates to the function or to getting the payload itself. But since 4.1, in DefaultSimilarity.scorePayload(int doc, int start, int end, BytesRef payload), you must use payload.offset and payload.length to get the bytes. (start and end won't give you the exact bytes you want.)
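The offset/length point above can be illustrated without Lucene on the classpath. The sketch below is only an illustration under stated assumptions: Slice is a hypothetical stand-in for a BytesRef-style object (a shared backing array plus offset and length), and the payload is assumed to hold a 4-byte big-endian float. It is not the poster's code or Lucene's implementation.

```java
import java.nio.ByteBuffer;

/** Hypothetical stand-in for a BytesRef-style slice: shared backing array plus offset and length. */
final class Slice {
    final byte[] bytes;
    final int offset;
    final int length;

    Slice(byte[] bytes, int offset, int length) {
        this.bytes = bytes;
        this.offset = offset;
        this.length = length;
    }
}

final class PayloadSliceDemo {
    /** Decodes a big-endian float from the slice, starting at offset and bounded by length. */
    static float decodeFloat(Slice payload) {
        if (payload.length < Float.BYTES) {
            throw new IllegalArgumentException("payload too short: " + payload.length);
        }
        // wrap(array, offset, length) positions the buffer at offset, so getFloat() reads the right bytes
        return ByteBuffer.wrap(payload.bytes, payload.offset, payload.length).getFloat();
    }

    /** The mistake described in the thread: assuming the payload starts at index 0 of the array. */
    static float decodeFloatIgnoringOffset(Slice payload) {
        return ByteBuffer.wrap(payload.bytes).getFloat();
    }

    /** Builds a shared 8-byte buffer: 4 unrelated bytes, then the float 2.5f at offset 4. */
    static Slice sampleSlice() {
        ByteBuffer buf = ByteBuffer.allocate(8);
        buf.putInt(42);       // unrelated data that happens to share the array
        buf.putFloat(2.5f);   // the actual payload value
        return new Slice(buf.array(), 4, Float.BYTES);
    }

    public static void main(String[] args) {
        Slice payload = sampleSlice();
        System.out.println("with offset:    " + decodeFloat(payload));
        System.out.println("without offset: " + decodeFloatIgnoringOffset(payload));
    }
}
```

Reading from index 0 decodes the unrelated leading bytes instead of the stored value, which is the kind of wrong number a payload function would then silently score with.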

Re: ArrayIndexOutOfBoundsException: -65536

2012-01-19 Thread Duke DAI
an absurdly enormous document...? Finally, it's possible this is a hardware issue; does it happen on other machines? Mike McCandless http://blog.mikemccandless.com On Wed, Jan 18, 2012 at 8:15 AM, Duke DAI duke.dai@gmail.com wrote: Dear Mike, Thank you very much and sorry for the late reply

Re: ArrayIndexOutOfBoundsException: -65536

2012-01-18 Thread Duke DAI
traceback of the exception? Mike McCandless http://blog.mikemccandless.com On Sun, Jan 15, 2012 at 7:21 PM, Duke DAI duke.dai@gmail.com wrote: Hi friends, Any one meet ArrayIndexOutOfBoundsException: -65536 described in https://issues.apache.org/jira/browse/LUCENE-1995 after it declared

ArrayIndexOutOfBoundsException: -65536

2012-01-15 Thread Duke DAI
Hi friends, Has anyone encountered the ArrayIndexOutOfBoundsException: -65536 described in https://issues.apache.org/jira/browse/LUCENE-1995 after it was declared fixed? My Lucene version is 3.0.3 and MaxRAMBufferSize is 3M. All other configurations seem to be normal. It's hard to describe the environment and