Re: how do i improve Indexing and Searching performance of 2 billion documents over SolrCloud

2017-02-14 Thread Duke DAI
SSD or in-memory index Best regards, Duke If not now, when? If not me, who? On Wed, Feb 15, 2017 at 12:32 AM, Adrien Grand wrote: > This list is for users of the Lucene Java API, maybe try solr-user instead? > > On Mon, Feb 13, 2017 at 21:24, yeshwanth kumar wrote: > > > Hi, we have 4 sol

Re: Hardcoded checksum mechanism in BlockTreeTermsReader

2016-12-25 Thread Duke DAI
://www.thetaphi.de > eMail: u...@thetaphi.de > > > -Original Message- > > From: Michael McCandless [mailto:luc...@mikemccandless.com] > > Sent: Tuesday, December 6, 2016 12:30 PM > > To: Duke DAI > > Cc: Lucene Users > > Subject: Re: Hardcoded chec

Re: Hardcoded checksum mechanism in BlockTreeTermsReader

2016-12-06 Thread Duke DAI
ss > > http://blog.mikemccandless.com > > > On Tue, Dec 6, 2016 at 5:25 AM, Duke DAI wrote: > > Hi all, > > > > I'm customizing Lucene Directory, which extends o.a.l.store.Directory > based > > on database files. I do not need checksum again on

Hardcoded checksum mechanism in BlockTreeTermsReader

2016-12-06 Thread Duke DAI
Hi all, I'm customizing a Lucene Directory, which extends o.a.l.store.Directory and is based on database files. I do not need a checksum again on IndexInput and IndexOutput. But in the BlockTreeTermsReader constructor, the following code opens a hard-coded BufferedChecksumIndexInput to checksum the raw IndexInput. I

Re: Having some trouble running tests with custom codec

2015-10-11 Thread Duke DAI
How about adding a line feed to the single line? My impression is that a line feed is required. Best regards, Duke If not now, when? If not me, who? On Fri, Oct 9, 2015 at 10:07 PM, Sigbjørn Lund Olsen < sigbjorn.lund.ol...@gmail.com> wrote: > As part of my master's thesis I am planning

Re: Standard highlighter returns whole document as a fragment

2015-08-11 Thread Duke DAI
It seems we are encountering the same problem. (thread: bug of highlighter/SimpleSpanFragmenter, returned longer fragment than expected?) When debugging, is your fragmenter a SimpleSpanFragmenter? Does isNewFragment() return true because of the logic below? boolean isNewFrag = offsetAtt.endOffset() >= (fragmentSize * c

Re: bug of highlighter/SimpleSpanFragmenter, returned longer fragment than expected?

2015-08-11 Thread Duke DAI
Greetings! Does anybody have input on this? Best regards, Duke If not now, when? If not me, who? On Fri, Aug 7, 2015 at 10:58 AM, Duke DAI wrote: > Hi experts, > > I'm trying to reproduce a bug from Lucene side, and found something. > > In latest codeline, 5.2.1, I

Re: Re: memory cost in forceMerge(1)

2015-08-11 Thread Duke DAI
From my experience, you must be hitting some system issue. You should check disk performance first, e.g. disk queue length on Windows. Or you can enable verbose GC logging to see the GC activity in detail. I designed an auto-upgrade mechanism in the application by calling forceMerge(1), to eradicate hybrid index for

bug of highlighter/SimpleSpanFragmenter, returned longer fragment than expected?

2015-08-06 Thread Duke DAI
Hi experts, I'm trying to reproduce a bug from Lucene side, and found something. In latest codeline, 5.2.1, I modified test case HighlighterTest.testSimpleQueryTermScorerHighlighter a little to below, mainly to use SimpleSpanFragmenter to get only one fragment with length 64. public void testS

Inconsistency of LogMergePolicy and IWC.useCompoundFile

2014-06-19 Thread Duke DAI
Hi Simon, guys, I see that in LUCENE-5038 the useCompoundFile stuff has been refactored. Now I think there are some problems with LogMergePolicy. Example: 1. setting useCompoundFile to false and not changing NOCFSRatio (1.0 by default). 2. starting index, new segment will not use compound file even if it's small

Re: Retrieving values for a NumericDocValuesField [SEC=UNOFFICIAL]

2013-10-23 Thread Duke DAI
Hi Stephen, I have the same scenario as you. I verified with a simple pure-Lucene test, the same way Mike mentioned: performance with NumericDocValue is 10x faster than retrieving a stored field. I hope you get a similar performance measurement. Best regards, Duke If not now, when? If not me, who?

Re: problem found with DiskDocValuesFormat

2013-10-22 Thread Duke DAI
ss queries. > > Maybe you can boil down the issue you are seeing into a small test case? > > Mike McCandless > > http://blog.mikemccandless.com > > > On Mon, Oct 21, 2013 at 10:35 AM, Duke DAI wrote: > > Hi Mike, > > > > My scenario, query thread from a

Re: problem found with DiskDocValuesFormat

2013-10-21 Thread Duke DAI
ng for the same doc values. > > Mike McCandless > > http://blog.mikemccandless.com > > > On Mon, Oct 21, 2013 at 6:28 AM, Duke DAI wrote: > > Hi guys, > > > > Seems I have the same problem with Lucene45DocValuesFormat, no problem > with > > MemoryDocValues

Re: problem found with DiskDocValuesFormat

2013-10-21 Thread Duke DAI
cases. Do you have any idea about this? Information is enough? Thanks, Duke Best regards, Duke If not now, when? If not me, who? On Tue, Aug 13, 2013 at 4:54 PM, Duke DAI wrote: > Hi experts, > > I'm upgrading Lucene 4.4 and trying to use DocValues instead of store > fiel

Re: SPI class of type org.apache.lucene.codecs.Codec error

2013-08-21 Thread Duke DAI
POM of the project is here > http://lesimisped.free.fr/servo/pom2.xml. It reuses a project which uses > Lucene, and the POM of this project is > http://lesimisped.free.fr/servo/pom.xml. > With a similar project which uses Lucene 2.9 we didn't experience such an > issue. > > Hope that may

Re: Question on wildcard queries, filters, scoring and TooManyClauses exception

2013-08-21 Thread Duke DAI
Sharing some notes on this topic. QueryParser queryParser = new QueryParser(Version.LUCENE_30, "my_field", new StandardAnalyzer(Version.LUCENE_30)); Query prefixQuery = queryParser.parse("t*"); indexSearcher.search(prefixQuery, collector); MultiTermQuery.default(forgot the name) rewriter will be used, if

Re: SPI class of type org.apache.lucene.codecs.Codec error

2013-08-20 Thread Duke DAI
The link http://maven.apache.org/plugins/maven-shade-plugin/examples/resource-transformers.html#ServicesResourceTransformer will help. Best regards, Duke If not now, when? If not me, who? On Mon, Aug 19, 2013 at 8:48 PM, Amal Kammoun wrote: > Dear All, > > Please do you have any advice regardin
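The linked page covers the maven-shade-plugin's ServicesResourceTransformer. Lucene looks up codecs through META-INF/services files, and when several jars are shaded into one, those files can overwrite each other unless they are merged. A minimal sketch of the relevant POM fragment, assuming an otherwise standard shade setup (the `package` phase binding is a common choice, not taken from the original poster's POM):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <!-- Assumption: shading runs during the package phase -->
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <transformers>
          <!-- Merges META-INF/services entries from all shaded jars
               instead of letting one jar's file replace another's -->
          <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
        </transformers>
      </configuration>
    </execution>
  </executions>
</plugin>
```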

Re: problem found with DiskDocValuesFormat

2013-08-13 Thread Duke DAI
eases; maybe that's what you are seeing? So, you must fully > re-index after any DiskDVFormat field after upgrading ... > > Only the default formats support index back compatibility between releases. > > > Mike McCandless > > http://blog.mikemccandless.com > > >

problem found with DiskDocValuesFormat

2013-08-13 Thread Duke DAI
Hi experts, I'm upgrading to Lucene 4.4 and trying to use DocValues instead of stored fields for performance reasons. But because the index size is unknown (it depends on the customer), I will use DiskDocValuesFormat, especially for some binary fields. Then I wrote my customized Codec: final Codec codec = n

Re: PayloadFunctions don't work the same since 4.1

2013-03-22 Thread Duke DAI
Most likely, the cause is what I said. I guess when you try to convert bytes to number you didn't use the payload.offset to locate the right start of bytes. Before 4.1, the start of payload is the expected value. But since 4.1, you must use the offset and length to get the correct bytes you wanted.
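To illustrate the offset point, here is a minimal sketch in plain Java. It does not use Lucene: `BytesSlice` is a hypothetical stand-in for the three public fields of `BytesRef` (`bytes`, `offset`, `length`), and the payload is assumed to hold one big-endian float. When the backing array is shared, reading from index 0 returns some other value, while reading from `offset` returns the intended one.

```java
import java.nio.ByteBuffer;

final class PayloadDecoding {

    /** Hypothetical stand-in for Lucene's BytesRef: a slice of a possibly shared array. */
    static final class BytesSlice {
        final byte[] bytes;
        final int offset;
        final int length;

        BytesSlice(byte[] bytes, int offset, int length) {
            this.bytes = bytes;
            this.offset = offset;
            this.length = length;
        }
    }

    /** Buggy variant: assumes the payload starts at index 0 of the backing array. */
    static float decodeIgnoringOffset(BytesSlice payload) {
        return ByteBuffer.wrap(payload.bytes).getFloat();
    }

    /** Correct variant: reads only the slice [offset, offset + length). */
    static float decodeWithOffset(BytesSlice payload) {
        if (payload.length < Float.BYTES) {
            throw new IllegalArgumentException("payload too short for a float");
        }
        return ByteBuffer.wrap(payload.bytes, payload.offset, payload.length).getFloat();
    }

    public static void main(String[] args) {
        // Two floats in one shared array; the payload of interest is the second one.
        byte[] shared = ByteBuffer.allocate(8).putFloat(1.5f).putFloat(7.25f).array();
        BytesSlice payload = new BytesSlice(shared, 4, 4);

        System.out.println("ignoring offset: " + decodeIgnoringOffset(payload));
        System.out.println("using offset:    " + decodeWithOffset(payload));
    }
}
```

In real Lucene code the same idea applies to `payload.bytes`, `payload.offset` and `payload.length` inside `scorePayload`; whichever decoding helper is used should be given the offset rather than assuming the payload starts at 0.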

Re: PayloadFunctions don't work the same since 4.1

2013-03-21 Thread Duke DAI
I'm not sure whether your problem relates to the function or to getting the payload itself. But since 4.1, in DefaultSimilarity.scorePayload(int doc, int start, int end, BytesRef payload), you must use payload.offset and payload.length to get the bytes (start and end won't give you the exact bytes you want). Ho

Re: ArrayIndexOutOfBoundsException: -65536

2012-01-19 Thread Duke DAI
t possible you are indexing an absurdly enormous document...? > > Finally, it's possible this is a hardware issue; does it happen on > other machines? > > Mike McCandless > > http://blog.mikemccandless.com > > On Wed, Jan 18, 2012 at 8:15 AM, Duke DAI wrote: >

Re: ArrayIndexOutOfBoundsException: -65536

2012-01-18 Thread Duke DAI
ou have a full traceback of the exception? > > Mike McCandless > > http://blog.mikemccandless.com > > On Sun, Jan 15, 2012 at 7:21 PM, Duke DAI wrote: > > Hi friends, > > Any one meet ArrayIndexOutOfBoundsException: -65536 described in > > https://issues.apache.org

ArrayIndexOutOfBoundsException: -65536

2012-01-15 Thread Duke DAI
Hi friends, Has anyone hit the ArrayIndexOutOfBoundsException: -65536 described in https://issues.apache.org/jira/browse/LUCENE-1995 after it was declared fixed? My Lucene version is 3.0.3 and MaxRAMBufferSize is 3M. All other configurations seem to be normal. It's hard to describe the environment and