[ 
https://issues.apache.org/jira/browse/LUCENE-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13500177#comment-13500177
 ] 

Simon Willnauer commented on LUCENE-4547:
-----------------------------------------

hey folks,

I looked at the branch and I would want to suggest we move a little slower 
here. we are doing too many things at once. Like I really don't like the trend 
to make FieldCache the single source for caching. FieldCache has many problems 
in my opininon like is uses this DEFAULT singleton, has a single way of how 
things are cached per reader some users might want to use different access to 
DV like in ES we don't use FieldCache at all for many reasons.I think we are 
going into the right direction here but exposing everything through FC is a 
no-go IMO. I do see why we should merge the interfaces and expose un-inverted 
fields via the new DV interface - nice! but hiding it behind FC is no good. 
I also don't like the way how "in-ram" DV are exposed. I don't think we should 
have newRAMInstance() on the interface. Lets keep the interface clean and don't 
mix in how it is represented. I'd rather vote for dedicated producers or 
SimpleDocValuesProducer#getNumericDocValues(boolean inMemory). Then we can 
still do caching on top. The producer should really be simple and shouldn't do 
caching. We can also separate the default in-memory impls in a simple helper 
class with methods like static NumericDocValues load(NumericDocValues 
directDocValues)
                
> DocValues field broken on large indexes
> ---------------------------------------
>
>                 Key: LUCENE-4547
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4547
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Robert Muir
>            Priority: Blocker
>             Fix For: 4.1
>
>         Attachments: test.patch
>
>
> I tried to write a test to sanity check LUCENE-4536 (first running against 
> svn revision 1406416, before the change).
> But i found docvalues is already broken here for large indexes that have a 
> PackedLongDocValues field:
> {code}
> final int numDocs = 500000000;
> for (int i = 0; i < numDocs; ++i) {
>   if (i == 0) {
>     field.setLongValue(0L); // force > 32bit deltas
>   } else {
>     field.setLongValue(1<<33L); 
>   }
>   w.addDocument(doc);
> }
> w.forceMerge(1);
> w.close();
> dir.close(); // checkindex
> {code}
> {noformat}
> [junit4:junit4]   2> WARNING: Uncaught exception in thread: Thread[Lucene 
> Merge Thread #0,6,TGRP-Test2GBDocValues]
> [junit4:junit4]   2> org.apache.lucene.index.MergePolicy$MergeException: 
> java.lang.ArrayIndexOutOfBoundsException: -65536
> [junit4:junit4]   2>  at 
> __randomizedtesting.SeedInfo.seed([5DC54DB14FA5979]:0)
> [junit4:junit4]   2>  at 
> org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:535)
> [junit4:junit4]   2>  at 
> org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:508)
> [junit4:junit4]   2> Caused by: java.lang.ArrayIndexOutOfBoundsException: 
> -65536
> [junit4:junit4]   2>  at 
> org.apache.lucene.util.ByteBlockPool.deref(ByteBlockPool.java:305)
> [junit4:junit4]   2>  at 
> org.apache.lucene.codecs.lucene40.values.FixedStraightBytesImpl$FixedBytesWriterBase.set(FixedStraightBytesImpl.java:115)
> [junit4:junit4]   2>  at 
> org.apache.lucene.codecs.lucene40.values.PackedIntValues$PackedIntsWriter.writePackedInts(PackedIntValues.java:109)
> [junit4:junit4]   2>  at 
> org.apache.lucene.codecs.lucene40.values.PackedIntValues$PackedIntsWriter.finish(PackedIntValues.java:80)
> [junit4:junit4]   2>  at 
> org.apache.lucene.codecs.DocValuesConsumer.merge(DocValuesConsumer.java:130)
> [junit4:junit4]   2>  at 
> org.apache.lucene.codecs.PerDocConsumer.merge(PerDocConsumer.java:65)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to