[ https://issues.apache.org/jira/browse/LUCENE-2662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12914627#action_12914627 ]
Simon Willnauer commented on LUCENE-2662:
-----------------------------------------

bq. In the class jdocs, I think we should state that this is basically a Map<BytesRef,int>?

Yeah, that simplifies it - will do.

bq. Maybe we also move ByteBlockPool --> oal.util?

Yeah, I did that already - that totally makes sense.

bq. Maybe move out the ByteBlockAllocator to its own class (in util)? RecyclingByteBlockAllocator?

+1, yeah, I like that - I also think we should allow passing the blockpool to the BytesHash instead of the allocator. From what I can tell now, I think this is necessary for the refactoring anyway, since we share pools with secondary TermsHash instances in the termvector case.

{quote}
Maybe rename ords -> keys? And hash -> values? (The key isn't really an "ord" (I think?) because it increases by more than 1 each time... it's more like an address since it references an address in the byte-pool space).
{quote}

Yeah, that depends on how you see it - the array index really is the ord though. But I like those names. I will change.

{quote}
We should advertise the limits in the jdocs - limited to <= 2GB total byte storage, each key must be <= BLOCK SIZE-2 in length.
{quote}

I think I have done the latter already, but I will add the other too.

{quote}
Can we have sortedEntries() not allocate a new iterator object? Ie, just return the sorted bytesStart int[]? (This is what's done today, and, for term vectors on small docs, this method is pretty hot). And the javadocs for this should be stronger - it's not that the behaviour is undefined after, it's that you must .clear() after you're done consuming the sorted entries.
{quote}

Ah, I see - good point. I think what you refer to is public int[] sort(Comparator<BytesRef> comp) - the iterator one is just a more convenient one. I will change it though.

Thanks Mike!
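For readers following along, here is a minimal sketch of the semantics discussed above: a hash that behaves like a Map<bytes,int>, keeps all key bytes in one pool, and hands back a plain int[] from sort() rather than an iterator. This is not the code in the LUCENE-2662 patch. The class name TinyBytesHash, the -(id + 1) return value for duplicates, and the single growable byte[] standing in for the block pool are all assumptions made for illustration. It does show why a clear() is required after sorting (sort() reuses the hash table as the result array) and why int addresses cap total storage at 2GB.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical sketch, NOT Lucene's BytesHash: maps byte keys to dense int ids. */
final class TinyBytesHash {
    private byte[] pool = new byte[64];   // all key bytes back to back; int addresses => 2GB cap
    private int poolUsed;
    private int[] starts = new int[8];    // id -> start address in pool
    private int[] lengths = new int[8];   // id -> key length
    private int[] table = new int[16];    // hash slot -> id, or -1 if empty
    private int count;
    private boolean sorted;               // true after sort(), until clear()

    TinyBytesHash() { Arrays.fill(table, -1); }

    int size() { return count; }

    byte[] get(int id) { return Arrays.copyOfRange(pool, starts[id], starts[id] + lengths[id]); }

    /** Returns the new id, or -(id + 1) if the key is already present (assumed convention). */
    int add(byte[] key) {
        if (sorted) throw new IllegalStateException("call clear() after sort()");
        int mask = table.length - 1;
        int slot = hash(key, 0, key.length) & mask;
        while (table[slot] != -1) {                       // linear probing
            if (equalsAt(table[slot], key)) return -(table[slot] + 1);
            slot = (slot + 1) & mask;
        }
        if (poolUsed + key.length > pool.length) {
            pool = Arrays.copyOf(pool, Math.max(pool.length * 2, poolUsed + key.length));
        }
        if (count == starts.length) {
            starts = Arrays.copyOf(starts, count * 2);
            lengths = Arrays.copyOf(lengths, count * 2);
        }
        System.arraycopy(key, 0, pool, poolUsed, key.length);
        starts[count] = poolUsed;
        lengths[count] = key.length;
        poolUsed += key.length;
        table[slot] = count;
        int id = count++;
        if (count * 2 > table.length) rehash();           // keep load factor <= 0.5
        return id;
    }

    /** Compacts ids to the front of the table and sorts them in place by unsigned byte
     *  order. Only the first size() entries are valid. The hash is unusable until clear(). */
    int[] sort() {
        int n = 0;
        for (int i = 0; i < table.length; i++) {
            if (table[i] != -1) table[n++] = table[i];
        }
        for (int i = 1; i < count; i++) {                 // insertion sort, no allocation
            int id = table[i];
            int j = i - 1;
            while (j >= 0 && compare(table[j], id) > 0) { table[j + 1] = table[j]; j--; }
            table[j + 1] = id;
        }
        sorted = true;
        return table;
    }

    void clear() {
        poolUsed = 0;
        count = 0;
        sorted = false;
        Arrays.fill(table, -1);
    }

    private void rehash() {
        int[] bigger = new int[table.length * 2];
        Arrays.fill(bigger, -1);
        int mask = bigger.length - 1;
        for (int id = 0; id < count; id++) {
            int slot = hash(pool, starts[id], lengths[id]) & mask;
            while (bigger[slot] != -1) slot = (slot + 1) & mask;
            bigger[slot] = id;
        }
        table = bigger;
    }

    private static int hash(byte[] b, int off, int len) {
        int h = 0;
        for (int i = 0; i < len; i++) h = 31 * h + b[off + i];
        return h ^ (h >>> 16);
    }

    private boolean equalsAt(int id, byte[] key) {
        if (lengths[id] != key.length) return false;
        for (int i = 0; i < key.length; i++) {
            if (pool[starts[id] + i] != key[i]) return false;
        }
        return true;
    }

    private int compare(int a, int b) {
        int n = Math.min(lengths[a], lengths[b]);
        for (int i = 0; i < n; i++) {
            int d = (pool[starts[a] + i] & 0xff) - (pool[starts[b] + i] & 0xff);
            if (d != 0) return d;
        }
        return lengths[a] - lengths[b];
    }

    public static void main(String[] args) {
        TinyBytesHash h = new TinyBytesHash();
        for (String term : new String[] {"pear", "apple", "pear", "fig"}) {
            System.out.println(term + " -> " + h.add(term.getBytes(StandardCharsets.UTF_8)));
        }
        int[] ids = h.sort();
        for (int i = 0; i < h.size(); i++) {
            System.out.println(new String(h.get(ids[i]), StandardCharsets.UTF_8));
        }
        h.clear();                                        // required before the next add()
    }
}
```

In this sketch the id returned by add() is the array index (the "ord" in the naming discussion), while starts[id] is the address into the byte pool, which is the value that grows by more than one per key.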
> BytesHash
> ---------
>
>                 Key: LUCENE-2662
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2662
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>    Affects Versions: Realtime Branch, 4.0
>            Reporter: Jason Rutherglen
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: Realtime Branch, 4.0
>
>         Attachments: LUCENE-2662.patch, LUCENE-2662.patch
>
>
> This issue will have the BytesHash separated out from LUCENE-2186