[jira] [Updated] (LUCENE-7091) Add doc values support to MemoryIndex
[ https://issues.apache.org/jira/browse/LUCENE-7091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Smiley updated LUCENE-7091:
---------------------------------
    Assignee: Martijn van Groningen  (was: David Smiley)

> Add doc values support to MemoryIndex
> -------------------------------------
>
>                 Key: LUCENE-7091
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7091
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Martijn van Groningen
>            Assignee: Martijn van Groningen
>             Fix For: 6.0
>
>         Attachments: LUCENE-7091.patch, LUCENE-7091.patch, LUCENE-7091.patch,
> LUCENE-7091.patch, LUCENE-7091.patch, LUCENE-7091.patch, LUCENE-7091.patch,
> LUCENE-7091.patch, LUCENE-7091.patch
>
>
> Sometimes queries executed via the MemoryIndex require certain things to be
> stored as doc values. Today this isn't possible because the memory index
> doesn't support this and these queries silently return no results.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-7091) Add doc values support to MemoryIndex
Martijn van Groningen updated LUCENE-7091:
------------------------------------------
    Fix Version/s: 6.0

I'll backport this change to the 6.0 and 6.x branches. Otherwise LUCENE-7093 can't be backported, which would be bad: simple queries on numeric fields would no longer work, since numerics have moved to points. Because of this I'll backport LUCENE-7087 to 6.0 too.
[jira] [Updated] (LUCENE-7091) Add doc values support to MemoryIndex
Martijn van Groningen updated LUCENE-7091:
------------------------------------------
    Attachment: LUCENE-7091.patch

Ah, that makes more sense. I misunderstood what you meant earlier. I've updated the patch. Thank you for this thorough review!
[jira] [Updated] (LUCENE-7091) Add doc values support to MemoryIndex
Martijn van Groningen updated LUCENE-7091:
------------------------------------------
    Attachment: LUCENE-7091.patch

Updated the patch.

bq. (still applies to the other addField): I think the javadocs sentence you added to addField meant to use "if" not "is".

Changed.

bq. At first I thought there might have been a bug for double-applying the boost since I see you're still passing "boost" as a constructor argument to Info. But now I see you only apply when numTokens > 0. I think it would be much clearer (and simpler) to remove boost from the constructor to Info, and simply apply it in storeTerms (no matter what numTokens is). It's hard to judge the testDocValuesDoNotAffectBoostPositionsOrOffset for this problem... it'd get encoded in the norm and I have no idea what norm it should be; your test asserts -127. Hmm. Perhaps simply leave a check of that nature to the test that asserts parity with the real index in RAMDirectory

Removed the boost constructor parameter and added a dedicated test for this in TestMemoryIndexAgainstRAMDir.

bq. in storeDocValues() SORTED_NUMERIC: you call ArrayUtil.grow with only the array. This results in the same O(N^2) we're trying to avoid! Pass in a second argument of the desired length.

Changed; the array size is now doubled when growth is necessary.
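The growth strategy discussed in this message can be illustrated without Lucene. The following is a minimal JDK-only sketch, not the code from the patch: the names GrowthSketch, append and countReallocations are hypothetical, and it uses Arrays.copyOf rather than Lucene's ArrayUtil. It only shows why doubling the array when it is full keeps the number of copies logarithmic in the number of appended values.

```java
import java.util.Arrays;

class GrowthSketch {

    /**
     * Stores value at index count, doubling the array first when it is full.
     * Returns the (possibly reallocated) array. Overflow for very large
     * arrays is ignored in this sketch.
     */
    public static long[] append(long[] values, int count, long value) {
        if (count == values.length) {
            // Ask for at least count + 1 slots, and double when possible, so that
            // n appends cause O(log n) copies instead of one copy per append.
            int newLength = Math.max(count + 1, values.length * 2);
            values = Arrays.copyOf(values, newLength);
        }
        values[count] = value;
        return values;
    }

    /** Counts how many times the array is reallocated while appending n values. */
    public static int countReallocations(int n) {
        long[] values = new long[1];
        int reallocations = 0;
        for (int i = 0; i < n; i++) {
            long[] grown = append(values, i, i);
            if (grown != values) {
                reallocations++;
            }
            values = grown;
        }
        return reallocations;
    }

    public static void main(String[] args) {
        // Capacity goes 1 -> 2 -> 4 -> ... -> 1024 while appending 1000 values.
        System.out.println(countReallocations(1000)); // prints 10
    }
}
```

Growing by exactly one slot per append would instead reallocate on almost every call, copying the whole array each time, which is the quadratic behaviour the review comment warns about.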
[jira] [Updated] (LUCENE-7091) Add doc values support to MemoryIndex
Martijn van Groningen updated LUCENE-7091:
------------------------------------------
    Attachment: LUCENE-7091.patch

Updated the patch to address the good points you raised.

bq. It's a shame that SORTED & BINARY use a BytesRefHash (adds overhead) and ultimately get sorted when, really, it's not necessary of course. The ByteBlockPool could be used directly to store it (see BytesRefArray for examples) with a little bit of code. This isn't a blocker but it would sure be nice.

Agreed, that would be nicer. I think we should do this in a follow-up issue.

bq. Add term text here too, and under same field names as DV ones at that.

I think this is covered by the TestMemoryIndexAgainstRAMDir#testDocValuesMemoryIndexVsNormalIndex() test, in which regular fields are randomly added.
[jira] [Updated] (LUCENE-7091) Add doc values support to MemoryIndex
Martijn van Groningen updated LUCENE-7091:
------------------------------------------
    Attachment: LUCENE-7091.patch

Small update:
* Moved the doc values prepare state to the Info class.
* Use a spare BytesRef in BinaryDocValuesProducer to fetch binary / sorted / sorted set DV.
[jira] [Updated] (LUCENE-7091) Add doc values support to MemoryIndex
Martijn van Groningen updated LUCENE-7091:
------------------------------------------
    Attachment: LUCENE-7091.patch

I've updated the patch based on David's comments. I did not split `DocValuesHolder` into one class per DV type; instead I split it into one class for the binary-based DV and one class for the numeric-based DV, so that the actual storage (long[] and BytesRefHash) is shared among the different DV impls. For all DV types except sorted set DV, the patch now returns a pre-set DV class.
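The idea of sharing one storage structure between several doc values views can be sketched roughly as follows. This is a JDK-only illustration under stated assumptions, not the patch itself: NumericHolder, singleView and sortedView are hypothetical names, and only the numeric (long[]) side is shown. One holder backs both a single-valued view and a multi-valued sorted view, so no per-type storage class is needed.

```java
import java.util.Arrays;

class SharedStorageSketch {

    /** Hypothetical holder: one long[] backs both the single-valued and the multi-valued view. */
    static final class NumericHolder {
        private long[] values = new long[4];
        private int count;

        void add(long value) {
            if (count == values.length) {
                values = Arrays.copyOf(values, values.length * 2);
            }
            values[count++] = value;
        }

        /** Single-valued view: valid only when exactly one value was added. */
        long single() {
            if (count != 1) {
                throw new IllegalStateException("expected exactly one value, got " + count);
            }
            return values[0];
        }

        /** Multi-valued view: a sorted copy of the added values, duplicates kept. */
        long[] sorted() {
            long[] copy = Arrays.copyOf(values, count);
            Arrays.sort(copy);
            return copy;
        }
    }

    public static long singleView(long value) {
        NumericHolder holder = new NumericHolder();
        holder.add(value);
        return holder.single();
    }

    public static long[] sortedView(long... added) {
        NumericHolder holder = new NumericHolder();
        for (long v : added) {
            holder.add(v);
        }
        return holder.sorted();
    }

    public static void main(String[] args) {
        System.out.println(singleView(42));                          // prints 42
        System.out.println(Arrays.toString(sortedView(5, 1, 5, 3))); // prints [1, 3, 5, 5]
    }
}
```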
[jira] [Updated] (LUCENE-7091) Add doc values support to MemoryIndex
Martijn van Groningen updated LUCENE-7091:
------------------------------------------
    Attachment: LUCENE-7091.patch

Updated the patch:
* Made sure the sorted set doc values don't contain duplicate values.
* Added an extra test to verify that unsupported usage fails.
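The "no duplicate values" property for sorted set doc values can be pictured with a small JDK-only sketch. It is not taken from the patch: SortedSetSketch and sortedDistinct are hypothetical names, and plain Java strings stand in for Lucene's BytesRef values (natural String ordering is used here, which is an assumption of this sketch and not Lucene's byte-wise ordering).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

class SortedSetSketch {

    /** Returns each added value once, in sorted order. */
    public static List<String> sortedDistinct(List<String> added) {
        // A TreeSet both removes duplicates and keeps the values ordered.
        return new ArrayList<>(new TreeSet<>(added));
    }

    public static void main(String[] args) {
        List<String> added = Arrays.asList("b", "a", "b", "c", "a");
        System.out.println(sortedDistinct(added)); // prints [a, b, c]
    }
}
```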
[jira] [Updated] (LUCENE-7091) Add doc values support to MemoryIndex
Martijn van Groningen updated LUCENE-7091:
------------------------------------------
    Description: Sometimes queries executed via the MemoryIndex require certain things to be stored as doc values. Today this isn't possible because the memory index doesn't support this and these queries silently return no results.
[jira] [Updated] (LUCENE-7091) Add doc values support to MemoryIndex
Martijn van Groningen updated LUCENE-7091:
------------------------------------------
    Attachment: LUCENE-7091.patch

Attached a patch, with a test, that adds simple doc values support. Nothing fancy here, but it should just work.