[jira] [Updated] (LUCENE-7091) Add doc values support to MemoryIndex

2016-03-15 Thread David Smiley (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated LUCENE-7091:
-
Assignee: Martijn van Groningen  (was: David Smiley)

> Add doc values support to MemoryIndex
> -
>
> Key: LUCENE-7091
> URL: https://issues.apache.org/jira/browse/LUCENE-7091
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Martijn van Groningen
>Assignee: Martijn van Groningen
> Fix For: 6.0
>
> Attachments: LUCENE-7091.patch, LUCENE-7091.patch, LUCENE-7091.patch, 
> LUCENE-7091.patch, LUCENE-7091.patch, LUCENE-7091.patch, LUCENE-7091.patch, 
> LUCENE-7091.patch, LUCENE-7091.patch
>
>
> Sometimes queries executed via the MemoryIndex require certain things to be 
> stored as doc values. Today this isn't possible because the memory index 
> doesn't support this and these queries silently return no results.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-7091) Add doc values support to MemoryIndex

2016-03-15 Thread Martijn van Groningen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen updated LUCENE-7091:
--
Fix Version/s: 6.0

I'll backport this change to 6.0 and 6.x branches. Otherwise LUCENE-7093 can't 
be backported and that would be bad as simple queries on numeric fields 
wouldn't work anymore, since numerics have moved to use points instead.

Because of this I'll backport LUCENE-7087 to 6.0 too.

> Add doc values support to MemoryIndex
> -
>
> Key: LUCENE-7091
> URL: https://issues.apache.org/jira/browse/LUCENE-7091
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Martijn van Groningen
>Assignee: David Smiley
> Fix For: 6.0
>
> Attachments: LUCENE-7091.patch, LUCENE-7091.patch, LUCENE-7091.patch, 
> LUCENE-7091.patch, LUCENE-7091.patch, LUCENE-7091.patch, LUCENE-7091.patch, 
> LUCENE-7091.patch, LUCENE-7091.patch
>
>
> Sometimes queries executed via the MemoryIndex require certain things to be 
> stored as doc values. Today this isn't possible because the memory index 
> doesn't support this and these queries silently return no results.






[jira] [Updated] (LUCENE-7091) Add doc values support to MemoryIndex

2016-03-14 Thread Martijn van Groningen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen updated LUCENE-7091:
--
Attachment: LUCENE-7091.patch

Ah, that makes more sense. I misunderstood what you meant earlier. I've updated 
the patch. Thank you for this thorough review! 

> Add doc values support to MemoryIndex
> -
>
> Key: LUCENE-7091
> URL: https://issues.apache.org/jira/browse/LUCENE-7091
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Martijn van Groningen
>Assignee: David Smiley
> Attachments: LUCENE-7091.patch, LUCENE-7091.patch, LUCENE-7091.patch, 
> LUCENE-7091.patch, LUCENE-7091.patch, LUCENE-7091.patch, LUCENE-7091.patch, 
> LUCENE-7091.patch, LUCENE-7091.patch
>
>
> Sometimes queries executed via the MemoryIndex require certain things to be 
> stored as doc values. Today this isn't possible because the memory index 
> doesn't support this and these queries silently return no results.






[jira] [Updated] (LUCENE-7091) Add doc values support to MemoryIndex

2016-03-14 Thread Martijn van Groningen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen updated LUCENE-7091:
--
Attachment: LUCENE-7091.patch

Updated the patch.

bq. (still applies to the other addField): I think the javadocs sentence you 
added to addField meant to use "if" not "is".

Changed. 

bq. At first I thought there might have been a bug for double-applying the 
boost since I see you're still passing "boost" as a constructor argument to 
Info. But now I see you only apply when numTokens > 0. I think it would be much 
clearer (and simpler) to remove boost from the constructor to Info, and simply 
apply it in storeTerms (no matter what numTokens is). It's hard to judge the 
testDocValuesDoNotAffectBoostPositionsOrOffset for this problem... it'd get 
encoded in the norm and I have no idea what norm it should be; your test 
asserts -127. Hmm. Perhaps simply leave a check of that nature to the test 
that asserts parity with the real index in RAMDirectory.

Removed the boost constructor parameter and added a dedicated test for this in 
TestMemoryIndexAgainstRAMDir.

bq. in storeDocValues() SORTED_NUMERIC: you call ArrayUtil.grow with only the 
array. This results in the same O(N^2) we're trying to avoid! Pass in a second 
argument of the desired length.

Changed; the array size is now doubled when growing is necessary.
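
To illustrate the growth concern raised in the review comment above, here is a minimal sketch that does not use Lucene at all. The class and method names (GrowthSketch, copiesWhenGrowingByOne, copiesWhenDoubling) are hypothetical and are not part of ArrayUtil or the patch. It only counts how many elements get copied when a long[] is grown one slot at a time versus doubled on demand.

```java
import java.util.Arrays;

// Illustrative only: compares two array growth strategies by counting element copies.
final class GrowthSketch {

    // Grows the array by exactly one slot whenever it is full.
    // Returns the total number of elements copied while appending n values.
    static long copiesWhenGrowingByOne(int n) {
        long[] values = new long[0];
        long copied = 0;
        for (int i = 0; i < n; i++) {
            if (i == values.length) {
                copied += values.length;                      // every existing element is copied
                values = Arrays.copyOf(values, values.length + 1);
            }
            values[i] = i;
        }
        return copied;
    }

    // Doubles the array whenever it is full (starting from one slot).
    // Returns the total number of elements copied while appending n values.
    static long copiesWhenDoubling(int n) {
        long[] values = new long[1];
        long copied = 0;
        for (int i = 0; i < n; i++) {
            if (i == values.length) {
                copied += values.length;
                values = Arrays.copyOf(values, values.length * 2);
            }
            values[i] = i;
        }
        return copied;
    }

    public static void main(String[] args) {
        int n = 10_000;
        System.out.println("grow by one: " + copiesWhenGrowingByOne(n) + " element copies");
        System.out.println("doubling:    " + copiesWhenDoubling(n) + " element copies");
    }
}
```

Growing by one copies n(n-1)/2 elements in total, which is the quadratic behaviour the review wants to avoid, while doubling keeps the total number of copies below 2n.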

> Add doc values support to MemoryIndex
> -
>
> Key: LUCENE-7091
> URL: https://issues.apache.org/jira/browse/LUCENE-7091
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Martijn van Groningen
>Assignee: David Smiley
> Attachments: LUCENE-7091.patch, LUCENE-7091.patch, LUCENE-7091.patch, 
> LUCENE-7091.patch, LUCENE-7091.patch, LUCENE-7091.patch, LUCENE-7091.patch, 
> LUCENE-7091.patch
>
>
> Sometimes queries executed via the MemoryIndex require certain things to be 
> stored as doc values. Today this isn't possible because the memory index 
> doesn't support this and these queries silently return no results.






[jira] [Updated] (LUCENE-7091) Add doc values support to MemoryIndex

2016-03-14 Thread Martijn van Groningen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen updated LUCENE-7091:
--
Attachment: LUCENE-7091.patch

Updated the patch to address the good points that you've raised.

bq. It's a shame that SORTED & BINARY use a BytesRefHash (adds overhead) and 
ultimately get sorted when, really, it's not necessary of course. The 
ByteBlockPool could be used directly to store it (see BytesRefArray for 
examples) with a little bit of code. This isn't a blocker but it would sure be 
nice.

Agreed, that would be nicer. I think we should do this in a follow up issue.

bq. Add term text here too, and under same field names as DV ones at that.

I think this is covered by the 
TestMemoryIndexAgainstRAMDir#testDocValuesMemoryIndexVsNormalIndex() test, in 
which regular fields are randomly added.

> Add doc values support to MemoryIndex
> -
>
> Key: LUCENE-7091
> URL: https://issues.apache.org/jira/browse/LUCENE-7091
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Martijn van Groningen
> Attachments: LUCENE-7091.patch, LUCENE-7091.patch, LUCENE-7091.patch, 
> LUCENE-7091.patch, LUCENE-7091.patch, LUCENE-7091.patch, LUCENE-7091.patch
>
>
> Sometimes queries executed via the MemoryIndex require certain things to be 
> stored as doc values. Today this isn't possible because the memory index 
> doesn't support this and these queries silently return no results.






[jira] [Updated] (LUCENE-7091) Add doc values support to MemoryIndex

2016-03-13 Thread Martijn van Groningen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen updated LUCENE-7091:
--
Attachment: LUCENE-7091.patch

Small update:
* Moved the doc values prepare state to Info class
* Use spare BytesRef in BinaryDocValuesProducer to fetch binary / sorted / 
sorted set DV.

> Add doc values support to MemoryIndex
> -
>
> Key: LUCENE-7091
> URL: https://issues.apache.org/jira/browse/LUCENE-7091
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Martijn van Groningen
> Attachments: LUCENE-7091.patch, LUCENE-7091.patch, LUCENE-7091.patch, 
> LUCENE-7091.patch, LUCENE-7091.patch, LUCENE-7091.patch
>
>
> Sometimes queries executed via the MemoryIndex require certain things to be 
> stored as doc values. Today this isn't possible because the memory index 
> doesn't support this and these queries silently return no results.






[jira] [Updated] (LUCENE-7091) Add doc values support to MemoryIndex

2016-03-11 Thread Martijn van Groningen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen updated LUCENE-7091:
--
Attachment: LUCENE-7091.patch

I've updated the patch based on David's comments.

I did not split `DocValuesHolder` into a class per DV type. Instead, I've 
split it into one class for the binary-based DV and one class for the 
numeric-based DV, so that the actual storage (long[] and BytesRefHash) is shared 
amongst the different DV impls. For all DV types except sorted set DV, the patch 
now returns a preset DV class.

> Add doc values support to MemoryIndex
> -
>
> Key: LUCENE-7091
> URL: https://issues.apache.org/jira/browse/LUCENE-7091
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Martijn van Groningen
> Attachments: LUCENE-7091.patch, LUCENE-7091.patch, LUCENE-7091.patch
>
>
> Sometimes queries executed via the MemoryIndex require certain things to be 
> stored as doc values. Today this isn't possible because the memory index 
> doesn't support this and these queries silently return no results.






[jira] [Updated] (LUCENE-7091) Add doc values support to MemoryIndex

2016-03-10 Thread Martijn van Groningen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen updated LUCENE-7091:
--
Attachment: LUCENE-7091.patch

Updated the patch:
* Made sure the sorted set doc values don't contain duplicate values.
* Added an extra test to verify that unsupported usage fails.
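
As a rough illustration of the first point, sorted set doc values hold each distinct value once, in sorted order, with the position in that order serving as the ordinal. The sketch below is not the patch's code: SortedSetSketch and its methods are hypothetical names, and a TreeSet<String> stands in for whatever byte-based storage the patch actually uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Illustrative only: collects values for one field, dropping duplicates
// and keeping them sorted so that list position can act as the ordinal.
final class SortedSetSketch {

    // TreeSet both de-duplicates and sorts; this is an assumption made for
    // brevity, not a description of the patch's storage.
    private final TreeSet<String> values = new TreeSet<>();

    // Adds a value; adding the same value twice has no further effect.
    void add(String value) {
        values.add(value);
    }

    // Number of distinct values seen so far.
    int valueCount() {
        return values.size();
    }

    // Distinct values in sorted order; index i is the ordinal of element i.
    List<String> valuesInOrdOrder() {
        return new ArrayList<>(values);
    }

    public static void main(String[] args) {
        SortedSetSketch field = new SortedSetSketch();
        field.add("b");
        field.add("a");
        field.add("b"); // duplicate, ignored
        System.out.println(field.valueCount() + " distinct values: " + field.valuesInOrdOrder());
    }
}
```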

> Add doc values support to MemoryIndex
> -
>
> Key: LUCENE-7091
> URL: https://issues.apache.org/jira/browse/LUCENE-7091
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Martijn van Groningen
> Attachments: LUCENE-7091.patch, LUCENE-7091.patch
>
>
> Sometimes queries executed via the MemoryIndex require certain things to be 
> stored as doc values. Today this isn't possible because the memory index 
> doesn't support this and these queries silently return no results.






[jira] [Updated] (LUCENE-7091) Add doc values support to MemoryIndex

2016-03-10 Thread Martijn van Groningen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen updated LUCENE-7091:
--
Description: Sometimes queries executed via the MemoryIndex require certain 
things to be stored as doc values. Today this isn't possible because the memory 
index doesn't support this and these queries silently return no results.

> Add doc values support to MemoryIndex
> -
>
> Key: LUCENE-7091
> URL: https://issues.apache.org/jira/browse/LUCENE-7091
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Martijn van Groningen
> Attachments: LUCENE-7091.patch
>
>
> Sometimes queries executed via the MemoryIndex require certain things to be 
> stored as doc values. Today this isn't possible because the memory index 
> doesn't support this and these queries silently return no results.






[jira] [Updated] (LUCENE-7091) Add doc values support to MemoryIndex

2016-03-10 Thread Martijn van Groningen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen updated LUCENE-7091:
--
Attachment: LUCENE-7091.patch

Added a patch that adds simple support for doc values. Nothing fancy 
here, but it should just work.

> Add doc values support to MemoryIndex
> -
>
> Key: LUCENE-7091
> URL: https://issues.apache.org/jira/browse/LUCENE-7091
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Martijn van Groningen
> Attachments: LUCENE-7091.patch
>
>



