[jira] [Updated] (LUCENE-3199) Add non-desctructive sort to BytesRefHash

2011-09-02 Thread Jason Rutherglen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Rutherglen updated LUCENE-3199:
-

Attachment: LUCENE-3199.patch

This is a minor update to the previous patch.

It adds the option of pruning the [oversized] int[] returned by the compact 
method, and it adds more unit tests.
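
As a rough illustration of the pruning option, the sketch below trims an oversized id array down to its valid entries. It assumes the array returned by compact packs the valid ids at the front and pads the rest; `IdArrays` and `prune` are hypothetical names for illustration, not the patch's API.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not part of the LUCENE-3199 patch.
final class IdArrays {
    private IdArrays() {}

    /**
     * Returns an array holding exactly the first {@code count} entries of {@code ids}.
     * If the array already has that length it is returned as-is, avoiding a copy.
     */
    static int[] prune(int[] ids, int count) {
        if (count < 0 || count > ids.length) {
            throw new IllegalArgumentException("count out of range: " + count);
        }
        return ids.length == count ? ids : Arrays.copyOf(ids, count);
    }
}
```

Making the prune step optional lets callers that only iterate over the first `count` entries skip the extra allocation and copy.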

 Add non-desctructive sort to BytesRefHash
 -

 Key: LUCENE-3199
 URL: https://issues.apache.org/jira/browse/LUCENE-3199
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Affects Versions: 4.0
Reporter: Jason Rutherglen
Priority: Minor
 Attachments: LUCENE-3199.patch, LUCENE-3199.patch


 Currently the BytesRefHash is destructive.  We can add a method that returns 
 a non-destructively generated int[].

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-3199) Add non-desctructive sort to BytesRefHash

2011-09-02 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-3199:


Attachment: LUCENE-3199.patch

Hey Jason, I actually moved this a little further and added a read-only view to 
BytesRefHash. This view provides next(), seekExact() and seekCeil() methods, 
just like TermsEnum does. 
The view is sorted if needed and can incrementally merge with a 
previously created view. 
Initially I wondered whether this approach would be feasible performance-wise, 
but in fact it is really fast. I ran some poor man's benchmarks where I 
opened a new view every 500 to 1000 new unique terms, and this takes around 
0.001 to 0.01 milliseconds on average. I have never seen it take longer than 
0.1 ms. I think it would be worthwhile to explore whether we can keep it that 
simple and reopen such a view for each document while we are indexing. The view 
allocates only one additional array and reuses all other references from the 
BytesRefHash instance. This one additional int[] does not seem too bad, though.
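
To make the shape of such a view concrete, here is a minimal sketch, not the code in the patch. It assumes terms are plain Strings compared with compareTo (the real hash stores bytes), that seekCeil returns the term or null rather than a status value, and it leaves out the incremental merge. `SortedTermView` and `id()` are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical read-only view for illustration; not the class from the patch.
final class SortedTermView {
    private final String[] termsById; // shared with the owner, never modified here
    private final int[] sortedIds;    // the one additional array the view allocates
    private int pos = -1;             // index into sortedIds; -1 means "before first"

    SortedTermView(String[] termsById, int count) {
        this.termsById = termsById;
        Integer[] boxed = new Integer[count];
        for (int i = 0; i < count; i++) boxed[i] = i;
        // Sort the ids by the terms they point to; the terms themselves do not move.
        Arrays.sort(boxed, (a, b) -> termsById[a].compareTo(termsById[b]));
        sortedIds = new int[count];
        for (int i = 0; i < count; i++) sortedIds[i] = boxed[i];
    }

    /** Advances to the next term in sorted order, or returns null at the end. */
    String next() {
        if (pos + 1 >= sortedIds.length) {
            pos = sortedIds.length;
            return null;
        }
        pos++;
        return termsById[sortedIds[pos]];
    }

    /** Positions the view on the term if it exists; otherwise leaves the position unchanged. */
    boolean seekExact(String term) {
        int i = lowerBound(term);
        if (i < sortedIds.length && termsById[sortedIds[i]].equals(term)) {
            pos = i;
            return true;
        }
        return false;
    }

    /** Positions the view on the smallest term >= target and returns it, or null if none. */
    String seekCeil(String target) {
        pos = lowerBound(target);
        return pos < sortedIds.length ? termsById[sortedIds[pos]] : null;
    }

    /** Returns the id of the current term. */
    int id() {
        if (pos < 0 || pos >= sortedIds.length) throw new IllegalStateException("unpositioned");
        return sortedIds[pos];
    }

    // Binary search for the first sorted position whose term is >= target.
    private int lowerBound(String target) {
        int lo = 0, hi = sortedIds.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (termsById[sortedIds[mid]].compareTo(target) < 0) lo = mid + 1; else hi = mid;
        }
        return lo;
    }
}
```

Because the view only sorts a private array of ids, the owner's storage stays untouched, which is what makes reopening a view cheap in this sketch.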

The patch is still rough. I will work further on it next week. 

 Add non-desctructive sort to BytesRefHash
 -

 Key: LUCENE-3199
 URL: https://issues.apache.org/jira/browse/LUCENE-3199
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Affects Versions: 4.0
Reporter: Jason Rutherglen
Priority: Minor
 Attachments: LUCENE-3199.patch, LUCENE-3199.patch, LUCENE-3199.patch


 Currently the BytesRefHash is destructive.  We can add a method that returns 
 a non-destructively generated int[].




[jira] [Updated] (LUCENE-3199) Add non-desctructive sort to BytesRefHash

2011-09-02 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-3199:


Attachment: LUCENE-3199.patch

New version: fixed one concurrency issue and added some doc strings.

 Add non-desctructive sort to BytesRefHash
 -

 Key: LUCENE-3199
 URL: https://issues.apache.org/jira/browse/LUCENE-3199
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Affects Versions: 4.0
Reporter: Jason Rutherglen
Priority: Minor
 Attachments: LUCENE-3199.patch, LUCENE-3199.patch, LUCENE-3199.patch, 
 LUCENE-3199.patch


 Currently the BytesRefHash is destructive.  We can add a method that returns 
 a non-destructively generated int[].




[jira] [Updated] (LUCENE-3199) Add non-desctructive sort to BytesRefHash

2011-09-01 Thread Jason Rutherglen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Rutherglen updated LUCENE-3199:
-

Attachment: LUCENE-3199.patch

Here's a patch for this issue.  It adds a couple of new methods to 
TestBytesRefHash to test the new frozen compact and sorting functionality of 
BytesRefHash.

This is being posted now because it's useful in relation to LUCENE-2312 and a 
terms dictionary composed of int[]s of term IDs sorted by term.
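
The idea of a non-destructive sort can be sketched as follows. This is a simplified stand-in, not Lucene's BytesRefHash and not the patch: `TinyTermHash` and `sortedIds` are hypothetical names, and the sketch assumes terms are ordered by unsigned byte comparison. The point is only that sorting produces a fresh int[] of term ids and leaves the hash usable afterwards.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-in for a term-to-id hash; for illustration only.
final class TinyTermHash {
    private final Map<String, Integer> idsByTerm = new HashMap<>();
    private final List<byte[]> bytesById = new ArrayList<>();

    /** Adds a term and returns its id; a term that is already present keeps its id. */
    int add(String term) {
        Integer existing = idsByTerm.get(term);
        if (existing != null) return existing;
        int id = bytesById.size();
        idsByTerm.put(term, id);
        bytesById.add(term.getBytes(StandardCharsets.UTF_8));
        return id;
    }

    int size() {
        return bytesById.size();
    }

    /** Returns a new, exactly sized array of ids ordered by term bytes; the hash is not modified. */
    int[] sortedIds() {
        Integer[] boxed = new Integer[bytesById.size()];
        for (int i = 0; i < boxed.length; i++) boxed[i] = i;
        Arrays.sort(boxed, (a, b) -> compareUnsigned(bytesById.get(a), bytesById.get(b)));
        int[] ids = new int[boxed.length];
        for (int i = 0; i < ids.length; i++) ids[i] = boxed[i];
        return ids;
    }

    // Lexicographic comparison treating each byte as an unsigned value.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) return diff;
        }
        return a.length - b.length;
    }
}
```

Since the sorted array is a separate allocation, terms can keep being added after a sort, and a later call simply returns a new array that includes them.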

 Add non-desctructive sort to BytesRefHash
 -

 Key: LUCENE-3199
 URL: https://issues.apache.org/jira/browse/LUCENE-3199
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Affects Versions: 4.0
Reporter: Jason Rutherglen
Priority: Minor
 Attachments: LUCENE-3199.patch


 Currently the BytesRefHash is destructive.  We can add a method that returns 
 a non-destructively generated int[].
