[jira] Commented: (LUCENE-756) Maintain norms in a single file .nrm

2007-01-16 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12465274
 ] 

Michael McCandless commented on LUCENE-756:
---

OK thanks Doron.  I will make the fixes you suggested!

I like "single" -- I will redo the "non backwards compatible for past 10 days" 
patch with these fixes!

> Maintain norms in a single file .nrm
> 
>
> Key: LUCENE-756
> URL: https://issues.apache.org/jira/browse/LUCENE-756
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Doron Cohen
> Assigned To: Doron Cohen
> Priority: Minor
> Attachments: index.premergednorms.cfs.zip, 
> index.premergednorms.nocfs.zip, LUCENE-756-Jan16.patch, 
> LUCENE-756-Jan16.Take2.patch, nrm.patch.2.txt, nrm.patch.3.txt, nrm.patch.txt
>
>
> Non-compound indexes are ~10% faster at indexing and perform 50% of the IO 
> activity compared to compound indexes, but their file descriptor footprint is 
> much higher. 
> By maintaining all field norms in a single .nrm file, we can bound the number 
> of files used by non-compound indexes, and possibly allow more applications 
> to use this format.
> More details on the motivation for this in: 
> http://www.nabble.com/potential-indexing-perormance-improvement-for-compound-index---cut-IO---have-more-files-though-tf2826909.html
>  (in particular 
> http://www.nabble.com/Re%3A-potential-indexing-perormance-improvement-for-compound-index---cut-IO---have-more-files-though-p7910403.html).

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



[jira] Commented: (LUCENE-756) Maintain norms in a single file .nrm

2007-01-16 Thread Doron Cohen (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12465273
 ] 

Doron Cohen commented on LUCENE-756:


Just to let you know: I checked this with the recent patch for LUCENE-741 (Field 
norm modifier), and it works as is with this improvement.




[jira] Commented: (LUCENE-756) Maintain norms in a single file .nrm

2007-01-16 Thread Doron Cohen (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12465271
 ] 

Doron Cohen commented on LUCENE-756:


Catenated?




[jira] Commented: (LUCENE-756) Maintain norms in a single file .nrm

2007-01-16 Thread Doug Cutting (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12465265
 ] 

Doug Cutting commented on LUCENE-756:
-

> the term "merged" (in hasMergedNorms) is a little overloaded with other 
> semantics (in Lucene)

Unified?  Single?




[jira] Commented: (LUCENE-756) Maintain norms in a single file .nrm

2007-01-16 Thread Doron Cohen (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12465260
 ] 

Doron Cohen commented on LUCENE-756:


Michael, I like this improvement!

(At first I considered adding such a FORMAT level but decided that it was not 
worth it, aiming for backwards compatibility with pre-lockless indexes. Then I 
had to add that file check, which was the wrong trade-off indeed.)

Two minor comments:
- getHasMergedNorms() is private and now the method has no logic - I would 
remove that method and refer to hasMergedNorms instead.
- the term "merged" (in hasMergedNorms) is a little overloaded with other 
semantics (in Lucene), though I cannot think of another suitably descriptive 
(short) term.

Thanks for improving this,
Doron




[jira] Commented: (LUCENE-756) Maintain norms in a single file .nrm

2007-01-16 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12465250
 ] 

Michael McCandless commented on LUCENE-756:
---

Actually, if you apply my first change above and regenerate your index, the 
format will be readable by the second patch.

Chuck, I think this latest patch is likely to be the "final" index file format 
for this issue, pending any further feedback!




[jira] Commented: (LUCENE-756) Maintain norms in a single file .nrm

2007-01-16 Thread Chuck Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12465240
 ] 

Chuck Williams commented on LUCENE-756:
---

I may have the only app that will be broken by the 10-day backwards 
incompatibility, but the change seems worth it.  I need to create some large 
indexes to take on the road for demos.  Is the index format in the latest patch 
final?





[jira] Commented: (LUCENE-756) Maintain norms in a single file .nrm

2007-01-16 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12465230
 ] 

Michael McCandless commented on LUCENE-756:
---

OK, take two!  I attached LUCENE-756-Jan16.Take2.patch

I removed backwards compatibility for the past 10 days of Lucene
nightly trunk builds.  I also fixed fileformats.xml to describe the
new "HasMergedNorms" entry in the segments_N file.
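
To illustrate the idea of recording this in the segments_N file instead of
probing the directory for a file, here is a minimal sketch. It is not the
patch's code: the class name, the format numbers, and the entry layout (an int
format followed by a one-byte flag) are all assumptions made up for the
example. It only shows how a format-gated flag lets a reader learn whether a
segment uses the single norms file without opening a CompoundFileReader.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch; this is not Lucene's SegmentInfos/SegmentInfo code.
final class SegmentFlagSketch {
    // Assumed format numbers: a more negative value means a newer format.
    static final int FORMAT_BEFORE_FLAG = -1;
    static final int FORMAT_WITH_FLAG = -2;

    /** Writes a made-up entry: the format number, then the flag if the format carries one. */
    static byte[] writeEntry(int format, boolean hasMergedNorms) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(format);
            if (format <= FORMAT_WITH_FLAG) {
                out.writeByte(hasMergedNorms ? 1 : 0);
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the flag back; entries in the older format are treated as "no single norms file". */
    static boolean readHasMergedNorms(byte[] entry) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(entry));
            int format = in.readInt();
            if (format > FORMAT_WITH_FLAG) {
                return false; // older format: no flag was stored
            }
            return in.readByte() == 1;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] entry = writeEntry(FORMAT_WITH_FLAG, true);
        System.out.println("hasMergedNorms = " + readHasMergedNorms(entry));
    }
}
```

The trade-off is the one discussed in this thread: the flag costs a format
change, but opening an index no longer needs an extra IO operation per segment.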





[jira] Commented: (LUCENE-756) Maintain norms in a single file .nrm

2007-01-16 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12465216
 ] 

Yonik Seeley commented on LUCENE-756:
-

As an aside, I think we need to start making more frequent releases... then 
"trunk" could be designated as a work-in-progress and unstable, and hence 
compatibility concerns could be limited to those releases.




[jira] Commented: (LUCENE-756) Maintain norms in a single file .nrm

2007-01-16 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12465214
 ] 

Michael McCandless commented on LUCENE-756:
---

> No hard rule on this, but IMO that may be a small enough window that 
> compatibility is not needed.

This is a good question.  I had flip-flopped on it.  It would be nice
not to have to support reading indices written by Lucene trunk builds
from the past 10 days.  Then we could greatly simplify
"SegmentInfo.getHasMergedNorms" so that it does not create and then
destroy the CompoundFileReader.

Any objections to this?

If not I will re-work the patch (it makes things a fair bit cleaner).




[jira] Commented: (LUCENE-756) Maintain norms in a single file .nrm

2007-01-16 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12465208
 ] 

Yonik Seeley commented on LUCENE-756:
-

I agree that reducing the IO operations on an index open is a good thing.

> For indices written to before this gets
> committed but after the first version was committed (10 days ago),

No hard rule on this, but IMO that may be a small enough window that 
compatibility is not needed.




[jira] Commented: (LUCENE-756) Maintain norms in a single file .nrm

2007-01-07 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12462914
 ] 

Yonik Seeley commented on LUCENE-756:
-

Hmmm, I actually did an "svn status" to see if there was anything to add too.
Problem is, my current tree is too messy and I missed it.
Thanks for the double-check.




[jira] Commented: (LUCENE-756) Maintain norms in a single file .nrm

2007-01-06 Thread Doron Cohen (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12462774
 ] 

Doron Cohen commented on LUCENE-756:


Thanks for committing this, Yonik!

It seems the added test TestNorms was not committed..?




[jira] Commented: (LUCENE-756) Maintain norms in a single file .nrm

2007-01-03 Thread Doron Cohen (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12462069
 ] 

Doron Cohen commented on LUCENE-756:


I am updating the patch (nrm.patch.3.txt): 

- using a single constant for the norms file extension:
  static final String NORMS_EXTENSION = "nrm";
(This is more in line with existing extension constants in the code.)
(As a side comment, there are various extension names (e.g. ".cfs") in the code 
that are also candidates for factoring out as constants, but this is a separate 
issue.)

- adding a test - TestNorms
This test verifies that norm values assigned with field.setBoost() are 
preserved during the life cycle of an index, including adding documents, 
updating norms values (separate norms), addIndexes(), and optimize.

All tests pass.
On my side this is ready to go in.
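
For readers unfamiliar with the point about the constant, a small sketch of
why it helps: every place that builds or recognizes a norms file name goes
through one compile-time-checked symbol. Only NORMS_EXTENSION and its value
come from the patch description above; the class and the two helper methods
are hypothetical names invented for this illustration.

```java
// Hypothetical sketch; only NORMS_EXTENSION = "nrm" is taken from the patch description.
final class NormsFileNameSketch {
    static final String NORMS_EXTENSION = "nrm";

    /** Builds "<segment>.<extension>", e.g. segment "_3" with NORMS_EXTENSION. */
    static String fileName(String segment, String extension) {
        return segment + "." + extension;
    }

    /** True if the name ends with the norms extension. */
    static boolean isNormsFile(String name) {
        return name.endsWith("." + NORMS_EXTENSION);
    }

    public static void main(String[] args) {
        String name = fileName("_3", NORMS_EXTENSION);
        System.out.println(name + " is norms file: " + isNormsFile(name));
    }
}
```

A misspelled NORMS_EXTENSION fails at compile time, whereas a misspelled
".nrm" string literal would only show up at run time.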





[jira] Commented: (LUCENE-756) Maintain norms in a single file .nrm

2006-12-21 Thread Doron Cohen (JIRA)
[ 
http://issues.apache.org/jira/browse/LUCENE-756?page=comments#action_12460316 ] 

Doron Cohen commented on LUCENE-756:


Thanks for the comments, Doug. 
You're right of course, I will add both the header and the constant.
(That will be either today or a week from now.)




[jira] Commented: (LUCENE-756) Maintain norms in a single file .nrm

2006-12-21 Thread Doug Cutting (JIRA)
[ 
http://issues.apache.org/jira/browse/LUCENE-756?page=comments#action_12460313 ] 

Doug Cutting commented on LUCENE-756:
-

Since we're adding a new file, shouldn't we give it a header, so that its 
format can be revised?  Something like:
  new byte[] {'N','R','M',VERSION}
as the first four bytes.  We might someday decide to change the representation 
used, e.g., a different one-byte-float format, or permit higher resolution, or 
compression, or somesuch.

Also, should we use a constant for ".nrm" extension, so that it's checked at 
compile-time?
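
A minimal sketch of how such a header could be produced and checked, under
stated assumptions: the value VERSION = 1 and the class and method names are
invented for the example, and the rule that a reader rejects versions newer
than its own is a common convention rather than something decided in this
thread.

```java
import java.util.Arrays;

// Hypothetical sketch of the proposed 4-byte header; not code from any patch.
final class NormsHeaderSketch {
    static final byte VERSION = 1; // assumed value, not stated in the thread

    /** Returns the header bytes: 'N', 'R', 'M', then the version. */
    static byte[] header() {
        return new byte[] {'N', 'R', 'M', VERSION};
    }

    /** Validates the first four bytes of a file and returns the version byte. */
    static byte parseVersion(byte[] fileStart) {
        if (fileStart.length < 4
                || fileStart[0] != 'N' || fileStart[1] != 'R' || fileStart[2] != 'M') {
            throw new IllegalArgumentException("not a norms file: bad header bytes");
        }
        byte version = fileStart[3];
        if (version < 1 || version > VERSION) {
            throw new IllegalArgumentException("unsupported norms version: " + version);
        }
        return version;
    }

    public static void main(String[] args) {
        byte[] h = header();
        System.out.println(Arrays.toString(h) + " parses as version " + parseVersion(h));
    }
}
```

With the version byte in place, a later change of representation (for example
a different one-byte-float encoding) can bump VERSION and let readers branch
on it.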




[jira] Commented: (LUCENE-756) Maintain norms in a single file .nrm

2006-12-21 Thread Doron Cohen (JIRA)
[ 
http://issues.apache.org/jira/browse/LUCENE-756?page=comments#action_12460292 ] 

Doron Cohen commented on LUCENE-756:


> Does this mean a separate file outside the final .cfs files? 

Oh no - there's a single .nrm file inside the .cfs file (instead of multiple .fN 
files inside the .cfs file). 
As before, only the .sN files (separated norm files) are outside of the .cfs file.
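
To make the layout concrete, here is a rough sketch that classifies norm files 
by extension following the description above. The class, enum, and method 
names are made up for this example, and the extension patterns are only an 
assumed reading of ".fN" and ".sN" (a letter followed by a field number):

```java
// Hypothetical sketch of the norms file layout described above; not patch code.
final class NormsFileKindSketch {
    enum Kind { SINGLE_NORMS, PER_FIELD_NORMS, SEPARATE_NORMS, OTHER }

    // Classifies a file name by its extension: .nrm, .fN, .sN, or anything else.
    static Kind classify(String fileName) {
        int dot = fileName.lastIndexOf('.');
        String ext = dot < 0 ? "" : fileName.substring(dot + 1);
        if (ext.equals("nrm")) return Kind.SINGLE_NORMS;
        if (ext.matches("f\\d+")) return Kind.PER_FIELD_NORMS; // old one-file-per-field norms
        if (ext.matches("s\\d+")) return Kind.SEPARATE_NORMS;  // separated norm files
        return Kind.OTHER;
    }

    // Per the comment above: .nrm (like .fN before it) lives inside the .cfs file,
    // while .sN files stay outside of it.
    static boolean storedInsideCompoundFile(Kind kind) {
        return kind == Kind.SINGLE_NORMS || kind == Kind.PER_FIELD_NORMS;
    }

    public static void main(String[] args) {
        for (String name : new String[] {"_1.nrm", "_1.f0", "_1.s0"}) {
            Kind kind = classify(name);
            System.out.println(name + " -> " + kind
                    + ", inside .cfs: " + storedInsideCompoundFile(kind));
        }
    }
}
```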




[jira] Commented: (LUCENE-756) Maintain norms in a single file .nrm

2006-12-21 Thread Yonik Seeley (JIRA)
[ 
http://issues.apache.org/jira/browse/LUCENE-756?page=comments#action_12460287 ] 

Yonik Seeley commented on LUCENE-756:
-

> - CFS now also maintains all norms in a single file. 

Does this mean a separate file outside the final .cfs files?



[jira] Commented: (LUCENE-756) Maintain norms in a single file .nrm

2006-12-21 Thread Yonik Seeley (JIRA)
[ 
http://issues.apache.org/jira/browse/LUCENE-756?page=comments#action_12460285 ] 

Yonik Seeley commented on LUCENE-756:
-

Seems like a good idea... given that norms are read once, on demand, I wouldn't 
expect anything search-related to be slower with this.  Opening a new reader 
should actually be slightly faster due to fewer files to open.


