[ https://issues.apache.org/jira/browse/LUCENE-3706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188828#comment-13188828 ]
Robert Muir commented on LUCENE-3706:
-------------------------------------
Here's a quick comparison of index size with the current patch: term
vectors (EnglishVectors) versus offsets in the postings (EnglishOffsets).
I can play around and see if we can do better.
{noformat}
EnglishVectors
total 860961
drwx------+ 2 rmuir None 40960 Jan 18 19:03 .
drwx------+ 6 rmuir None 0 Jan 18 18:54 ..
-rwx------+ 1 rmuir None 328731623 Jan 18 19:03 _f.fdt
-rwx------+ 1 rmuir None 1004692 Jan 18 19:03 _f.fdx
-rwx------+ 1 rmuir None 31 Jan 18 19:02 _f.fnm
-rwx------+ 1 rmuir None 63 Jan 18 19:03 _f.per
-rwx------+ 1 rmuir None 251106 Jan 18 19:03 _f.tvd
-rwx------+ 1 rmuir None 417626058 Jan 18 19:03 _f.tvf
-rwx------+ 1 rmuir None 2009380 Jan 18 19:03 _f.tvx
-rwx------+ 1 rmuir None 50348161 Jan 18 19:03 _f_0.frq
-rwx------+ 1 rmuir None 75146262 Jan 18 19:03 _f_0.prx
-rwx------+ 1 rmuir None 6164520 Jan 18 19:03 _f_0.tim
-rwx------+ 1 rmuir None 146233 Jan 18 19:03 _f_0.tip
-rwx------+ 1 rmuir None 31 Jan 18 19:03 _f_nrm.cfe
-rwx------+ 1 rmuir None 125608 Jan 18 19:03 _f_nrm.cfs
-rwx------+ 1 rmuir None 20 Jan 18 19:03 segments.gen
-rwx------+ 1 rmuir None 265 Jan 18 19:03 segments_1
EnglishOffsets
total 552569
drwx------+ 2 rmuir None 28672 Jan 18 19:08 .
drwx------+ 7 rmuir None 4096 Jan 18 19:06 ..
-rwx------+ 1 rmuir None 328731623 Jan 18 19:07 _s.fdt
-rwx------+ 1 rmuir None 1004692 Jan 18 19:07 _s.fdx
-rwx------+ 1 rmuir None 31 Jan 18 19:07 _s.fnm
-rwx------+ 1 rmuir None 63 Jan 18 19:08 _s.per
-rwx------+ 1 rmuir None 52354303 Jan 18 19:08 _s_0.frq
-rwx------+ 1 rmuir None 177235787 Jan 18 19:08 _s_0.prx
-rwx------+ 1 rmuir None 6181626 Jan 18 19:08 _s_0.tim
-rwx------+ 1 rmuir None 146248 Jan 18 19:08 _s_0.tip
-rwx------+ 1 rmuir None 31 Jan 18 19:08 _s_nrm.cfe
-rwx------+ 1 rmuir None 125608 Jan 18 19:08 _s_nrm.cfs
-rwx------+ 1 rmuir None 20 Jan 18 19:08 segments.gen
-rwx------+ 1 rmuir None 265 Jan 18 19:08 segments_1
{noformat}
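To put numbers on the two listings above, here is a rough sketch that adds up the byte sizes and compares them by file extension. The sizes are copied straight from the listings; the dictionary names and the choice to drop the segment prefixes (`_f`, `_s`, `_0`) so files line up by extension are assumptions made for illustration, not part of the patch.

```python
# File sizes in bytes, copied from the two directory listings above.
# Segment prefixes are dropped so files can be matched by extension.
vectors = {
    "fdt": 328731623, "fdx": 1004692, "fnm": 31, "per": 63,
    "tvd": 251106, "tvf": 417626058, "tvx": 2009380,
    "frq": 50348161, "prx": 75146262, "tim": 6164520, "tip": 146233,
    "nrm.cfe": 31, "nrm.cfs": 125608, "segments.gen": 20, "segments_1": 265,
}
offsets = {
    "fdt": 328731623, "fdx": 1004692, "fnm": 31, "per": 63,
    "frq": 52354303, "prx": 177235787, "tim": 6181626, "tip": 146248,
    "nrm.cfe": 31, "nrm.cfs": 125608, "segments.gen": 20, "segments_1": 265,
}

# Space taken by the term vector files, which exist only in EnglishVectors.
term_vector_bytes = sum(vectors[ext] for ext in ("tvd", "tvf", "tvx"))

# Growth of each postings/terms file once offsets are stored in the postings.
postings_growth = {
    ext: offsets[ext] - vectors[ext] for ext in ("frq", "prx", "tim", "tip")
}
postings_growth_bytes = sum(postings_growth.values())

# Overall difference between the two indexes.
total_vectors = sum(vectors.values())
total_offsets = sum(offsets.values())
net_savings_bytes = total_vectors - total_offsets

print("term vector files:", term_vector_bytes)
print("postings growth by file:", postings_growth)
print("postings growth total:", postings_growth_bytes)
print("net savings:", net_savings_bytes)
print("offsets index / vectors index: %.3f" % (total_offsets / total_vectors))
```

By this count the three term vector files take about 420 MB, while storing offsets in the postings adds about 104 MB, almost all of it in the .prx file. The EnglishOffsets index comes out roughly 316 MB smaller overall, with the stored fields (.fdt, .fdx) identical in both.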
> add offsets into lucene40 postings
> ----------------------------------
>
> Key: LUCENE-3706
> URL: https://issues.apache.org/jira/browse/LUCENE-3706
> Project: Lucene - Java
> Issue Type: New Feature
> Affects Versions: 4.0
> Reporter: Robert Muir
> Attachments: LUCENE-3706.patch
>
>
> LUCENE-3684 added support for
> IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS, but
> only SimpleText implements it.
> I think we should implement it in the other 4.0 codecs (starting with
> Lucene40PostingsFormat).