[
https://issues.apache.org/jira/browse/LUCENE-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Muir reassigned LUCENE-2946:
-----------------------------------
Assignee: Robert Muir
> change file format documentation from "bit-for-bit" to highlevel
> ----------------------------------------------------------------
>
> Key: LUCENE-2946
> URL: https://issues.apache.org/jira/browse/LUCENE-2946
> Project: Lucene - Java
> Issue Type: Task
> Components: general/website
> Reporter: Robert Muir
> Assignee: Robert Muir
> Fix For: 4.0
>
>
> While reviewing website docs in LUCENE-2924,
> I noticed that the existing file formats documentation is going to be pretty
> hopeless for 4.0.
> Before, it described the format "bit-for-bit", but with flexible indexing
> this is somewhat silly (and who really wants a bit-for-bit explanation of
> some of the new formats!)
> I think it would be much better to give a high-level overview, perhaps
> linking to javadocs or
> even source code for the low-level details.
> We probably should delay this until 4.0 is really close in sight (since
> things are changing so fast) but we can go ahead and think about it some now.
> For example:
> * high level explanation of what a codec is, and the various subsystems one
> is usually composed of (terms index, terms data, skiplist impl, postings
> impl, etc). We can reiterate that you can make your own, and hopefully this
> kind of documentation will actually encourage that.
> * high level explanation of what StandardCodec is "composed of". For example
> assume it's Variable Terms Index, Block Terms Reader, PForDelta docs and freqs
> and Simple64 positions. I think really this is the only codec we should try
> to "diagram" in any way.
> * high level explanation (probably with links) of some of the components. For
> example we could explain what the purpose of a Terms Index is, and that this
> implementation uses a finite state transducer to find the terms block for a
> given term. In this case maybe we could include an image, now that Dawid
> made the toDot useful.
> * high level explanation (probably with links) of some of the compression
> algorithms. For example, we could explain the basics of the available
> algorithms we have (vbyte/simple/for/pfor/...) and what their advantages and
> disadvantages are.
> Some of the things I mentioned here are probably optional, for instance I
> think it's "enough" to give a high-level overview of StandardCodec, but I
> can't help but think that explaining some of the architecture will be useful
> for new developers.
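
As an illustration of the kind of high-level explanation proposed above for
the compression algorithms, here is a minimal sketch of variable-byte (vbyte)
encoding. It assumes the usual scheme of 7 payload bits per byte, low-order
group first, with the high bit marking that more bytes follow. The class and
method names (VByteSketch, encode, decode) are hypothetical and are not taken
from the Lucene source.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration class; not part of Lucene.
class VByteSketch {

    // Encode an int using 7 payload bits per byte, low-order group first.
    // The high bit of each byte is set when more bytes follow.
    static byte[] encode(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // continuation bit set
            value >>>= 7;
        }
        out.write(value); // final byte, continuation bit clear
        return out.toByteArray();
    }

    // Decode a single value from the start of the array.
    static int decode(byte[] bytes) {
        int value = 0;
        int shift = 0;
        for (byte b : bytes) {
            value |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break; // last byte of this value
            }
            shift += 7;
        }
        return value;
    }

    public static void main(String[] args) {
        int[] samples = {0, 127, 128, 300, 16384};
        for (int v : samples) {
            byte[] enc = encode(v);
            System.out.println(v + " -> " + enc.length
                + " byte(s), decoded " + decode(enc));
        }
    }
}
```

The trade-off is the one the proposal hints at: small values take one byte
and decoding is simple, but every byte needs a branch on the continuation bit.
Block-based schemes such as FOR and PFOR avoid that branch by packing many
values at a fixed bit width.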