[ 
https://issues.apache.org/jira/browse/LUCENE-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir reassigned LUCENE-2946:
-----------------------------------

    Assignee: Robert Muir
    
> change file format documentation from "bit-for-bit" to highlevel
> ----------------------------------------------------------------
>
>                 Key: LUCENE-2946
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2946
>             Project: Lucene - Java
>          Issue Type: Task
>          Components: general/website
>            Reporter: Robert Muir
>            Assignee: Robert Muir
>             Fix For: 4.0
>
>
> While reviewing website docs in LUCENE-2924,
> I noticed that the existing fileformats documentation is going to be pretty
> hopeless for 4.0.
> Previously it described the format "bit-for-bit", but with flexible indexing
> this is somewhat silly (and who really wants a bit-for-bit explanation of
> some of the new formats?).
> I think it would be much better to give a high-level overview, perhaps 
> linking to javadocs or
> even source code for the low-level details. 
> We probably should delay this until 4.0 is really close in sight (since 
> things are changing so fast) but we can go ahead and think about it some now.
> For example:
> * high level explanation of what a codec is, and the various subsystems one 
> is usually composed of (terms index, terms data, skiplist impl, postings 
> impl, etc). We can reiterate that you can make your own, and hopefully this 
> kind of documentation will actually encourage that.
> * high level explanation of what StandardCodec is "composed of". For example 
> assume it's Variable Terms Index, Block Terms Reader, PForDelta docs and freqs 
> and Simple64 positions. I think really this is the only codec we should try 
> to "diagram" in any way.
> * high level explanation (probably with links) of some of the components. For 
> example we could explain what the purpose of a Terms Index is, and that this 
> implementation uses a finite state transducer to find the terms block for a 
> given term. In this case maybe we could include an image, now that Dawid has 
> made toDot useful.
> * high level explanation (probably with links) of some of the compression 
> algorithms. For example, we could explain the basics of the available 
> algorithms we have (vbyte/simple/for/pfor/...) and what their advantages and 
> disadvantages are.
> Some of the things I mentioned here are probably optional, for instance I 
> think it's "enough" to give a high-level overview of StandardCodec, but I 
> can't help but think that explaining some of the architecture will be useful 
> for new developers.
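
As an illustration of the kind of high-level explanation the compression
bullet has in mind, here is a minimal sketch of vbyte (variable-byte)
encoding: each byte carries 7 bits of the value, low-order bits first, and
the high bit of a byte is set when more bytes follow. The class and method
names (VByteSketch, encode, decode) are hypothetical and chosen only for
this example; this is not Lucene's actual implementation.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration class, not part of Lucene.
public class VByteSketch {

    // Encode an int using 7 payload bits per byte, low-order bits first.
    // The high bit (0x80) of a byte is set when another byte follows.
    static byte[] encode(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
        return out.toByteArray();
    }

    // Decode a single value from the start of the array.
    static int decode(byte[] bytes) {
        int value = 0;
        int shift = 0;
        for (byte b : bytes) {
            value |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break; // high bit clear: this was the last byte
            }
            shift += 7;
        }
        return value;
    }

    public static void main(String[] args) {
        for (int v : new int[] {0, 127, 128, 300, 16384}) {
            byte[] encoded = encode(v);
            System.out.println(v + " -> " + encoded.length
                + " byte(s), decodes to " + decode(encoded));
        }
    }
}
```

The trade-off is visible even in a sketch this small: small values take a
single byte, but the decoder has to test a continuation bit on every byte,
which is part of why block-oriented schemes such as FOR and PFOR are worth
explaining alongside it.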

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
