[jira] Commented: (LUCENE-2126) Split up IndexInput and IndexOutput into DataInput and DataOutput

Marvin Humphrey (JIRA) Sun, 13 Dec 2009 06:19:44 -0800

    [ https://issues.apache.org/jira/browse/LUCENE-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789895#action_12789895 ]


Marvin Humphrey commented on LUCENE-2126:
-----------------------------------------

> I disagree with you here: introducing DataInput/Output makes IMO the API
> actually easier for the "normal" user to understand.
> 
> I would think that most users don't implement IndexInput/Output extensions,
> but simply use the out-of-the-box Directory implementations, which provide
> IndexInput/Output impls. Also, most users probably don't even call the
> IndexInput/Output APIs directly. 

I agree with everything you say in the second paragraph, but I don't see how
any of that supports the assertion you make in the first paragraph.

Lucene's file system has a directory class, named "Directory", and a pair of
classes which represent files, named "IndexInput" and "IndexOutput".
Directories and files.  Easy to understand.

All common IO systems have entities which represent data streaming to/from a
file.  They might be called "file handles", "file descriptors", "readers" and 
"writers", "streams", or whatever, but they're all basically the same thing.

What this patch does is fragment the pair of classes that represent file
IO... why?

What does a "normal" user do with a file?

   Step 1: Open the file.
   Step 2: Write data to the file.
   Step 3: Close the file.

Then, later...

   Step 1: Open the file.
   Step 2: Read data from the file.
   Step 3: Close the file.

You're saying that Lucene's file abstraction is easier to understand if you
break that up?
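
For concreteness, the open/write/close and open/read/close steps above can be
sketched in plain Java.  This sketch uses java.io and java.nio.file rather than
Lucene's Directory/IndexInput/IndexOutput, and the class and method names
(FileLifecycleSketch, write, read) are made up for illustration:

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration of the file lifecycle; not Lucene code.
class FileLifecycleSketch {

    // Step 1: open the file.  Step 2: write data.  Step 3: close the file.
    static void write(Path file, int value) throws IOException {
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(file))) {
            out.writeInt(value);
        } // try-with-resources closes the stream here
    }

    // Then, later: open the file, read the data back, close the file.
    static int read(Path file) throws IOException {
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            return in.readInt();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("lifecycle", ".bin");
        try {
            write(file, 42);
            System.out.println(read(file));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

In both halves, one object per direction covers opening, data transfer, and
closing.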

I grokked your first rationale -- that you don't want people to be able to
call close() on an IndexInput that they're essentially borrowing for a bit.
OK, I think it's overkill to create an entire class to thwart something nobody
was going to do anyway, but at least I understand why you might want to do
that.

But the idea that this strange fragmentation of the IO hierarchy makes things
*easier* -- I don't get it at all.  And I certainly don't see how it's such an 
improvement over what exists now that it justifies a change to the public API.

> Split up IndexInput and IndexOutput into DataInput and DataOutput
> -----------------------------------------------------------------
>
>                 Key: LUCENE-2126
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2126
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: Flex Branch
>            Reporter: Michael Busch
>            Assignee: Michael Busch
>            Priority: Minor
>             Fix For: Flex Branch
>
>         Attachments: lucene-2126.patch
>
>
> I'd like to introduce the two new classes DataInput and DataOutput
> that contain all methods from IndexInput and IndexOutput that actually
> decode or encode data, such as readByte()/writeByte(),
> readVInt()/writeVInt().
> Methods like getFilePointer(), seek(), close(), etc., which are not
> related to data encoding, but to files as input/output source stay in
> IndexInput/IndexOutput.
> This patch also changes ByteSliceReader/ByteSliceWriter to extend
> DataInput/DataOutput. Previously ByteSliceReader implemented the
> methods that stay in IndexInput by throwing RuntimeExceptions.
> See also LUCENE-2125.
> All tests pass.
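
The split proposed in the issue description can be pictured roughly as follows.
This is a simplified sketch, not the code in lucene-2126.patch: the Sketch*
class names are hypothetical stand-ins, only a couple of the
encoding/decoding methods are shown, and the in-memory reader and writer
merely play the role that ByteSliceReader/ByteSliceWriter play in the
description.

```java
import java.io.ByteArrayOutputStream;

// Encoding-only half: knows how to write values, nothing about files.
abstract class SketchDataOutput {
    abstract void writeByte(byte b);

    // Variable-length int: 7 data bits per byte, low-order bits first,
    // high bit set on every byte except the last.
    void writeVInt(int i) {
        while ((i & ~0x7F) != 0) {
            writeByte((byte) ((i & 0x7F) | 0x80));
            i >>>= 7;
        }
        writeByte((byte) i);
    }
}

// Decoding-only half: knows how to read values, nothing about files.
abstract class SketchDataInput {
    abstract byte readByte();

    int readVInt() {
        byte b = readByte();
        int i = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = readByte();
            i |= (b & 0x7F) << shift;
        }
        return i;
    }
}

// File-oriented methods stay in the subclass.
abstract class SketchIndexInput extends SketchDataInput {
    abstract long getFilePointer();
    abstract void seek(long pos);
    abstract void close();
}

// In-memory writer: extends the data class only, so it has no
// seek()/close() to stub out.
class SketchByteWriter extends SketchDataOutput {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

    @Override
    void writeByte(byte b) {
        buffer.write(b);
    }

    byte[] toByteArray() {
        return buffer.toByteArray();
    }
}

// In-memory reader over a byte array, likewise without file methods.
class SketchByteReader extends SketchDataInput {
    private final byte[] bytes;
    private int pos = 0;

    SketchByteReader(byte[] bytes) {
        this.bytes = bytes;
    }

    @Override
    byte readByte() {
        return bytes[pos++];
    }
}
```

Under this arrangement an in-memory reader no longer has to implement
file-oriented methods by throwing exceptions, which is the motivation the
description gives; whether that justifies the extra classes is the point
under debate in the comment above.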

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.