[ 
https://issues.apache.org/jira/browse/LUCENE-8525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16736378#comment-16736378
 ] 

Dawid Weiss commented on LUCENE-8525:
-------------------------------------

I think there are two reasons for throwing a generic IOException in DataInput 
(and other places). The first is that you can't really make it any more 
specific for all the implementing classes: some classes can declare a more 
specific type or drop the exception altogether, but the parent class is still 
left with the most generic type. The second is more pragmatic: a more specific 
subclass of IOException for signalling particular kinds of data corruption 
doesn't make much practical sense. If corruption in a data input can be 
detected at all, it is a truly exceptional situation that you can't reasonably 
recover from. I can't think of a scenario where it would be reasonable to 
declare an exception like "VLongEncodingInvalid". 
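
To illustrate the first point, here is a minimal sketch in plain Java. All the class and method names (Input, StreamInput, ArrayInput, readFirst) are made up for the example and are not Lucene's API. It only shows that a subclass may narrow or drop the checked exception, while code written against the parent still has to handle IOException.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;

public class ExceptionNarrowingSketch {

    /** Hypothetical stand-in for an abstract input class such as DataInput. */
    abstract static class Input {
        // The parent can only promise the most generic checked type.
        abstract byte readByte() throws IOException;
    }

    /** Narrows the declared exception to a subclass of IOException. */
    static final class StreamInput extends Input {
        private final ByteArrayInputStream in;

        StreamInput(byte[] bytes) {
            this.in = new ByteArrayInputStream(bytes);
        }

        @Override
        byte readByte() throws EOFException {
            int b = in.read(); // ByteArrayInputStream.read() declares no checked exception
            if (b < 0) {
                throw new EOFException("read past end of input");
            }
            return (byte) b;
        }
    }

    /** Drops the checked exception from the signature altogether. */
    static final class ArrayInput extends Input {
        private final byte[] bytes;
        private int pos;

        ArrayInput(byte[] bytes) {
            this.bytes = bytes;
        }

        @Override
        byte readByte() {
            return bytes[pos++];
        }
    }

    /** Code written against the parent type still sees plain IOException. */
    static byte readFirst(Input in) throws IOException {
        return in.readByte();
    }

    public static void main(String[] args) {
        try {
            System.out.println(readFirst(new ArrayInput(new byte[] {7})));
            System.out.println(readFirst(new StreamInput(new byte[0])));
        } catch (IOException e) {
            // The static type here is IOException, whatever the subclass declared.
            System.out.println("caught: " + e);
        }
    }
}
```

The narrower signatures only help callers that hold a reference to the concrete subclass; anything that works through the parent type is back to the generic exception.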

Some criticize the Java exception handling syntax as too verbose altogether... 
When you think about practical differences in usage between checked vs. 
unchecked exceptions (RuntimeException and Error) or even between 
NoSuchFileException and FileNotFoundException, it all becomes quite a mess when 
you get into the details. Sometimes keeping it simple has benefits.

I don't know how to reasonably solve your problem, but to me, looking at it 
from the code's side, corrupted input in a class that expects well-formed input 
is an I/O exception and should be signaled as such. The exception's message, 
not its type, carries the details (for logging, for example).
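
As a rough sketch of what I mean, in plain Java: readVInt and describe below are hypothetical helpers written for this example, not Lucene's implementation, and the message wording is illustrative. The decoder reports a malformed value as an IOException whose message carries the detail, and the caller simply logs that message.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

public class CorruptInputSketch {

    /**
     * Decodes a variable-length int: 7 payload bits per byte, least significant
     * group first, high bit set on every byte except the last. At most 5 bytes.
     */
    static int readVInt(InputStream in) throws IOException {
        int value = 0;
        for (int shift = 0; shift <= 28; shift += 7) {
            int b = in.read();
            if (b < 0) {
                throw new EOFException("unexpected end of input while reading a vInt");
            }
            if (shift == 28 && (b & 0xF0) != 0) {
                // The fifth byte may only contribute 4 bits and must end the value.
                throw new IOException("invalid vInt detected (too many bits)");
            }
            value |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return value;
            }
        }
        // Not reachable: the fifth byte either returns or throws above.
        throw new AssertionError("unreachable");
    }

    /** Reads one vInt and reports the outcome as a log-style string. */
    static String describe(byte[] data) {
        try {
            return "value=" + readVInt(new ByteArrayInputStream(data));
        } catch (IOException e) {
            // The type stays generic; the message says what went wrong.
            return "read failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        byte[] valid = {(byte) 0xAC, 0x02};
        byte[] corrupt = {(byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF};
        System.out.println(describe(valid));
        System.out.println(describe(corrupt));
    }
}
```

The caller cannot do anything smarter with the corrupt bytes than report them, which is why a dedicated exception type would add little here.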

Just my opinion though, perhaps others will differ.



> throw more specific exception on data corruption
> ------------------------------------------------
>
>                 Key: LUCENE-8525
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8525
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Vladimir Dolzhenko
>            Priority: Major
>
> DataInput throws a generic IOException if the data looks odd
> [DataInput:141|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/DataInput.java#L141]
> There are other examples, such as 
> [BufferedIndexInput:219|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/BufferedIndexInput.java#L219],
>  
> [CompressionMode:226|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/codecs/compressing/CompressionMode.java#L226]
>  and maybe 
> [DocIdsWriter:81|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/util/bkd/DocIdsWriter.java#L81]
> That leads to some difficulties - see [elasticsearch 
> #34322|https://github.com/elastic/elasticsearch/issues/34322]
> It would be better if it threw a more specific exception.
> As a consequence 
> [SegmentInfos.readCommit|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java#L281]
>  violates its own contract
> {code:java}
> /**
>    * @throws CorruptIndexException if the index is corrupt
>    * @throws IOException if there is a low-level IO error
>    */
> {code}


