[
https://issues.apache.org/jira/browse/OPENNLP-99?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12987614#action_12987614
]
Steven Bethard commented on OPENNLP-99:
---------------------------------------
> We still need to take the exception handling into account
Ok, rewritten with exception handling:
// using ObjectStream
try {
Event event = stream.read();
while (event != null) {
...
event = stream.read();
}
} catch (IOException e) {
...
}
// using Iterator
Event event;
try {
while (iterator.hasNext()) {
event = iterator.next();
...
}
} catch (OpenNLPIOException e) {
...
}
Looks about the same to me. But maybe you had something else in mind? If you
could write the code that you're concerned about, that would help me understand
better.
> our users are usually implementing the ObjectStream,
> but not using it to actually read the data them self
Perhaps it would help to see the use case where I wanted this? Here's the code:
val testStream = new RealValueFileEventStream(testPath)
val events = // wrapper for testStream that makes it into a real Iterator<Event>
for (event <- events) {
val prediction = model.getBestOutcome(model.eval(event.getContext))
...
}
Basically, I take each event from a test file, and ask the model to predict an
outcome. Is this an atypical use case?
> I still believe that it is easier to implement these streams
> when the underlying data source is an InputStream or Reader.
Yes, I agreed to this earlier. But as a user, I care less about how the streams
are implemented, and more about the API I have to use when calling them.
Or are you suggesting that most users are going to be implementing custom
ObjectStreams? If that's the case, you could easily define ObjectStream as an
abstract class that implements Iterator<T> e.g.:
public abstract class ObjectStream<T> implements Iterator<T> {
public abstract T read();
...
private T next = this.read();
public boolean hasNext() {
return this.next != null
}
public T next() {
T result = this.next;
this.next = this.read();
return result;
}
}
That way, if you think it's easier to implement the ObjectStream API, you just
subclass from ObjectStream and you get an Iterator for free.
> And ObjectStream has also a reset method ...
> we would need this method on an Iterator like interface too
Well, the standard Java solution to this would be to declare it as an
Iterable<T> instead. Then code like:
Event event = stream.read();
while (event != null) {
...
event = stream.read();
}
stream.reset()
while (event != null) {
...
event = stream.read();
}
would instead be written as:
for (Event event : iterable) {
...
}
for (Event event: iterable) {
...
}
Basically, anything that can be reset is an Iterable, and any single pass
through it is an Iterator.
I also agree that remove() isn't necessary but the Iterable interface declares
it as optional, so I don't think that's a problem. Just throw an
UnsupportedOperationException like it says to in the Iterator javadocs. (Also,
note that your AbstractEventStream already has a remove() method that does
exactly this.)
> EventStream should extend Iterator<Event>
> -----------------------------------------
>
> Key: OPENNLP-99
> URL: https://issues.apache.org/jira/browse/OPENNLP-99
> Project: OpenNLP
> Issue Type: New Feature
> Components: Maxent
> Affects Versions: maxent-3.0.0-sourceforge
> Reporter: Steven Bethard
>
> [As requested, brought over from Sourceforge.]
> Conceptually, EventStream is just an Iterator<Event>. You would get better
> interoperability with other Java libraries if EventStream were declared as
> such. If you didn't care about backwards compatibility, I'd say just get rid
> of EventStream entirely and use Iterator<Event> everywhere instead.
> If you care about backwards compatibility, you could at least declare
> AbstractEventStream as implementing Iterator<Event> - it declares all of
> hasNext(), next() and remove(). I believe that shouldn't break anything, and
> should make all the current EventStream implementations into Iterator<Event>s.
> Why do I want this? Because, when using OpenNLP maxent from Scala, if a
> RealValueFileEventStream were an Iterator<Event>, I could write:
> for (event <- stream) {
> ...
> }
> But since it's not, I instead have to wrap it in an Iterator:
> val events = new Iterator[Event] {
> def hasNext = stream.hasNext
> def next = stream.next
> }
> for (event <- events) {
> ...
> }
> Or write the while loop version:
> while (stream.hasNext) {
> val event = stream.next
> ...
> }
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.