[
https://issues.apache.org/jira/browse/FLINK-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070183#comment-14070183
]
ASF GitHub Bot commented on FLINK-987:
--------------------------------------
GitHub user aljoscha opened a pull request:
https://github.com/apache/incubator-flink/pull/77
[FLINK-987] Add seeking to DataOutputView
This addresses the output side of FLINK-987. Output views operate
directly on memory segments and provide an interface akin to a random
access buffer (with some restrictions, such as required
locking/unlocking).
Introduce lock(), unlock(), tell(), and seek(long) methods in
DataOutputView interface and all child classes. Users of the interface
(for example Serializers) must take care to manually reset the writing
position to the end of the data. Data left after the writing position
might be discarded or overwritten.
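The header-writing pattern this enables can be sketched as follows. This is a minimal illustration, not the code from the pull request: only the method names lock(), unlock(), tell(), and seek(long) come from the description above, while the SeekableOutput interface, the InMemoryOutput class, and the writeInt/write helpers are hypothetical stand-ins backed by a single java.nio.ByteBuffer rather than real memory segments.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a seekable output view. Only the names lock(),
// unlock(), tell() and seek(long) are taken from the description above.
interface SeekableOutput {
    void lock();
    void unlock();
    long tell();
    void seek(long position);
    void writeInt(int value);
    void write(byte[] bytes);
}

// Simplified in-memory implementation backed by one fixed-size buffer.
// It does not model segments being flushed away (assumption for brevity).
class InMemoryOutput implements SeekableOutput {
    private final ByteBuffer buffer = ByteBuffer.allocate(1024);
    private boolean locked;

    public void lock() { locked = true; }
    public void unlock() { locked = false; }
    public long tell() { return buffer.position(); }

    public void seek(long position) {
        if (!locked) {
            throw new IllegalStateException("seek() requires a preceding lock()");
        }
        buffer.position((int) position);
    }

    public void writeInt(int value) { buffer.putInt(value); }
    public void write(byte[] bytes) { buffer.put(bytes); }

    // Returns a copy of the first 'length' bytes, for inspection only.
    byte[] toByteArray(int length) {
        byte[] out = new byte[length];
        ByteBuffer dup = buffer.duplicate();
        dup.position(0);
        dup.get(out);
        return out;
    }
}

class HeaderWritingExample {
    // Writes [int length][payload]; the length is filled in after the payload.
    static long writeWithLengthHeader(SeekableOutput out, String value) {
        byte[] payload = value.getBytes(StandardCharsets.UTF_8);
        out.lock();                    // keep the written data reachable
        long headerPos = out.tell();   // remember where the header goes
        out.writeInt(0);               // placeholder for the length
        out.write(payload);
        long endPos = out.tell();      // end of the data
        out.seek(headerPos);
        out.writeInt(payload.length);  // overwrite the placeholder
        out.seek(endPos);              // manually reset to the end of the data
        out.unlock();
        return endPos;
    }

    public static void main(String[] args) {
        InMemoryOutput out = new InMemoryOutput();
        long end = writeWithLengthHeader(out, "flink");
        System.out.println("bytes written: " + end);
    }
}
```

The final seek(endPos) is the step the caller must not forget: without it, the next write would start right after the header and overwrite the payload.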
Extract superclass AbstractMemorySegmentOutputView from
AbstractPagedOutputView, which is now its child.
AbstractMemorySegmentOutputView is for output views that don't need
locking because they always have all the segments available.
AbstractPagedOutputView is for those output views that always want to
flush segments away but use locking to temporarily keep them (for
tell() and seek(long)).
Split method nextSegment() of AbstractPagedOutputView into
returnSegment() and requestSegment(), since we sometimes must return
several segments at once while requesting only one new segment (for
example when flushing all the locked segments in
AbstractPagedOutputView).
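Why the two operations are decoupled can be sketched roughly as below. This is an assumption-laden toy model, not the actual AbstractPagedOutputView: only the names returnSegment() and requestSegment() come from the description above, and the PagedViewSketch class, its byte[] segments, and the flushed list (standing in for wherever segments are flushed to) are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical, heavily simplified paged view. Only the names returnSegment()
// and requestSegment() are taken from the description above.
class PagedViewSketch {
    private final int segmentSize;
    private final Deque<byte[]> freeSegments = new ArrayDeque<>();
    private final List<byte[]> heldSegments = new ArrayList<>(); // kept while locked
    final List<byte[]> flushed = new ArrayList<>();              // stand-in for the flush target
    private byte[] current;
    private int positionInSegment;
    private boolean locked;

    PagedViewSketch(int segmentSize, int numSegments) {
        this.segmentSize = segmentSize;
        for (int i = 0; i < numSegments; i++) {
            freeSegments.add(new byte[segmentSize]);
        }
        current = requestSegment();
    }

    // Hands one full segment away: record a copy, then recycle the buffer.
    void returnSegment(byte[] segment) {
        flushed.add(segment.clone());
        freeSegments.add(segment);
    }

    // Obtains one empty segment to write into.
    byte[] requestSegment() {
        byte[] segment = freeSegments.poll();
        if (segment == null) {
            throw new IllegalStateException("no free segment available");
        }
        return segment;
    }

    void lock() { locked = true; }

    // Unlocking returns every held segment at once without requesting a new one.
    void unlock() {
        locked = false;
        for (byte[] segment : heldSegments) {
            returnSegment(segment);
        }
        heldSegments.clear();
    }

    void write(byte b) {
        if (positionInSegment == segmentSize) {
            advance();
        }
        current[positionInSegment++] = b;
    }

    // While locked, full segments are held back; otherwise they are returned.
    // In both cases exactly one new segment is requested.
    private void advance() {
        if (locked) {
            heldSegments.add(current);
        } else {
            returnSegment(current);
        }
        current = requestSegment();
        positionInSegment = 0;
    }

    public static void main(String[] args) {
        PagedViewSketch view = new PagedViewSketch(4, 4);
        view.lock();
        for (int i = 0; i < 10; i++) {
            view.write((byte) i);
        }
        view.unlock();
        System.out.println("segments flushed on unlock: " + view.flushed.size());
    }
}
```

With a single nextSegment() that always pairs one return with one request, the unlock() path above would have no way to hand back several held segments in one step.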
Replace OutputViewDataOutputStreamWrapper and
OutputViewObjectOutputStreamWrapper by OutputViewDataOutputWrapper which
is more generic than both.
Add tell() and seek(long) calls in some SerializationTestType child
classes to simulate header writing in PagedViewsTests. Add new
LongAsciiStringType whose length is always bigger than the segment size
to test locking and seeking across segment boundaries.
General cleanup: Consolidate TestOutputViews where several tests had
inner class TestOutputView with same functionality.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/aljoscha/incubator-flink serializer-rework
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-flink/pull/77.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #77
----
commit d51e9b5d4634752dbc2bb0c5e7876bc04410950d
Author: Aljoscha Krettek <[email protected]>
Date: 2014-07-15T09:36:38Z
[FLINK-987] Add seeking to DataOutputView
----
> Extend TypeSerializers and -Comparators to work directly on Memory Segments
> ---------------------------------------------------------------------------
>
> Key: FLINK-987
> URL: https://issues.apache.org/jira/browse/FLINK-987
> Project: Flink
> Issue Type: Improvement
> Components: Local Runtime
> Affects Versions: 0.6-incubating
> Reporter: Stephan Ewen
> Assignee: Aljoscha Krettek
> Fix For: 0.6-incubating
>
>
> As per discussion with [~till.rohrmann], [~uce], [~aljoscha], we suggest
> changing the way that the TypeSerializers/Comparators and
> DataInputViews/DataOutputViews work.
> The goal is to allow more flexibility in the construction of the binary
> representation of data types, and to allow partial deserialization of
> individual fields. Both are currently prevented by the fact that the
> abstraction of the memory (into which the data goes) is a stream abstraction
> ({{DataInputView}}, {{DataOutputView}}).
> One idea is to offer a random-access, buffer-like view for construction and
> random-access deserialization, as well as various methods to copy elements in
> a binary fashion between such buffers and streams.
> A possible set of methods for the {{TypeSerializer}} could be:
> {code}
> long serialize(T record, TargetBuffer buffer);
>
> T deserialize(T reuse, SourceBuffer source);
>
> void ensureBufferSufficientlyFilled(SourceBuffer source);
>
> <X> X deserializeField(X reuse, int logicalPos, SourceBuffer buffer);
>
> int getOffsetForField(int logicalPos, int offset, SourceBuffer buffer);
>
> void copy(DataInputView in, TargetBuffer buffer);
>
> void copy(SourceBuffer buffer, DataOutputView out);
>
> void copy(DataInputView source, DataOutputView target);
> {code}