[jira] [Commented] (FLINK-987) Extend TypeSerializers and -Comparators to work directly on Memory Segments

ASF GitHub Bot (JIRA) Tue, 22 Jul 2014 06:34:52 -0700

    [ https://issues.apache.org/jira/browse/FLINK-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070249#comment-14070249 ]


ASF GitHub Bot commented on FLINK-987:
--------------------------------------

Github user aljoscha commented on a diff in the pull request:

    https://github.com/apache/incubator-flink/pull/77#discussion_r15227266
  
    --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/memorymanager/AbstractMemorySegmentOutputView.java ---
    @@ -0,0 +1,337 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.flink.runtime.memorymanager;
    +
    +import java.io.IOException;
    +import java.io.UTFDataFormatException;
    +
    +import org.apache.flink.core.memory.DataInputView;
    +import org.apache.flink.core.memory.DataOutputView;
    +import org.apache.flink.core.memory.MemorySegment;
    +
    +/**
    + * The base class for all output views that are backed by multiple memory pages. This base class contains all
    + * encoding methods to write data to a page and detect page boundary crossing. The concrete sub classes must
    + * implement {@link #advance()} for moving through the memory segments and the tell/seek related methods
    + * from {@link org.apache.flink.core.memory.DataOutputView}.
    + * <p>
    + * The paging assumes that all memory segments are of the same size.
    + */
    +public abstract class AbstractMemorySegmentOutputView implements DataOutputView {
    +   protected final int segmentSize;                                // the size of the memory segments
    +   protected final int headerLength;                               // the number of bytes to skip at the beginning of each segment
    +   protected MemorySegment currentSegment;                 // the current memory segment to write to
    +   protected int positionInSegment;                                        // the offset in the current segment
    +   private byte[] utfBuffer;                                               // the reusable array for UTF encodings
    --- End diff --
    
    Yes, I know. I can't fix it though. In IntelliJ with tab-size=4 it looks
    good. In the terminal with tab-size=8 it looks bad, and here as well. That's
    why I prefer tabs vs spaces but I'm not the guy who's going to fight for
    that... :smile:
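
For readers without the full file at hand, the paging behaviour described in the class comment of the quoted diff can be sketched roughly as follows. This is a minimal illustration, not the Flink implementation: the class name `PagedOutputSketch` and the helpers `pageCount()` and `page(int)` are hypothetical, and plain `byte[]` pages stand in for `MemorySegment`. Only the field names `segmentSize`, `headerLength`, `currentSegment`, `positionInSegment` and the `advance()` hook come from the diff.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the Flink class: byte[] pages stand in for MemorySegment.
final class PagedOutputSketch {
    private final int segmentSize;   // the size of every page (all pages are equally sized)
    private final int headerLength;  // bytes to skip at the beginning of each page
    private final List<byte[]> pages = new ArrayList<>();
    private byte[] currentSegment;   // the page currently written to
    private int positionInSegment;   // the write offset inside the current page

    PagedOutputSketch(int segmentSize, int headerLength) {
        if (headerLength < 0 || headerLength >= segmentSize) {
            throw new IllegalArgumentException("header must be smaller than the segment");
        }
        this.segmentSize = segmentSize;
        this.headerLength = headerLength;
        advance();
    }

    // Moves to a fresh page and positions the cursor right after the header.
    private void advance() {
        currentSegment = new byte[segmentSize];
        pages.add(currentSegment);
        positionInSegment = headerLength;
    }

    void writeByte(int b) {
        if (positionInSegment == segmentSize) {
            advance(); // page is full, continue on the next one
        }
        currentSegment[positionInSegment++] = (byte) b;
    }

    void writeInt(int v) {
        if (positionInSegment <= segmentSize - 4) {
            // Fast path: all four bytes fit into the current page.
            currentSegment[positionInSegment++] = (byte) (v >>> 24);
            currentSegment[positionInSegment++] = (byte) (v >>> 16);
            currentSegment[positionInSegment++] = (byte) (v >>> 8);
            currentSegment[positionInSegment++] = (byte) v;
        } else {
            // Slow path: the value crosses a page boundary, so write it byte by byte.
            writeByte(v >>> 24);
            writeByte(v >>> 16);
            writeByte(v >>> 8);
            writeByte(v);
        }
    }

    int pageCount() { return pages.size(); }

    byte[] page(int index) { return pages.get(index); }
}
```

With a segment size of 8 and a header length of 2, each page holds six payload bytes, so a second `writeInt` call starts on the first page and finishes on the second one.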


> Extend TypeSerializers and -Comparators to work directly on Memory Segments
> ---------------------------------------------------------------------------
>
>                 Key: FLINK-987
>                 URL: https://issues.apache.org/jira/browse/FLINK-987
>             Project: Flink
>          Issue Type: Improvement
>          Components: Local Runtime
>    Affects Versions: 0.6-incubating
>            Reporter: Stephan Ewen
>            Assignee: Aljoscha Krettek
>             Fix For: 0.6-incubating
>
>
> As per discussion with [~till.rohrmann], [~uce], [~aljoscha], we suggest 
> changing the way that the TypeSerializers/Comparators and 
> DataInputViews/DataOutputViews work.
> The goal is to allow more flexibility in the construction of the binary 
> representation of data types, and to allow partial deserialization of 
> individual fields. Both are currently prevented by the fact that the 
> abstraction of the memory (into which the data goes) is a stream abstraction 
> ({{DataInputView}}, {{DataOutputView}}).
> An idea is to offer a random-access, buffer-like view for construction and 
> random-access deserialization, as well as various methods to copy elements in 
> a binary fashion between such buffers and streams.
> A possible set of methods for the {{TypeSerializer}} could be:
> {code}
> long serialize(T record, TargetBuffer buffer);
>       
> T deserialize(T reuse, SourceBuffer source);
>       
> void ensureBufferSufficientlyFilled(SourceBuffer source);
>       
> <X> X deserializeField(X reuse, int logicalPos, SourceBuffer buffer);
>       
> int getOffsetForField(int logicalPos, int offset, SourceBuffer buffer);
>       
> void copy(DataInputView in, TargetBuffer buffer);
>       
> void copy(SourceBuffer buffer, DataOutputView out);
>       
> void copy(DataInputView source, DataOutputView target);
> {code}
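
To illustrate what random-access, partial deserialization could look like, here is a minimal sketch under stated assumptions. The issue does not define `SourceBuffer` or `TargetBuffer`, so `java.nio.ByteBuffer` stands in for both; the class `EventSerializerSketch`, its fixed-width record layout, and the simplified method signatures are hypothetical and only loosely mirror `serialize`, `getOffsetForField` and `deserializeField` from the list above.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: ByteBuffer stands in for the undefined SourceBuffer/TargetBuffer.
final class EventSerializerSketch {
    // Assumed fixed-width layout: int id (4 bytes) | long timestamp (8 bytes) | double value (8 bytes)
    static final int[] FIELD_OFFSETS = {0, 4, 12};
    static final int RECORD_LENGTH = 20;

    // Writes one record at the buffer's current position and returns the number of bytes written.
    static long serialize(int id, long timestamp, double value, ByteBuffer target) {
        target.putInt(id);
        target.putLong(timestamp);
        target.putDouble(value);
        return RECORD_LENGTH;
    }

    // Maps a logical field index to its absolute byte offset for a record starting at recordOffset.
    static int getOffsetForField(int logicalPos, int recordOffset) {
        return recordOffset + FIELD_OFFSETS[logicalPos];
    }

    // Reads only the timestamp field with an absolute get; the other fields are never decoded.
    static long deserializeTimestamp(ByteBuffer source, int recordOffset) {
        return source.getLong(getOffsetForField(1, recordOffset));
    }
}

final class PartialDeserializationDemo {
    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(2 * EventSerializerSketch.RECORD_LENGTH);
        EventSerializerSketch.serialize(1, 1000L, 0.5, buffer);
        EventSerializerSketch.serialize(2, 2000L, 1.5, buffer);

        // Jump straight to the timestamp of the second record.
        long ts = EventSerializerSketch.deserializeTimestamp(buffer, EventSerializerSketch.RECORD_LENGTH);
        System.out.println(ts);
    }
}
```

A stream abstraction would have to read the first record and the `id` of the second one before reaching that timestamp; with random access the offset is computed and the field is read directly. Variable-length fields would need a layout that stores or derives offsets, which this sketch does not attempt.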



--
This message was sent by Atlassian JIRA
(v6.2#6252)
