[GitHub] spark pull request #19285: [SPARK-22068][CORE]Reduce the duplicate code betw...

cloud-fan Thu, 18 Jan 2018 20:10:23 -0800

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19285#discussion_r162534289
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala ---
    @@ -162,26 +162,33 @@ private[spark] class MemoryStore(
       }
     
       /**
    -   * Attempt to put the given block in memory store as values.
    +   * Attempt to put the given block in memory store as values or bytes.
        *
        * It's possible that the iterator is too large to materialize and store in memory. To avoid
        * OOM exceptions, this method will gradually unroll the iterator while periodically checking
        * whether there is enough free memory. If the block is successfully materialized, then the
        * temporary unroll memory used during the materialization is "transferred" to storage memory,
        * so we won't acquire more memory than is actually needed to store the block.
        *
    -   * @return in case of success, the estimated size of the stored data. In case of failure, return
    -   *         an iterator containing the values of the block. The returned iterator will be backed
    -   *         by the combination of the partially-unrolled block and the remaining elements of the
    -   *         original input iterator. The caller must either fully consume this iterator or call
    -   *         `close()` on it in order to free the storage memory consumed by the partially-unrolled
    -   *         block.
    +   * @param blockId The block id.
    +   * @param values The values which need be stored.
    +   * @param classTag the [[ClassTag]] for the block.
    +   * @param memoryMode The values saved mode.
    +   * @param storeValue Store the record of values to the MemoryStore.
    +   * @param estimateSize Get the memory size which used to unroll the block. The parameters
    +   *                     determine whether we need precise size.
    +   * @param createMemoryEntry Using [[MemoryEntry]] to hold the stored values or bytes.
    +   * @return if the block is stored successfully, return the stored data size. Else return the
    +   *         memory has used for unroll the block.
        */
    -  private[storage] def putIteratorAsValues[T](
    +  private def putIterator[T](
           blockId: BlockId,
           values: Iterator[T],
    -      classTag: ClassTag[T]): Either[PartiallyUnrolledIterator[T], Long] = {
    -
    +      classTag: ClassTag[T],
    +      memoryMode: MemoryMode,
    +      storeValue: T => Unit,
    +      estimateSize: Boolean => Long,
    +      createMemoryEntry: () => MemoryEntry[T]): Either[Long, Long] = {
    --- End diff --
    
    Instead of passing three functions, I'd like to introduce something like:
    ```
    trait ValuesHolder[T] {
      def store(value: T): Unit
      def estimatedSize(): Long
      def build(): MemoryEntry[T]
    }
    ```
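
For illustration only, here is a minimal, self-contained sketch of how a holder like the one suggested above could replace the three function parameters. It is not the code from the pull request. `MemoryEntry` is a simplified stand-in for Spark's class, and `DeserializedEntry`, `FixedCostValuesHolder`, `UnrollSketch`, and the `memoryBudget` parameter are hypothetical names introduced for the example. The fixed per-element cost stands in for real size estimation.

```scala
import scala.collection.mutable.ArrayBuffer

// Simplified stand-in for Spark's MemoryEntry (assumption: only the size matters here).
sealed trait MemoryEntry[T] { def size: Long }
final case class DeserializedEntry[T](values: Vector[T], size: Long) extends MemoryEntry[T]

// The holder interface from the review comment, with types filled in as an assumption.
trait ValuesHolder[T] {
  def store(value: T): Unit
  def estimatedSize(): Long
  def build(): MemoryEntry[T]
}

// Hypothetical holder that charges a fixed number of bytes per stored element.
final class FixedCostValuesHolder[T](bytesPerValue: Long) extends ValuesHolder[T] {
  private val buffer = ArrayBuffer.empty[T]
  override def store(value: T): Unit = buffer += value
  override def estimatedSize(): Long = buffer.size * bytesPerValue
  override def build(): MemoryEntry[T] = DeserializedEntry(buffer.toVector, estimatedSize())
}

object UnrollSketch {
  // Unrolls the iterator through the holder, checking the budget after every element.
  // Right(size) means the block was stored; Left(size) is the estimate when unrolling stopped.
  def putIterator[T](
      values: Iterator[T],
      holder: ValuesHolder[T],
      memoryBudget: Long): Either[Long, Long] = {
    var overBudget = false
    while (!overBudget && values.hasNext) {
      holder.store(values.next())
      overBudget = holder.estimatedSize() > memoryBudget
    }
    if (overBudget) Left(holder.estimatedSize()) else Right(holder.build().size)
  }
}
```

With this shape, a values-based and a bytes-based code path would differ only in which `ValuesHolder` implementation they pass in, which appears to be the duplication the suggestion aims to remove.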



