Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19285#discussion_r162534289

--- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala ---
@@ -162,26 +162,33 @@ private[spark] class MemoryStore(
   }

   /**
-   * Attempt to put the given block in memory store as values.
+   * Attempt to put the given block in memory store as values or bytes.
    *
    * It's possible that the iterator is too large to materialize and store in memory. To avoid
    * OOM exceptions, this method will gradually unroll the iterator while periodically checking
    * whether there is enough free memory. If the block is successfully materialized, then the
    * temporary unroll memory used during the materialization is "transferred" to storage memory,
    * so we won't acquire more memory than is actually needed to store the block.
    *
-   * @return in case of success, the estimated size of the stored data. In case of failure, return
-   *         an iterator containing the values of the block. The returned iterator will be backed
-   *         by the combination of the partially-unrolled block and the remaining elements of the
-   *         original input iterator. The caller must either fully consume this iterator or call
-   *         `close()` on it in order to free the storage memory consumed by the partially-unrolled
-   *         block.
+   * @param blockId The block id.
+   * @param values The values which need be stored.
+   * @param classTag the [[ClassTag]] for the block.
+   * @param memoryMode The values saved mode.
+   * @param storeValue Store the record of values to the MemoryStore.
+   * @param estimateSize Get the memory size which used to unroll the block. The parameters
+   *                     determine whether we need precise size.
+   * @param createMemoryEntry Using [[MemoryEntry]] to hold the stored values or bytes.
+   * @return if the block is stored successfully, return the stored data size. Else return the
+   *         memory has used for unroll the block.
    */
-  private[storage] def putIteratorAsValues[T](
+  private def putIterator[T](
       blockId: BlockId,
       values: Iterator[T],
-      classTag: ClassTag[T]): Either[PartiallyUnrolledIterator[T], Long] = {
-
+      classTag: ClassTag[T],
+      memoryMode: MemoryMode,
+      storeValue: T => Unit,
+      estimateSize: Boolean => Long,
+      createMemoryEntry: () => MemoryEntry[T]): Either[Long, Long] = {
--- End diff --

Instead of passing 3 functions, I'd like to introduce:
```
abstract class ValuesHolder[T] {
  // Types below are taken from the three function parameters in the diff.
  def store(value: T): Unit
  def estimatedSize(): Long
  def build(): MemoryEntry[T]
}
```
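
To make the suggestion concrete, here is a minimal sketch of how `putIterator` might consume such a holder. It is not the code from the PR: `MemoryEntry` is a simplified stand-in for Spark's class, `FixedCostValuesHolder`, `PutIteratorSketch`, and the `maxBytes` parameter are hypothetical names made up for illustration, and the real method also takes `blockId`, `classTag`, and `memoryMode` and reserves unroll memory through the memory manager rather than comparing against a fixed budget.

```scala
import scala.collection.mutable.ArrayBuffer

// Simplified stand-in for Spark's MemoryEntry (assumption: the real one carries more fields).
final case class MemoryEntry[T](values: Vector[T], size: Long)

// The holder proposed in the review; types follow the three function parameters in the diff.
abstract class ValuesHolder[T] {
  def store(value: T): Unit
  def estimatedSize(): Long
  def build(): MemoryEntry[T]
}

// Hypothetical holder that buffers values and charges a fixed cost per element.
final class FixedCostValuesHolder[T](bytesPerValue: Long) extends ValuesHolder[T] {
  private val buffer = ArrayBuffer.empty[T]
  override def store(value: T): Unit = buffer += value
  override def estimatedSize(): Long = buffer.length * bytesPerValue
  override def build(): MemoryEntry[T] = MemoryEntry(buffer.toVector, estimatedSize())
}

object PutIteratorSketch {
  // Unrolls `values` into `holder` and stops once the estimate exceeds `maxBytes`.
  // Right(stored size) on success, Left(bytes used so far) on failure,
  // mirroring the Either[Long, Long] return type in the diff.
  def putIterator[T](
      values: Iterator[T],
      holder: ValuesHolder[T],
      maxBytes: Long): Either[Long, Long] = {
    var withinBudget = true
    while (values.hasNext && withinBudget) {
      holder.store(values.next())
      withinBudget = holder.estimatedSize() <= maxBytes
    }
    if (withinBudget) Right(holder.build().size) else Left(holder.estimatedSize())
  }
}
```

With a single holder argument, the values path and the bytes path would each supply their own `ValuesHolder` implementation, and the signature of `putIterator` would not grow whenever another callback is needed.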