I guess it has to do with the Tungsten explicit memory management that
builds on sun.misc.Unsafe. The "ConvertToUnsafe" class converts
Java-object-based rows into UnsafeRows, which has the Spark internal
memory-efficient format.

Here is the related code in 1.6:

ConvertToUnsafe is defined in:
https://github.com/apache/spark/blob/branch-1.6/sql/core/src/main/scala/org/apache/spark/sql/execution/rowFormatConverters.scala

/**
* Converts Java-object-based rows into [[UnsafeRow]]s.
*/
case class ConvertToUnsafe(child: SparkPlan) extends UnaryNode

And, UnsafeRow is defined in:
https://github.com/apache/spark/blob/branch-1.6/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeRow.java

/**
* An Unsafe implementation of Row which is backed by raw memory instead of
Java objects.
*
* Each tuple has three parts: [null bit set] [values] [variable length
portion]
*
* The bit set is used for null tracking and is aligned to 8-byte word
boundaries. It stores
* one bit per field.
*
* In the `values` region, we store one 8-byte word per field. For fields
that hold fixed-length
* primitive types, such as long, double, or int, we store the value
directly in the word. For
* fields with non-primitive or variable-length values, we store a relative
offset (w.r.t. the
* base address of the row) that points to the beginning of the
variable-length field, and length
* (they are combined into a long).
*
* Instances of `UnsafeRow` act as pointers to row data stored in this
format.
*/
public final class UnsafeRow extends MutableRow implements Externalizable,
KryoSerializable

Xinh

On Sun, Jun 26, 2016 at 1:11 PM, Mich Talebzadeh <mich.talebza...@gmail.com>
wrote:

>
> Hi,
>
> In Spark's Physical Plan what is the explanation for ConvertToUnsafe?
>
> Example:
>
> scala> sorted.filter($"prod_id" ===13).explain
> == Physical Plan ==
> Filter (prod_id#10L = 13)
> +- Sort [prod_id#10L ASC,cust_id#11L ASC,time_id#12 ASC,channel_id#13L
> ASC,promo_id#14L ASC], true, 0
>    +- ConvertToUnsafe
>       +- Exchange rangepartitioning(prod_id#10L ASC,cust_id#11L
> ASC,time_id#12 ASC,channel_id#13L ASC,promo_id#14L ASC,200), None
>          +- HiveTableScan
> [prod_id#10L,cust_id#11L,time_id#12,channel_id#13L,promo_id#14L],
> MetastoreRelation oraclehadoop, sales2, None
>
>
> My inclination is that it is a temporary construct like tempTable created
> as part of Physical Plan?
>
>
> Thanks
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn * 
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> <https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*
>
>
>
> http://talebzadehmich.wordpress.com
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>

Reply via email to