I'm using Spark 2.0. I've created a Dataset from a Parquet file, repartitioned it on one of the columns (docId), and persisted the repartitioned Dataset:
val om = ds.repartition($"docId").persist(StorageLevel.MEMORY_AND_DISK)

When I try to confirm the partitioner with om.rdd.partitioner, I get:

Option[org.apache.spark.Partitioner] = None

I would have thought it would be a HashPartitioner. Does anyone know why this is None and not a HashPartitioner?

Thanks.
Darin.
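For reference, a minimal sketch of what I'm running. It assumes an existing SparkSession named spark and a Dataset ds read from Parquet with a docId column; the names mirror my code above, not a tested standalone setup:

```scala
import org.apache.spark.storage.StorageLevel

// Assumes `spark: SparkSession` and `ds` already exist.
import spark.implicits._

// Repartition by column and cache the result.
val om = ds.repartition($"docId").persist(StorageLevel.MEMORY_AND_DISK)

// The RDD view reports no Partitioner:
om.rdd.partitioner  // Option[org.apache.spark.Partitioner] = None

// Yet the physical plan does record hash partitioning on docId --
// explain() shows an Exchange hashpartitioning(docId, ...) node:
om.explain()
om.queryExecution.executedPlan.outputPartitioning
```

So the partitioning seems to be tracked at the SQL plan level (outputPartitioning) even though the underlying RDD's partitioner is None, which is the part I'd like to understand.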