Found the answer. It is the block size. Thanks.
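
For anyone finding this thread later, here is a rough sketch of why the block size can explain the numbers. It assumes each file is cut into block-sized splits and each split becomes one task. The 500 MB file size and 128 MB block size are made-up illustration values, not figures from my job, and `SplitEstimate` is a hypothetical helper, not part of Spark.

```scala
object SplitEstimate {
  // Number of block-sized splits needed to cover one file (ceiling division).
  def splitsPerFile(fileBytes: Long, blockBytes: Long): Long =
    (fileBytes + blockBytes - 1) / blockBytes

  // Total splits across all files, assuming roughly one task per split.
  def estimateTasks(fileSizes: Seq[Long], blockBytes: Long): Long =
    fileSizes.map(splitsPerFile(_, blockBytes)).sum

  def main(args: Array[String]): Unit = {
    val mb = 1024L * 1024L
    // Assumed values for illustration only: 5000 files of 500 MB, 128 MB blocks.
    val files = Seq.fill(5000)(500L * mb)
    println(estimateTasks(files, 128L * mb)) // prints 20000
  }
}
```

With these assumed sizes every file spans four blocks, so 5000 files give about 20000 tasks. Under the same assumption, a larger block size relative to the file size would give fewer tasks. Whether the relevant setting is the HDFS block size or the Parquet row group size depends on the setup.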

On Wed, Feb 3, 2016 at 5:05 PM, Gavin Yue <yue.yuany...@gmail.com> wrote:
> I am doing a simple count like:
>
> sqlContext.read.parquet("path").count
>
> I have only 5000 Parquet files, but the job generates over 20000 tasks.
>
> Each Parquet file was converted from one gz text file.
>
> Please give some advice.
>
> Thanks