Here is the HDFS API link I could find:
http://archive.cloudera.com/cdh4/cdh/4/hadoop/api/org/apache/hadoop/fs/FileSystem.html#create(org.apache.hadoop.fs.Path,%20boolean,%20int,%20short,%20long)

Francois

On Wed, Mar 22, 2017 at 4:29 PM, Padma Penumarthy <ppenumar...@mapr.com> wrote:
> I think we create one file for each parquet block.
> If the underlying HDFS block size is 128 MB and the parquet block size is > 128 MB,
> it will create more blocks on HDFS.
> Can you let me know what HDFS API would allow you to
> do otherwise?
>
> Thanks,
> Padma
>
> > On Mar 22, 2017, at 11:54 AM, François Méthot <fmetho...@gmail.com> wrote:
> >
> > Hi,
> >
> > Is there a way to force Drill to store a CTAS-generated parquet file as a
> > single block when using HDFS? The Java HDFS API allows that: files could
> > be created with the Parquet block size.
> >
> > We are using Drill on HDFS configured with a block size of 128 MB. Changing
> > this size is not an option at this point.
> >
> > It would be ideal for us to have a single parquet file per HDFS block. Setting
> > store.parquet.block-size to 128 MB would fix our issue, but we would end up
> > with a lot more files to deal with.
> >
> > Thanks
> > Francois
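For reference, a minimal sketch of the FileSystem.create overload linked above, which takes a per-file block size as its last argument. The output path and the 512 MB parquet block size are made-up values for illustration, not something Drill does today; this only shows how a writer could, in principle, make a whole parquet file fit in one HDFS block:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SingleBlockWriteSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Hypothetical values, for illustration only.
        Path out = new Path("/tmp/example.parquet");
        long parquetBlockSize = 512L * 1024 * 1024; // per-file HDFS block size
                                                     // matched to the parquet block size

        // create(Path, overwrite, bufferSize, replication, blockSize):
        // the blockSize argument overrides the cluster default (e.g. 128 MB)
        // for this file only, so the whole file can sit in a single block.
        FSDataOutputStream stream = fs.create(
                out,
                true,
                conf.getInt("io.file.buffer.size", 4096),
                fs.getDefaultReplication(),
                parquetBlockSize);
        try {
            // ... write the parquet bytes here ...
        } finally {
            stream.close();
        }
    }
}

Note that the block size has to be passed at creation time; an existing file keeps whatever block size it was written with.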