> Does a parquet file have a size limit (1TB)?

I haven't seen any problem, but 1TB is too big to operate on; it needs to be divided into smaller pieces.

> Should we use SaveMode.Append for a long-running streaming app?

Yes, but you need to partition the data by time so it is easy to maintain, for example to update or delete a specific time range without rebuilding everything.

> How should we store it in HDFS (directory structure, ...)?

Partition the data into smaller files.
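To make the time-partitioning suggestion concrete, here is a minimal sketch of one possible directory layout: one directory per day, named in the key=value style that Spark's partitionBy uses. The base path /data/events, the partition key name "date", and the helper partition_dir are assumptions for illustration only; they are not from this thread.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath

# Hypothetical base directory on HDFS; not taken from the original thread.
BASE_DIR = PurePosixPath("/data/events")


def partition_dir(event_time: datetime, base: PurePosixPath = BASE_DIR) -> PurePosixPath:
    """Return the daily partition directory (date=YYYY-MM-DD) for a record.

    Naive datetimes are assumed to be UTC so the result does not depend
    on the machine's local time zone.
    """
    if event_time.tzinfo is None:
        event_time = event_time.replace(tzinfo=timezone.utc)
    day = event_time.astimezone(timezone.utc).date()
    # key=value directory naming, one directory per day
    return base / f"date={day.isoformat()}"


if __name__ == "__main__":
    print(partition_dir(datetime(2016, 8, 28, 21, 43, tzinfo=timezone.utc)))
```

With a layout like this, deleting or rewriting one day of data touches only that day's directory. In Spark itself, writing with partitionBy on a date column in append mode should produce directories of this shape, but treat the exact column and path names here as placeholders.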
> On Aug 28, 2016, at 9:43 PM, Kevin Tran <kevin...@gmail.com> wrote:
>
> Hi,
> Does anyone know what the best practices are for storing data in parquet files?
> Does a parquet file have a size limit (1TB)?
> Should we use SaveMode.Append for a long-running streaming app?
> How should we store it in HDFS (directory structure, ...)?
>
> Thanks,
> Kevin.