Yes, it provides that, but from what I have seen it is a line-by-line update. Please see the link below: https://gist.github.com/QwertyManiac/4724582
This is very slow because of the line-by-line Avro append. I am thinking of something like what we normally do for text files, where we buffer the data up to a certain size and then flush the buffer.

On Mon, Jan 9, 2017 at 3:17 PM, Jörn Franke <jornfra...@gmail.com> wrote:

> Avro itself supports it, but I am not sure if this functionality is
> available through the Spark API. Just out of curiosity, if your use case is
> only writing to HDFS then you might simply use Flume.
>
> On 9 Jan 2017, at 09:58, awkysam <contactsanto...@gmail.com> wrote:
>
> Currently for our project we are collecting data and pushing it into Kafka
> with messages in Avro format. We need to push this data into HDFS; we are
> using Spark Streaming, and in HDFS the data is also stored in Avro format.
> We are partitioning the data per day, so when we write data into HDFS we
> need to append to the same file. Currently we are using GenericRecordWriter
> and we will be using saveAsNewAPIHadoopFile for writing into HDFS. Is there
> a way to append data to a file in HDFS in Avro format using
> saveAsNewAPIHadoopFile?
>
> Thanks,
> Santosh B
>
> ------------------------------
> View this message in context: AVRO Append HDFS using saveAsNewAPIHadoopFile
> <http://apache-spark-user-list.1001560.n3.nabble.com/AVRO-Append-HDFS-using-saveAsNewAPIHadoopFile-tp28292.html>
> Sent from the Apache Spark User List mailing list archive
> <http://apache-spark-user-list.1001560.n3.nabble.com/> at Nabble.com.
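The buffer-then-flush idea mentioned above could be sketched roughly like this. This is a minimal, generic sketch, not the project's actual code: the `BufferedWriter` class, the byte threshold, and the list standing in for the HDFS/Avro sink are all hypothetical placeholders chosen for illustration.

```python
# Sketch of buffer-then-flush: instead of appending each record to HDFS
# one at a time, records accumulate in an in-memory buffer and are written
# out in one batch once the buffer reaches a size threshold. The "sink"
# here is a plain Python list standing in for a real HDFS/Avro writer.

class BufferedWriter:
    def __init__(self, sink, flush_threshold_bytes=1024):
        self.sink = sink                          # stand-in for the real writer
        self.flush_threshold = flush_threshold_bytes
        self.buffer = []
        self.buffered_bytes = 0

    def write(self, record: bytes):
        self.buffer.append(record)
        self.buffered_bytes += len(record)
        if self.buffered_bytes >= self.flush_threshold:
            self.flush()

    def flush(self):
        if self.buffer:
            # One batched write replaces many small appends.
            self.sink.append(b"".join(self.buffer))
            self.buffer.clear()
            self.buffered_bytes = 0


batches = []
w = BufferedWriter(batches, flush_threshold_bytes=10)
for i in range(5):
    w.write(b"rec%d" % i)   # 4 bytes per record
w.flush()                   # flush the remainder before closing
```

With a 10-byte threshold and five 4-byte records, the sink receives two batched writes instead of five individual appends; in the real pipeline the final `flush()` would run when the stream batch or file handle is closed.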