Hi,
is it possible to append to an already existing avro file when it was written
and closed before?
If I use
outputStream = fs.append(avroFilePath);
then later on I get: java.io.IOException: Invalid sync!
Probably because the schema is written twice, among other issues.
If I use
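A raw stream append rewrites the container header, which is why the reader trips over an invalid sync marker. A minimal sketch of the alternative, assuming a local file and a generated SpecificRecord class MyRecord (both placeholders), is DataFileWriter.appendTo, which reopens the existing file and picks up its schema and sync marker:

```java
import java.io.File;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.specific.SpecificDatumWriter;

// Reopen an existing, closed Avro container file for appending.
// appendTo() reads the existing header (schema + sync marker) instead of
// writing a second header the way FileSystem.append() would.
File avroFile = new File("data.avro");  // existing Avro data file
DataFileWriter<MyRecord> writer =
    new DataFileWriter<MyRecord>(new SpecificDatumWriter<MyRecord>(MyRecord.class));
writer.appendTo(avroFile);   // positions the writer after the last block
writer.append(newRecord);    // newRecord is a placeholder MyRecord instance
writer.close();
```

MyRecord and newRecord are hypothetical names here; the point is only that appending should go through DataFileWriter, not the raw output stream.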
Hi,
maybe a simple question. But is it ok if schemaCache in SpecificData contains
an entry:
class
java.util.ArrayList={"type":"record","name":"ArrayList","namespace":"java.util","fields":[]}
I thought any type of collection should get an array schema type.
I'm using avro 1.6.1
Vyacheslav
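For comparison, a small sketch of the schema one would expect a collection to map to, built directly with the Schema API (string items are just an example element type):

```java
import org.apache.avro.Schema;

// The expected shape for a collection: an Avro array, not an empty record.
Schema expected = Schema.createArray(Schema.create(Schema.Type.STRING));
System.out.println(expected.toString());
// prints something like {"type":"array","items":"string"}
```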
Vyacheslav Zholudev vyacheslav.zholu...@gmail.com wrote:
There is a possible reason:
It seems that there is an upper limit of 10,001 records per reduce input
group. (or is there a setting?)
If I output one million rows with the same key, I get:
Map output records: 1,000,000
Reduce input
One more update:
Running the job with the -XX:-UseLoopPredicate option gave the same results.
The difference between map output records and reduce input records persists.
Thanks!
Vyacheslav
On Aug 17, 2011, at 3:56 AM, Scott Carey wrote:
On 8/16/11 3:56 PM, Vyacheslav Zholudev wrote:
I'm assuming for now that you are using a specific writer and that you have a union
schema with two records, FOO and BAR (you should get two classes, FOO and BAR,
generated by avro-tools):
FOO fooObj =
BAR barObj =
BAR barObj2 =
ByteArrayOutputStream out =
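A hedged completion of the fragment above: serializing instances of the two generated classes against the union schema [FOO, BAR]. The fooObj/barObj values stand in for however the records are actually constructed; FOO and BAR are the hypothetical generated classes from the quoted message.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import org.apache.avro.Schema;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DatumWriter;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.specific.SpecificDatumWriter;

// Build the union schema from the two generated record schemas.
Schema union = Schema.createUnion(
    Arrays.asList(FOO.getClassSchema(), BAR.getClassSchema()));
DatumWriter<Object> writer = new SpecificDatumWriter<Object>(union);

ByteArrayOutputStream out = new ByteArrayOutputStream();
BinaryEncoder enc = EncoderFactory.get().binaryEncoder(out, null);
writer.write(fooObj, enc);   // each write is tagged with its union branch index
writer.write(barObj, enc);
writer.write(barObj2, enc);
enc.flush();
byte[] bytes = out.toByteArray();
```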