I can't say what made us move to Avro. However, I prefer Avro because you don't have to build the Thrift compiler and you aren't required to do code generation.
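To illustrate the no-codegen point, here is a rough sketch using the Python 2 avro library of that era; the "Event" record and its fields are made up for the example rather than taken from Flume's actual wire schema. The schema is parsed at runtime and plain dicts are serialized directly, so there are no generated classes to build or keep in sync:

    import io
    import avro.schema
    from avro.io import DatumWriter, BinaryEncoder

    # Hypothetical record schema, parsed at runtime; no compiler step needed.
    schema = avro.schema.parse("""
    {
      "type": "record",
      "name": "Event",
      "fields": [
        {"name": "headers", "type": {"type": "map", "values": "string"}},
        {"name": "body", "type": "bytes"}
      ]
    }
    """)

    # Plain dicts are encoded straight to Avro binary; no generated Event class.
    buf = io.BytesIO()
    DatumWriter(schema).write({"headers": {"host": "web01"}, "body": "hello"},
                              BinaryEncoder(buf))
    print(len(buf.getvalue()))

The equivalent Thrift client would need classes produced by the thrift compiler shipped alongside it.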
On Wed, Aug 1, 2012 at 11:06 PM, Juhani Connolly <[email protected]> wrote:

> It looks to me like this was because of the transceiver I was using.
>
> Unfortunately it seems like avro doesn't have a python implementation of a
> transceiver that fits the format expected by netty/avro (in fact it only
> has one transceiver... HTTPTransceiver).
>
> To address this, I'm thinking of putting together a thrift source (the
> legacy source doesn't seem to be usable as it returns nothing, and lacks
> batching). Does this seem like a reasonable solution to making it possible
> to send data to flume from other languages (and allowing backoff on
> failure)? Historically, what made us move away from thrift to avro?
>
>
> On 07/30/2012 05:34 PM, Juhani Connolly wrote:
>
>> I'm playing around with making a standalone tail client in python (so
>> that I can access inode data) that tracks position in a file and then
>> sends it across avro to an avro sink.
>>
>> However I'm having issues with the avro part of this and wondering if
>> anyone more familiar with it could help.
>>
>> I took the flume.avdl file and converted it using "java -jar
>> ~/Downloads/avro-tools-1.6.3.jar idl flume.avdl flume.avpr"
>>
>> I then ran it through a simple test program to see if it's sending the
>> data correctly. It sends from the python client fine, but the sink end
>> OOMs, presumably because the wire format is wrong:
>>
>> 2012-07-30 17:22:57,565 INFO ipc.NettyServer: [id: 0x5fc6e818, /172.22.114.32:55671 => /172.28.19.112:41414] OPEN
>> 2012-07-30 17:22:57,565 INFO ipc.NettyServer: [id: 0x5fc6e818, /172.22.114.32:55671 => /172.28.19.112:41414] BOUND: /172.28.19.112:41414
>> 2012-07-30 17:22:57,565 INFO ipc.NettyServer: [id: 0x5fc6e818, /172.22.114.32:55671 => /172.28.19.112:41414] CONNECTED: /172.22.114.32:55671
>> 2012-07-30 17:22:57,646 WARN ipc.NettyServer: Unexpected exception from downstream.
>> java.lang.OutOfMemoryError: Java heap space
>>     at java.util.ArrayList.<init>(ArrayList.java:112)
>>     at org.apache.avro.ipc.NettyTransportCodec$NettyFrameDecoder.decodePackHeader(NettyTransportCodec.java:154)
>>     at org.apache.avro.ipc.NettyTransportCodec$NettyFrameDecoder.decode(NettyTransportCodec.java:131)
>>     at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:282)
>>     at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:216)
>>     at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:274)
>>     at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:261)
>>     at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:351)
>>     at org.jboss.netty.channel.socket.nio.NioWorker.processSelectedKeys(NioWorker.java:282)
>>     at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:202)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>     at java.lang.Thread.run(Thread.java:619)
>> 2012-07-30 17:22:57,647 INFO ipc.NettyServer: [id: 0x5fc6e818, /172.22.114.32:55671 :> /172.28.19.112:41414] DISCONNECTED
>> 2012-07-30 17:22:57,647 INFO ipc.NettyServer: [id: 0x5fc6e818, /172.22.114.32:55671 :> /172.28.19.112:41414] UNBOUND
>> 2012-07-30 17:22:57,647 INFO ipc.NettyServer: [id: 0x5fc6e818, /172.22.114.32:55671 :> /172.28.19.112:41414] CLOSED
>>
>> I've dumped the test program and its output:
>>
>> http://pastebin.com/1DtXZyTu
>> http://pastebin.com/T9kaqKHY
>>
>

--
Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/
