Yeah, I agree. FWIW, I am hoping that in a few weeks I will have a little more spare time; I was planning on writing the Avro patches to ensure languages such as Python, C#, etc. could write messages to Flume.
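
In the meantime, for anyone experimenting from Python against the Netty-based Avro source, here is a very rough sketch of the wire framing I believe Avro's NettyTransportCodec expects; treat the header layout, host and port as my reading/assumptions rather than anything verified end-to-end. It would also explain the OutOfMemoryError in the trace below: the HTTP transceiver frames buffers differently, so the Netty frame decoder reads a garbage frame count and tries to allocate an enormous list.

import socket
import struct


def send_netty_pack(sock, serial, frames):
    """Send one Netty-style Avro pack: a header of two 4-byte big-endian ints
    (serial number, frame count), then each frame as a 4-byte big-endian
    length followed by its bytes."""
    sock.sendall(struct.pack(">ii", serial, len(frames)))
    for frame in frames:
        sock.sendall(struct.pack(">i", len(frame)))
        sock.sendall(frame)


# Hypothetical usage: `frames` would still need to hold the serialized Avro
# handshake and call payload that avro's Requestor normally produces, which is
# the part the Python library currently only wires up for HTTP transport.
# sock = socket.create_connection(("flume-host", 41414))
# send_netty_pack(sock, 1, frames)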
On Fri, Aug 3, 2012 at 1:30 AM, Juhani Connolly <[email protected]> wrote:
> On paper it certainly seems like a good solution; it's just unfortunate
> that some "supported" languages can't actually interface to it. I
> understand that thrift can be quite a nuisance to deal with at times.
>
> On 08/02/2012 11:01 PM, Brock Noland wrote:
>> I cannot answer what made us move to Avro. However, I prefer Avro because
>> you don't have to build the thrift compiler and you aren't required to do
>> code generation.
>>
>> On Wed, Aug 1, 2012 at 11:06 PM, Juhani Connolly <[email protected]> wrote:
>>> It looks to me like this was because of the transceiver I was using.
>>>
>>> Unfortunately, it seems like avro doesn't have a python implementation of
>>> a transceiver that fits the format expected by netty/avro (in fact it
>>> only has one transceiver... HTTPTransceiver).
>>>
>>> To address this, I'm thinking of putting together a thrift source (the
>>> legacy source doesn't seem to be usable, as it returns nothing and lacks
>>> batching). Does this seem like a reasonable solution to making it
>>> possible to send data to flume from other languages (and allowing
>>> backoff on failure)? Historically, what made us move away from thrift
>>> to avro?
>>>
>>> On 07/30/2012 05:34 PM, Juhani Connolly wrote:
>>>> I'm playing around with making a standalone tail client in python (so
>>>> that I can access inode data) that tracks position in a file and then
>>>> sends it across avro to an avro sink.
>>>>
>>>> However, I'm having issues with the avro part of this and wondering if
>>>> anyone more familiar with it could help.
>>>>
>>>> I took the flume.avdl file and converted it using "java -jar
>>>> ~/Downloads/avro-tools-1.6.3.jar idl flume.avdl flume.avpr".
>>>>
>>>> I then ran it through a simple test program to see if it's sending the
>>>> data correctly. It sends from the python client fine, but the sink end
>>>> OOMs, presumably because the wire format is wrong:
>>>>
>>>> 2012-07-30 17:22:57,565 INFO ipc.NettyServer: [id: 0x5fc6e818, /172.22.114.32:55671 => /172.28.19.112:41414] OPEN
>>>> 2012-07-30 17:22:57,565 INFO ipc.NettyServer: [id: 0x5fc6e818, /172.22.114.32:55671 => /172.28.19.112:41414] BOUND: /172.28.19.112:41414
>>>> 2012-07-30 17:22:57,565 INFO ipc.NettyServer: [id: 0x5fc6e818, /172.22.114.32:55671 => /172.28.19.112:41414] CONNECTED: /172.22.114.32:55671
>>>> 2012-07-30 17:22:57,646 WARN ipc.NettyServer: Unexpected exception from downstream.
>>>> java.lang.OutOfMemoryError: Java heap space
>>>>     at java.util.ArrayList.<init>(ArrayList.java:112)
>>>>     at org.apache.avro.ipc.NettyTransportCodec$NettyFrameDecoder.decodePackHeader(NettyTransportCodec.java:154)
>>>>     at org.apache.avro.ipc.NettyTransportCodec$NettyFrameDecoder.decode(NettyTransportCodec.java:131)
>>>>     at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:282)
>>>>     at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:216)
>>>>     at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:274)
>>>>     at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:261)
>>>>     at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:351)
>>>>     at org.jboss.netty.channel.socket.nio.NioWorker.processSelectedKeys(NioWorker.java:282)
>>>>     at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:202)
>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>>     at java.lang.Thread.run(Thread.java:619)
>>>>
>>>> 2012-07-30 17:22:57,647 INFO ipc.NettyServer: [id: 0x5fc6e818, /172.22.114.32:55671 :> /172.28.19.112:41414] DISCONNECTED
>>>> 2012-07-30 17:22:57,647 INFO ipc.NettyServer: [id: 0x5fc6e818, /172.22.114.32:55671 :> /172.28.19.112:41414] UNBOUND
>>>> 2012-07-30 17:22:57,647 INFO ipc.NettyServer: [id: 0x5fc6e818, /172.22.114.32:55671 :> /172.28.19.112:41414] CLOSED
>>>>
>>>> I've dumped the test program and its output:
>>>>
>>>> http://pastebin.com/1DtXZyTu
>>>> http://pastebin.com/T9kaqKHY

--
Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/
