Yeah, I agree. FWIW, I am hoping that in a few weeks I will have a little
more spare time, and I was planning on writing the Avro patches to ensure
languages such as Python, C#, etc. can write messages to Flume.
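For anyone experimenting in the meantime, the framing that tripped up the Python client can be produced with plain struct packing. As far as I can tell from the NettyTransportCodec stack trace below, the Netty server expects a packet header of two big-endian ints (a packet serial and a frame count) followed by each frame as a 4-byte big-endian length prefix plus its bytes. This is only a sketch of that framing layer under that assumption, not a full transceiver, and the function name is mine:

```python
import struct

def netty_frame(serial, buffers):
    """Wrap already-serialized Avro buffers in the packet layout that
    NettyTransportCodec appears to use: two big-endian ints (packet
    serial, number of frames), then each frame as a 4-byte big-endian
    length followed by its bytes."""
    packet = struct.pack(">ii", serial, len(buffers))
    for buf in buffers:
        packet += struct.pack(">i", len(buf)) + buf
    return packet

# A single 3-byte frame in packet serial 1:
# 8-byte header + 4-byte length prefix + payload = 15 bytes total.
example = netty_frame(1, [b"abc"])
```

If the header is malformed (e.g. HTTP-style framing instead), the server-side decoder reads a garbage frame count, which would explain the ArrayList allocation blowing the heap in the trace below.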

On Fri, Aug 3, 2012 at 1:30 AM, Juhani Connolly <
[email protected]> wrote:

> On paper it certainly seems like a good solution; it's just unfortunate
> that some "supported" languages can't actually interface with it. I
> understand that Thrift can be quite a nuisance to deal with at times.
>
>
> On 08/02/2012 11:01 PM, Brock Noland wrote:
>
>> I cannot answer what made us move to Avro. However, I prefer Avro because
>> you don't have to build the thrift compiler and you aren't required to do
>> code generation.
>>
>> On Wed, Aug 1, 2012 at 11:06 PM, Juhani Connolly <
>> [email protected]> wrote:
>>
>>
>>  It looks to me like this was because of the transceiver I was using.
>>>
>>> Unfortunately, it seems Avro doesn't have a Python implementation of a
>>> transceiver that fits the format expected by Netty/Avro (in fact it
>>> only has one transceiver: HTTPTransceiver).
>>>
>>> To address this, I'm thinking of putting together a Thrift source (the
>>> legacy source doesn't seem usable, as it returns nothing and lacks
>>> batching). Does this seem like a reasonable way to make it possible to
>>> send data to Flume from other languages (and to allow backoff on
>>> failure)? Historically, what made us move away from Thrift to Avro?
>>>
>>>
>>> On 07/30/2012 05:34 PM, Juhani Connolly wrote:
>>>
>>>  I'm playing around with making a standalone tail client in Python (so
>>>> that I can access inode data) that tracks its position in a file and
>>>> then sends it across Avro to an Avro sink.
>>>>
>>>> However, I'm having issues with the Avro part of this, and am wondering
>>>> if anyone more familiar with it could help.
>>>>
>>>> I took the flume.avdl file and converted it using "java -jar
>>>> ~/Downloads/avro-tools-1.6.3.jar idl flume.avdl flume.avpr"
>>>>
>>>> I then ran it through a simple test program to see if it's sending the
>>>> data correctly. It sends from the Python client fine, but the sink end
>>>> OOMs, presumably because the wire format is wrong:
>>>>
>>>> 2012-07-30 17:22:57,565 INFO ipc.NettyServer: [id: 0x5fc6e818, /
>>>> 172.22.114.32:55671 => /172.28.19.112:41414] OPEN
>>>> 2012-07-30 17:22:57,565 INFO ipc.NettyServer: [id: 0x5fc6e818, /
>>>> 172.22.114.32:55671 => /172.28.19.112:41414] BOUND: /
>>>> 172.28.19.112:41414
>>>> 2012-07-30 17:22:57,565 INFO ipc.NettyServer: [id: 0x5fc6e818, /
>>>> 172.22.114.32:55671 => /172.28.19.112:41414] CONNECTED: /
>>>> 172.22.114.32:55671
>>>> 2012-07-30 17:22:57,646 WARN ipc.NettyServer: Unexpected exception from
>>>> downstream.
>>>> java.lang.OutOfMemoryError: Java heap space
>>>>          at java.util.ArrayList.<init>(ArrayList.java:112)
>>>>          at org.apache.avro.ipc.NettyTransportCodec$NettyFrameDecoder.decodePackHeader(NettyTransportCodec.java:154)
>>>>          at org.apache.avro.ipc.NettyTransportCodec$NettyFrameDecoder.decode(NettyTransportCodec.java:131)
>>>>          at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:282)
>>>>          at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:216)
>>>>          at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:274)
>>>>          at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:261)
>>>>          at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:351)
>>>>          at org.jboss.netty.channel.socket.nio.NioWorker.processSelectedKeys(NioWorker.java:282)
>>>>          at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:202)
>>>>          at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>>          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>>          at java.lang.Thread.run(Thread.java:619)
>>>>
>>>> 2012-07-30 17:22:57,647 INFO ipc.NettyServer: [id: 0x5fc6e818, /
>>>> 172.22.114.32:55671 :> /172.28.19.112:41414] DISCONNECTED
>>>> 2012-07-30 17:22:57,647 INFO ipc.NettyServer: [id: 0x5fc6e818, /
>>>> 172.22.114.32:55671 :> /172.28.19.112:41414] UNBOUND
>>>> 2012-07-30 17:22:57,647 INFO ipc.NettyServer: [id: 0x5fc6e818, /
>>>> 172.22.114.32:55671 :> /172.28.19.112:41414] CLOSED
>>>>
>>>> I've dumped the test program and its output:
>>>>
>>>> http://pastebin.com/1DtXZyTu
>>>> http://pastebin.com/T9kaqKHY
>>>>
>>>>
>>>>
>>
>


-- 
Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/
