Thanks for your help! ^_^

On December 30, 2011 at 3:16 PM, Eric Yang <[email protected]> wrote:
> Data is discarded if the in-memory queue is full. The current
> implementation preserves the system rather than the data. If you want
> full reliability, I recommend writing to a file and using the
> utf8filetailing adaptor so that the transport of every entry is
> tracked. In production there are usually many collectors, for both
> high availability and throughput, so agents are unlikely to fill up
> the in-memory queue. There are still areas for improvement, e.g.
> adding an algorithm to discard either the most recent or the oldest
> data. Patches are welcome. :)
>
> Sent from my iPhone
>
> On Dec 29, 2011, at 10:38 PM, 陈镇海 <[email protected]> wrote:
>
>> Hi Eric,
>> When no collector is available, data is stored in the in-memory
>> queue. In that case, if the amount of data is large and memory is
>> limited, will the agent run out of memory, and will data be lost?
>>
>> 2011/12/30 Eric Yang <[email protected]>:
>>> Hi,
>>>
>>> Data is stored in the Agent's in-memory queue. The Agent queues
>>> messages if no collector is available. The data is out of order in
>>> chukwa/repos because it does not contain a time stamp. The demux
>>> parser does not know how to sort the given data, so the data is
>>> stored in random order. You might be able to improve the ordering
>>> by modifying the demux parser to sort by SeqID, which would recover
>>> the original order. Hope this helps.
>>>
>>> regards,
>>> Eric
>>>
>>> On Thu, Dec 29, 2011 at 6:22 PM, 陈镇海 <[email protected]> wrote:
>>>> Hello,
>>>> I'm using chukwa-0.4.0. The agent and collector are on the same
>>>> machine. When I use UDPAdaptor, I run into a problem.
>>>> The initial_adaptor file contains "add UDPAdaptor Packets 1234 0".
>>>> After starting the agent, the collector and start_data_processor,
>>>> I use "nc" to send some data to this UDP port as follows:
>>>> echo "hello" | nc -u 127.0.0.1
>>>> echo "world" | nc -u 127.0.0.1
>>>> echo "this is a test" | nc -u 127.0.0.1
>>>> echo "good job" | nc -u 127.0.0.1
>>>> echo "OK" | nc -u 127.0.0.1
>>>> After it had run for a while, I found data written to HDFS. In the
>>>> directory "/chukwa/dataSinkArchives" the data was written in the
>>>> correct order. But in the directory "/chukwa/repos" the data was
>>>> written in the wrong order, as follows:
>>>> ............body this is a test
>>>> ............body OK
>>>> ............body good job
>>>> ............body hello
>>>> ............body world
>>>> How did this happen?
>>>> Another question: when I keep the agent running and stop the
>>>> collector, I continue to send data to the UDP port. After a while,
>>>> when I start the collector again, I find that the data was not
>>>> lost. I would like to know how and where the data is stored.
>>>> Thanks a lot.
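
For anyone who wants to experiment with the improvement Eric mentions (choosing whether a full queue discards the most recent or the oldest data), here is a minimal sketch in Python. It is not Chukwa's code and does not use any Chukwa API: the class name BoundedChunkQueue, the policy names "drop_newest" and "drop_oldest", and the "dropped" counter are all made up for illustration.

```python
from collections import deque


class BoundedChunkQueue:
    """Hypothetical bounded in-memory queue; not Chukwa's implementation."""

    def __init__(self, capacity, policy="drop_newest"):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        if policy not in ("drop_newest", "drop_oldest"):
            raise ValueError("unknown policy: " + policy)
        self.capacity = capacity
        self.policy = policy
        self.items = deque()
        self.dropped = 0  # number of chunks discarded so far

    def put(self, chunk):
        """Queue a chunk; return False if the incoming chunk was discarded."""
        if len(self.items) < self.capacity:
            self.items.append(chunk)
            return True
        # Queue is full: one chunk has to go, so memory use stays bounded.
        self.dropped += 1
        if self.policy == "drop_oldest":
            self.items.popleft()      # evict the oldest queued chunk
            self.items.append(chunk)  # keep the incoming one
            return True
        return False                  # drop_newest: discard the incoming chunk

    def drain(self):
        """Hand over everything queued, e.g. once a collector is reachable."""
        out = list(self.items)
        self.items.clear()
        return out


if __name__ == "__main__":
    q = BoundedChunkQueue(capacity=2, policy="drop_oldest")
    for body in ["hello", "world", "this is a test"]:
        q.put(body)
    print(q.drain(), "dropped:", q.dropped)
```

Either policy keeps the agent from running out of memory; the difference is only which end of the backlog is sacrificed while no collector is available.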

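On the ordering question, Eric's suggestion is to sort by SeqID when the data carries no time stamp. The sketch below only illustrates the idea and is not the demux parser: it assumes each record carries a numeric sequence ID that increases in send order, and the record layout (a dict with "seq_id" and "body") and the ID values are invented for the example.

```python
def order_by_seq_id(records):
    """Return records sorted by their (assumed) increasing sequence ID."""
    return sorted(records, key=lambda record: record["seq_id"])


if __name__ == "__main__":
    # The bodies from the thread, in the order they appeared in
    # /chukwa/repos; the seq_id values are invented for illustration.
    stored = [
        {"seq_id": 3, "body": "this is a test"},
        {"seq_id": 5, "body": "OK"},
        {"seq_id": 4, "body": "good job"},
        {"seq_id": 1, "body": "hello"},
        {"seq_id": 2, "body": "world"},
    ]
    for record in order_by_seq_id(stored):
        print(record["body"])
```

This only recovers the send order if the sequence IDs really are monotonic per source, which would need to be checked against the adaptor in use.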