Perfect Iain. Worked like a charm.

> On Aug 31, 2015, at 11:19 AM, iain wright <[email protected]> wrote:
> 
> I'd expect it to work with any source; I've used it with the exec and 
> spooling directory sources.
> 
> Cheers,
> 
> -- 
> Iain Wright
> 
> This email message is confidential, intended only for the recipient(s) named 
> above and may contain information that is privileged, exempt from disclosure 
> under applicable law. If you are not the intended recipient, do not disclose 
> or disseminate the message to anyone except the intended recipient. If you 
> have received this message in error, or are not the named recipient(s), 
> please immediately notify the sender by return email, and delete all copies 
> of this message.
> 
> On Mon, Aug 31, 2015 at 11:14 AM, Guyle M. Taber <[email protected]> wrote:
> Fantastic.
> 
> So with this deserializer setting, it’s not dependent on the source being a 
> logger type?
> 
> 
>> On Aug 31, 2015, at 11:12 AM, iain wright <[email protected]> wrote:
>> 
>> Hi Guyle,
>> 
>> We ran into the same thing.
>> 
>> Please see https://flume.apache.org/FlumeUserGuide.html#line
>> 
>> On the originating source (where the event first enters Flume), 
>> increase maxLineLength, e.g.:
>> ...
>> agent1.sources.source1.deserializer.maxLineLength = 1048576
>> ...
>> 
>> Best,
>> 
>> -- 
>> Iain Wright
>> 
>> On Mon, Aug 31, 2015 at 11:03 AM, Guyle M. Taber <[email protected]> wrote:
>> I’m using an Avro sink to send events to HDFS, and we’re seeing long 
>> content lines get truncated at about the 2060-character mark. How can I 
>> prevent long lines from being truncated when using an Avro sink in this 
>> fashion?
>> 
>> Here’s a snippet of an event from the raw logs before flume is involved. 
>> I’ve toggled hidden characters so you can see the EOL character being 
>> inserted, which breaks up the event into two lines.
>> 
>> …utm_campaign=%E5%81%A5%E5%BA%B7%E7%BE%8E%E6%8A%A4&camp=%E5%81%A5%E5%BA%B7%E7%BE%8E%E6%8A%A4^Isearch-term[=]^Isession-id[=]720D69AB19F1DD17D27A948C9B31D380^Istore-id[=]^Itracking-ticket-id[=]^Itracking-ticket-number[=]^Ievent-session-id[=]98df4905-51ab-43a9-92d9-35d879a69b9a
>>  $
>> 
>> Here’s a snippet of an event that gets truncated.
>> 
>> …utm_campaign=%E5%81%A5%E5%BA%B7%E7%BE%8E%E6%8A%A4&camp=%E5%81%A5%E5%BA%$
>> 
>> B7%E7%BE%8E%E6%8A%A4^Isearch-term[=]^Isession-id[=]720D69AB19F1DD17D27A948C9B31D380^Istore-id[=]^Itracking-ticket-id[=]^Itracking-ticket-number[=]^Ievent-session-id[=]98df4905-51ab-43a9-92d9-35d879a69b9a
>>  $
>> 
>> Here is our sink on the sending node.
>> 
>> agent.sinks = AvroSink
>> agent.sinks.AvroSink.type = avro
>> agent.sinks.AvroSink.channel = memoryChannel
>> agent.sinks.AvroSink.hostname = flume.mydomain.int
>> agent.sinks.AvroSink.port = 4169
>> agent.sinks.AvroSink.batchSize = 0
>> agent.sinks.AvroSink.rollSize = 0
>> agent.sinks.AvroSink.rollInterval = 0
>> agent.sinks.AvroSink.rollCount = 0
>> agent.sinks.AvroSink.idleTimeout = 0
>> agent.sinks.AvroSink.useLocalTimeStamp = true
>> 
>> Here is our sink on the HDFS receiving side.
>> 
>> dp1.sinks.sinkCN.type = hdfs
>> dp1.sinks.sinkCN.channel = channelCN
>> dp1.sinks.sinkCN.hdfs.filePrefix = %{basename}-
>> dp1.sinks.sinkCN.hdfs.path = hdfs://sf1-hadoopnn1.mydomain.int/flume/events/ods/cn/fe_event/%{host}/%y-%m-%d
>> dp1.sinks.sinkCN.hdfs.fileType = DataStream
>> dp1.sinks.sinkCN.hdfs.writeFormat = Text
>> dp1.sinks.sinkCN.hdfs.rollSize = 0
>> dp1.sinks.sinkCN.hdfs.rollCount = 0
>> dp1.sinks.sinkCN.hdfs.batchSize = 5000
>> 
> 
> 
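
For anyone finding this thread later, here is a minimal sketch of where the maxLineLength setting from the reply above might sit in a full source definition. It assumes a spooling directory source with the LINE deserializer. The agent, source, and channel names and the spoolDir path are placeholders, not taken from the original poster's setup; only the maxLineLength value comes from the thread.

```properties
# Hypothetical names (agent1, source1, channel1) and path; adjust to your agent.
agent1.sources = source1
agent1.channels = channel1

# Spooling directory source: the point where events first enter Flume.
agent1.sources.source1.type = spooldir
agent1.sources.source1.spoolDir = /var/log/myapp/spool
agent1.sources.source1.channels = channel1

# LINE deserializer; the Flume User Guide lists a default maxLineLength of 2048,
# so longer lines get cut unless the limit is raised (value from the thread).
agent1.sources.source1.deserializer = LINE
agent1.sources.source1.deserializer.maxLineLength = 1048576

agent1.channels.channel1.type = memory
```

The setting belongs on the first agent that reads the raw lines, not on the Avro sink or the HDFS sink downstream, since by then the event body has already been split.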
