[jira] [Updated] (FLUME-3050) add counters for error conditions and expose to monitor URL

2017-03-05 Thread Yuval Lifshitz (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuval Lifshitz updated FLUME-3050:
--
Summary: add counters for error conditions and expose to monitor URL  (was: 
add error stats to monitor URL)

> add counters for error conditions and expose to monitor URL
> ---
>
> Key: FLUME-3050
> URL: https://issues.apache.org/jira/browse/FLUME-3050
> Project: Flume
>  Issue Type: Improvement
>  Components: Channel, Shell, Sinks+Sources
>Affects Versions: v1.7.0
>Reporter: Yuval Lifshitz
>  Labels: features
>
> Currently, error counters are not present when getting stats. For example:
> {code}
>  > curl http://my-flume-host:4/metrics
> {"SINK.k1":{"ConnectionCreatedCount":"1","ConnectionClosedCount":"0","Type":"SINK","BatchCompleteCount":"0","BatchEmptyCount":"4","EventDrainAttemptCount":"10","StartTime":"1485348138992","EventDrainSuccessCount":"10","BatchUnderflowCount":"1","StopTime":"0","ConnectionFailedCount":"0"},"CHANNEL.c1":{"ChannelCapacity":"100","ChannelFillPercentage":"0.0","Type":"CHANNEL","ChannelSize":"0","EventTakeSuccessCount":"10","EventTakeAttemptCount":"15","StartTime":"1485348138990","EventPutAttemptCount":"10","EventPutSuccessCount":"10","StopTime":"0"},"SOURCE.r1":{"EventReceivedCount":"10","AppendBatchAcceptedCount":"0","Type":"SOURCE","AppendReceivedCount":"0","EventAcceptedCount":"10","StartTime":"1485348138993","AppendAcceptedCount":"0","OpenConnectionCount":"0","AppendBatchReceivedCount":"0","StopTime":"0"}}
> {code}
> This returns only "good" stats for the source, channel and sink.
> To get errors you need to look into the log file. This makes it hard to 
> integrate Flume into automatic monitoring systems, NMS, etc.
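
As a rough illustration of the gap described above, the following Python sketch scans a metrics payload for counters whose names hint at failures. The `SAMPLE` payload is abridged from the response quoted in the description; the helper name `failure_counters` and the keyword list are assumptions made for this example and are not part of Flume.

```python
import json

# Abridged from the /metrics response quoted in the issue description.
SAMPLE = """
{
  "SINK.k1": {"Type": "SINK", "EventDrainAttemptCount": "10",
              "EventDrainSuccessCount": "10", "ConnectionFailedCount": "0"},
  "CHANNEL.c1": {"Type": "CHANNEL", "EventPutAttemptCount": "10",
                 "EventPutSuccessCount": "10", "EventTakeAttemptCount": "15",
                 "EventTakeSuccessCount": "10"},
  "SOURCE.r1": {"Type": "SOURCE", "EventReceivedCount": "10",
                "EventAcceptedCount": "10"}
}
"""


def failure_counters(metrics, keywords=("Error", "Fail")):
    """Map each component to the counter names that look failure-related.

    The keyword match is a heuristic chosen for this sketch only.
    """
    found = {}
    for component, counters in metrics.items():
        found[component] = sorted(
            name for name in counters if any(k in name for k in keywords)
        )
    return found


result = failure_counters(json.loads(SAMPLE))
for component, names in result.items():
    print(component, names)
```

With this payload only the sink reports a failure-style counter (`ConnectionFailedCount`); the channel and source report none, which is the situation the issue describes.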



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLUME-3050) add error stats to monitor URL

2017-03-05 Thread Yuval Lifshitz (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15896189#comment-15896189
 ] 

Yuval Lifshitz commented on FLUME-3050:
---

Hi Tristan,
Yes, you are right. If these are not already available then they will have to be 
added first. I guess that actually exposing them to the API would be the easy 
part.
Regarding comparing attempts with successes, this might be tricky due to the 
nature of sampling at high rates - the numbers might not match even if there 
are no errors. So, triggering alarms based on that might produce false positives.
Will change the title.

Thanks,

Yuval
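
The false-positive concern can be illustrated with a small Python sketch. It assumes a poller that takes two snapshots of one component's counters and compares how much attempts and successes grew in between. The function names and the `tolerance` parameter are hypothetical, and the snapshot values are made up apart from the starting pair (15 attempts, 10 successes), which is taken from the channel in the sample response. That gap presumably comes from takes on an empty channel rather than from failures.

```python
def attempt_success_gap(prev, curr, attempt_key, success_key):
    """Return how many more attempts than successes occurred between snapshots."""
    attempts = int(curr[attempt_key]) - int(prev[attempt_key])
    successes = int(curr[success_key]) - int(prev[success_key])
    return attempts - successes


def should_alarm(prev, curr, attempt_key, success_key, tolerance=0):
    """Naive rule: alarm when the gap in the interval exceeds the tolerance."""
    gap = attempt_success_gap(prev, curr, attempt_key, success_key)
    return gap > tolerance


# Two hypothetical snapshots of a channel's counters (string values, as in /metrics).
prev = {"EventTakeAttemptCount": "15", "EventTakeSuccessCount": "10"}
curr = {"EventTakeAttemptCount": "40", "EventTakeSuccessCount": "30"}

keys = ("EventTakeAttemptCount", "EventTakeSuccessCount")
print(attempt_success_gap(prev, curr, *keys))
print(should_alarm(prev, curr, *keys))                # strict rule fires
print(should_alarm(prev, curr, *keys, tolerance=10))  # looser rule stays quiet
```

A strict rule fires on any gap, including benign ones, while a tolerance hides small real failures. A dedicated error counter avoids having to pick such a threshold.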



[jira] [Commented] (FLUME-3050) add error stats to monitor URL

2017-02-28 Thread Yuval Lifshitz (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15888208#comment-15888208
 ] 

Yuval Lifshitz commented on FLUME-3050:
---

Some more error counters to consider:
* hdfs sink: error in file rotation
* file channel: error in reading/writing to file



[jira] [Commented] (FLUME-3050) add error stats to monitor URL

2017-02-27 Thread Yuval Lifshitz (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15885378#comment-15885378
 ] 

Yuval Lifshitz commented on FLUME-3050:
---

Hi Attila,
Thanks for looking into that. Some error counters we thought about:
* taildir source: fail to read file
* spooldir source: fail to read file; fail to delete file; file changed while 
reading
* tcp source: I assume we don't drop messages, but apply pushback on the socket if 
the channel is full. But do we handle malformed messages? Messages that are too long? 
A connection lost in the middle of a message?
* hdfs sink: fail to write file; connectivity error; failovers
* avro sink: fail to write event; connection error
* kafka sink: not sure about specific errors but there could be some as well
* avro interceptor: conversion failure; since this is based on the Kite SDK, we may 
need an interface that allows third parties to publish stats as well?

Having the above as counters, and not as one-time indicators in the log file, is 
very helpful when integrating with NMS and reporting systems.
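
One way to picture the request is a per-component group of named error counters that is incremented at the point where only a log line is written today, and whose values are reported next to the existing stats. The sketch below is written in Python for brevity and is not Flume's counter API: the class `ErrorCounterGroup` and counter names such as `FileReadErrorCount` are invented for illustration.

```python
import threading
from collections import Counter


class ErrorCounterGroup:
    """Hypothetical thread-safe group of error counters for one component."""

    def __init__(self, component_type, name):
        # Mirrors the "TYPE.name" keys seen in the metrics output, e.g. "SOURCE.r1".
        self.key = f"{component_type}.{name}"
        self._lock = threading.Lock()
        self._counts = Counter()

    def increment(self, counter_name, delta=1):
        """Record one (or delta) occurrences of the named error."""
        with self._lock:
            self._counts[counter_name] += delta

    def snapshot(self):
        """Return counter values as strings, like the existing JSON output."""
        with self._lock:
            return {name: str(value) for name, value in sorted(self._counts.items())}


# Example: a spooldir-like source recording the errors listed above.
spooldir = ErrorCounterGroup("SOURCE", "r1")
spooldir.increment("FileReadErrorCount")
spooldir.increment("FileDeleteErrorCount")
spooldir.increment("FileReadErrorCount")
print({spooldir.key: spooldir.snapshot()})
```

Because the values only ever grow, a monitoring system can alert on their rate of change without parsing log files.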




[jira] [Updated] (FLUME-3050) add error stats to monitor URL

2017-02-02 Thread Yuval Lifshitz (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuval Lifshitz updated FLUME-3050:
--
Description: 
currently error counters are not present when getting stats. for example:
{code}
 > curl http://my-flume-host:4/metrics
{"SINK.k1":{"ConnectionCreatedCount":"1","ConnectionClosedCount":"0","Type":"SINK","BatchCompleteCount":"0","BatchEmptyCount":"4","EventDrainAttemptCount":"10","StartTime":"1485348138992","EventDrainSuccessCount":"10","BatchUnderflowCount":"1","StopTime":"0","ConnectionFailedCount":"0"},"CHANNEL.c1":{"ChannelCapacity":"100","ChannelFillPercentage":"0.0","Type":"CHANNEL","ChannelSize":"0","EventTakeSuccessCount":"10","EventTakeAttemptCount":"15","StartTime":"1485348138990","EventPutAttemptCount":"10","EventPutSuccessCount":"10","StopTime":"0"},"SOURCE.r1":{"EventReceivedCount":"10","AppendBatchAcceptedCount":"0","Type":"SOURCE","AppendReceivedCount":"0","EventAcceptedCount":"10","StartTime":"1485348138993","AppendAcceptedCount":"0","OpenConnectionCount":"0","AppendBatchReceivedCount":"0","StopTime":"0"}}
{code}
return only "good" stats for source, channel and sink.
to get error you need to look into the log file. this makes it hard to 
integrate flume into automatic monitoring systems, NMS etc.


  was:
currently error counters are not present when getting stats. for example:
{{
> curl http://my-flume-host:4/metrics
{"SINK.k1":{"ConnectionCreatedCount":"1","ConnectionClosedCount":"0","Type":"SINK","BatchCompleteCount":"0","BatchEmptyCount":"4","EventDrainAttemptCount":"10","StartTime":"1485348138992","EventDrainSuccessCount":"10","BatchUnderflowCount":"1","StopTime":"0","ConnectionFailedCount":"0"},"CHANNEL.c1":{"ChannelCapacity":"100","ChannelFillPercentage":"0.0","Type":"CHANNEL","ChannelSize":"0","EventTakeSuccessCount":"10","EventTakeAttemptCount":"15","StartTime":"1485348138990","EventPutAttemptCount":"10","EventPutSuccessCount":"10","StopTime":"0"},"SOURCE.r1":{"EventReceivedCount":"10","AppendBatchAcceptedCount":"0","Type":"SOURCE","AppendReceivedCount":"0","EventAcceptedCount":"10","StartTime":"1485348138993","AppendAcceptedCount":"0","OpenConnectionCount":"0","AppendBatchReceivedCount":"0","StopTime":"0"}}
}}
return only "good" stats for source, channel and sink.
to get error you need to look into the log file. this makes it hard to 
integrate flume into automatic monitoring systems, NMS etc.





[jira] [Updated] (FLUME-3050) add error stats to monitor URL

2017-02-02 Thread Yuval Lifshitz (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuval Lifshitz updated FLUME-3050:
--
Description: 
currently error counters are not present when getting stats. for example:
{{
> curl http://my-flume-host:4/metrics
{"SINK.k1":{"ConnectionCreatedCount":"1","ConnectionClosedCount":"0","Type":"SINK","BatchCompleteCount":"0","BatchEmptyCount":"4","EventDrainAttemptCount":"10","StartTime":"1485348138992","EventDrainSuccessCount":"10","BatchUnderflowCount":"1","StopTime":"0","ConnectionFailedCount":"0"},"CHANNEL.c1":{"ChannelCapacity":"100","ChannelFillPercentage":"0.0","Type":"CHANNEL","ChannelSize":"0","EventTakeSuccessCount":"10","EventTakeAttemptCount":"15","StartTime":"1485348138990","EventPutAttemptCount":"10","EventPutSuccessCount":"10","StopTime":"0"},"SOURCE.r1":{"EventReceivedCount":"10","AppendBatchAcceptedCount":"0","Type":"SOURCE","AppendReceivedCount":"0","EventAcceptedCount":"10","StartTime":"1485348138993","AppendAcceptedCount":"0","OpenConnectionCount":"0","AppendBatchReceivedCount":"0","StopTime":"0"}}
}}
return only "good" stats for source, channel and sink.
to get error you need to look into the log file. this makes it hard to 
integrate flume into automatic monitoring systems, NMS etc.


  was:
currently error counters are not present when getting stats. for example:
# curl http://my-flume-host:4/metrics
{"SINK.k1":{"ConnectionCreatedCount":"1","ConnectionClosedCount":"0","Type":"SINK","BatchCompleteCount":"0","BatchEmptyCount":"4","EventDrainAttemptCount":"10","StartTime":"1485348138992","EventDrainSuccessCount":"10","BatchUnderflowCount":"1","StopTime":"0","ConnectionFailedCount":"0"},"CHANNEL.c1":{"ChannelCapacity":"100","ChannelFillPercentage":"0.0","Type":"CHANNEL","ChannelSize":"0","EventTakeSuccessCount":"10","EventTakeAttemptCount":"15","StartTime":"1485348138990","EventPutAttemptCount":"10","EventPutSuccessCount":"10","StopTime":"0"},"SOURCE.r1":{"EventReceivedCount":"10","AppendBatchAcceptedCount":"0","Type":"SOURCE","AppendReceivedCount":"0","EventAcceptedCount":"10","StartTime":"1485348138993","AppendAcceptedCount":"0","OpenConnectionCount":"0","AppendBatchReceivedCount":"0","StopTime":"0"}}

return only "good" stats for source, channel and sink.
to get error you need to look into the log file. this makes it hard to 
integrate flume into automatic monitoring systems, NMS etc





[jira] [Created] (FLUME-3050) add error stats to monitor URL

2017-02-02 Thread Yuval Lifshitz (JIRA)
Yuval Lifshitz created FLUME-3050:
-

 Summary: add error stats to monitor URL
 Key: FLUME-3050
 URL: https://issues.apache.org/jira/browse/FLUME-3050
 Project: Flume
  Issue Type: Improvement
  Components: Channel, Shell, Sinks+Sources
Affects Versions: v1.7.0
Reporter: Yuval Lifshitz


Currently, error counters are not present when getting stats. For example:
# curl http://my-flume-host:4/metrics
{"SINK.k1":{"ConnectionCreatedCount":"1","ConnectionClosedCount":"0","Type":"SINK","BatchCompleteCount":"0","BatchEmptyCount":"4","EventDrainAttemptCount":"10","StartTime":"1485348138992","EventDrainSuccessCount":"10","BatchUnderflowCount":"1","StopTime":"0","ConnectionFailedCount":"0"},"CHANNEL.c1":{"ChannelCapacity":"100","ChannelFillPercentage":"0.0","Type":"CHANNEL","ChannelSize":"0","EventTakeSuccessCount":"10","EventTakeAttemptCount":"15","StartTime":"1485348138990","EventPutAttemptCount":"10","EventPutSuccessCount":"10","StopTime":"0"},"SOURCE.r1":{"EventReceivedCount":"10","AppendBatchAcceptedCount":"0","Type":"SOURCE","AppendReceivedCount":"0","EventAcceptedCount":"10","StartTime":"1485348138993","AppendAcceptedCount":"0","OpenConnectionCount":"0","AppendBatchReceivedCount":"0","StopTime":"0"}}

This returns only "good" stats for the source, channel and sink.
To get errors you need to look into the log file. This makes it hard to 
integrate Flume into automatic monitoring systems, NMS, etc.
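
To make the integration point concrete, here is a minimal Python sketch of how a monitoring agent might turn the JSON response into flat name/value pairs. The function name `flatten_metrics` and the abridged `PAYLOAD` are assumptions for this example. A real poller would fetch the payload from the metrics URL instead of using a literal.

```python
import json

# Abridged from the CHANNEL.c1 entry of the response shown above.
PAYLOAD = (
    '{"CHANNEL.c1": {"ChannelCapacity": "100", "ChannelFillPercentage": "0.0", '
    '"Type": "CHANNEL", "ChannelSize": "0"}}'
)


def flatten_metrics(payload):
    """Convert the nested metrics JSON into sorted (name, numeric value) pairs."""
    metrics = json.loads(payload)
    pairs = []
    for component in sorted(metrics):
        for counter, value in sorted(metrics[component].items()):
            if counter == "Type":
                continue  # "Type" is a label, not a numeric counter
            # Values arrive as strings; float() handles both "100" and "0.0".
            pairs.append((f"{component}.{counter}", float(value)))
    return pairs


for name, value in flatten_metrics(PAYLOAD):
    print(name, value)
```

Any error counters added to the response would flow through such a poller unchanged, whereas errors that exist only in the log file never reach it.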



