[hdfs] [metrics] RpcAuthenticationSuccesses

2024-03-09 Thread Anatoly
Hi, there is a question about two HDFS metrics that arose from my attempts to calculate the load on the KDC for a production cluster. There are two parameters in HDFS metrics: RpcAuthenticationSuccesses - Total number of successful authentication attempts; RpcAuthenticationFailures - Total
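Each successful SASL/Kerberos authentication counted by RpcAuthenticationSuccesses corresponds to a ticket exchange against the KDC, so KDC load can be estimated from the counter's growth rate between two scrapes. A minimal sketch, assuming the counter is read from the NameNode's /jmx endpoint; the bean name and values below are made up for illustration:

```python
import json

# Two hypothetical snapshots of the NameNode /jmx output, taken 60 s apart.
# The bean name and port are examples; check your own /jmx output.
snap_t0 = json.loads('{"beans": [{"name": "Hadoop:service=NameNode,name=RpcActivityForPort8020",'
                     ' "RpcAuthenticationSuccesses": 12000, "RpcAuthenticationFailures": 3}]}')
snap_t1 = json.loads('{"beans": [{"name": "Hadoop:service=NameNode,name=RpcActivityForPort8020",'
                     ' "RpcAuthenticationSuccesses": 12600, "RpcAuthenticationFailures": 3}]}')

def auth_successes(snapshot):
    """Pull the success counter out of the first RPC activity bean."""
    for bean in snapshot["beans"]:
        if "RpcActivityForPort" in bean["name"]:
            return bean["RpcAuthenticationSuccesses"]
    raise KeyError("no RPC activity bean found")

interval_s = 60
rate = (auth_successes(snap_t1) - auth_successes(snap_t0)) / interval_s
print(f"~{rate:.1f} successful authentications/sec")
```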

Metrics polling

2022-10-14 Thread Daniel Kashtan
Hello, I am running Accumulo and using the JMX_Exporter<https://github.com/prometheus/jmx_exporter> to gather metrics from Hadoop and Accumulo. I am not sure which property controls how often metrics are updated. Is it here? https://github.com/c9n/hadoop/blob/master/hadoop-common-project/
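On the Hadoop side, the snapshot interval of the metrics2 system is set in hadoop-metrics2.properties rather than in the JMX exporter configuration. A sketch (the per-prefix override is an example; check your version's metrics2 documentation):

```properties
# hadoop-metrics2.properties
# Default snapshot period for all metrics prefixes, in seconds
*.period=10
# Per-prefix override, e.g. for the NameNode daemon
namenode.period=30
```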

Re: Hadoop prometheus metrics

2021-08-19 Thread Akira Ajisaka
Hi Shailesh, This feature is not implemented in Hadoop 3.2.1. Please try Hadoop 3.3.1. -Akira On Sat, Aug 14, 2021 at 5:56 PM Shailesh Ligade wrote: > > Hello, > > I am using hadoop 3.2.1 and saw the documentation that I can add > > hadoop.prometheus.endpoint.enabled to true in core-site.xml yo
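For reference, on Hadoop 3.3+ the setting discussed in this thread would look roughly like this (a sketch; the default NameNode HTTP port 9870 is assumed):

```xml
<!-- core-site.xml: enable the built-in Prometheus metrics endpoint (Hadoop 3.3+) -->
<property>
  <name>hadoop.prometheus.endpoint.enabled</name>
  <value>true</value>
</property>
```

After restarting the NameNode, the metrics should then be scrapable from the /prom path on the NameNode web UI port.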

Hadoop prometheus metrics

2021-08-14 Thread Shailesh Ligade
Hello, I am using hadoop 3.2.1 and saw in the documentation that I can set hadoop.prometheus.endpoint.enabled to true in core-site.xml to get a /prom endpoint on the namenode web UI page. (http://:9870/prom) But it is not working for me. Any ideas? Thanks -S

hadoop metrics in prometheus

2021-08-13 Thread Ligade, Shailesh [USA]
Hello, I am using hadoop 3.2.1 and saw in the documentation that I can set hadoop.prometheus.endpoint.enabled to true in core-site.xml to get a /prom endpoint on the namenode web UI page. (http://:9870/prom) But it is not working for me. Any ideas? Thanks -S

about Yarn Cluster Metrics API's return value

2019-03-18 Thread daily
I am confused about the YARN Cluster Metrics API ( http://rm-http-address:port/ws/v1/cluster/metrics ); it doesn't return the correct info in my cluster. Did I misunderstand some properties? (hadoop version=2.7.3) 1) The number of applications in my cluster is like below ( total=18) : ACC
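The counters returned by this endpoint are cumulative totals since the active ResourceManager started, not daily figures, so appsSubmitted should equal the sum of the live and terminal application counts. A sketch with made-up numbers showing that consistency check:

```python
import json

# A trimmed sample of what /ws/v1/cluster/metrics returns (values are made up).
payload = json.loads("""
{"clusterMetrics": {
   "appsSubmitted": 18, "appsCompleted": 14, "appsRunning": 2,
   "appsPending": 1, "appsFailed": 1, "appsKilled": 0}}
""")

m = payload["clusterMetrics"]
# appsSubmitted is a lifetime total for the active RM, so the
# other app states should account for all of it.
accounted = (m["appsCompleted"] + m["appsRunning"] + m["appsPending"]
             + m["appsFailed"] + m["appsKilled"])
print(m["appsSubmitted"], accounted)
```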

Which metrics to alert on (seeking expert advice)?

2018-04-09 Thread Mark Bonetti
Hi, I'm building a monitoring system for Hadoop and want to set up default alerts (threshold or anomaly) on 2-3 key metrics everyone who uses Hadoop would typically want to alert on, but I don't yet have production-grade experience with Hadoop. Alert rules have to be generally useful, s

Re: Representing hadoop metrics on ganglia web interface

2017-09-01 Thread Akira Ajisaka
Hi Nishant, Multicast is used to communicate between Ganglia daemons by default and it is banned in AWS EC2. Would you try unicast setting? Regards, Akira On 2017/08/04 12:37, Nishant Verma wrote: Hello We are supposed to collect hadoop metrics and see the cluster health and performance

Representing hadoop metrics on ganglia web interface

2017-08-04 Thread Nishant Verma
Hello We are supposed to collect hadoop metrics and see the cluster health and performance. I was going through the link below from Apache, which lists the different metrics exposed by Hadoop. https://hadoop.apache.org/docs/r2.7.3/hadoop-project-dist/hadoop-common/Metrics.html I also read that we can

Gathering counters and metrics of individual hadoop jobs

2016-09-13 Thread Sergey Zhemzhitsky
Hello hadoop experts! I'm looking for a way to gather all the metrics and counters of individual jobs as well as the whole cluster in the event-driven way to store all this data within elasticsearch for later troubleshooting and analysis. Using metric exporters seems to be the right w

question about TotalWriteTime in DataNode metrics

2016-09-12 Thread Steven Rand
Hi all, I was looking at the JMX metrics from one of my DataNodes and noticed that if I divide BytesWritten by TotalWriteTime, I get a result of about 1000 Mb/sec. However, I can only dd to the disk storing HDFS data on that node at about 95 Mb/sec. To try to figure out what's going
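The arithmetic itself is simple (assuming TotalWriteTime is reported in milliseconds); a ratio far above the raw disk speed usually means writes are acknowledged once they reach the OS page cache, and TotalWriteTime also sums time across concurrent writers, so the quotient is not wall-clock throughput. Illustrative, made-up values:

```python
# Made-up counter values illustrating the calculation in the question.
bytes_written = 3_000_000_000_000   # BytesWritten (bytes)
total_write_time_ms = 3_000_000     # TotalWriteTime (ms, summed over all write calls)

mb_per_sec = (bytes_written / 1e6) / (total_write_time_ms / 1000)
# ~1000 MB/s: the page cache absorbs writes faster than the disk can,
# so this is not comparable to a dd measurement against the raw device.
print(f"{mb_per_sec:.0f} MB/s")
```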

Logging garbage collection metrics into yarn nodemanager application log directory for mappers in Hadoop

2016-09-01 Thread Gopi Krishnan
What is the correct way to log garbage collection metrics into the same location where the yarn syslog/stderr/stdout logs are located for mappers in Hadoop YARN? Any insights would be helpful. Here are the settings I tried in mapred-site.xml, but no gc logs were available in the location
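One approach (a sketch; the heap size and exact GC flags are placeholders) is to put the GC flags in the mapper JVM options and point -Xloggc at the container log directory. YARN expands the literal token <LOG_DIR> in container commands to each container's log directory, which is where the mapper's syslog/stdout/stderr already live; note the XML entity escaping needed inside the value:

```xml
<!-- mapred-site.xml: &lt;LOG_DIR&gt; parses to the literal token <LOG_DIR>,
     which YARN substitutes at container launch -->
<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx1024m -verbose:gc -XX:+PrintGCDetails -Xloggc:&lt;LOG_DIR&gt;/gc.log</value>
</property>
```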

Hadoop metrics can be only sent to the first ganglia gmond server

2016-08-30 Thread Jerry Xin
Hi all, I am currently using Ganglia to receive Hadoop metrics, but I find that metrics are only sent to the first gmond server. I did some digging and found out that GangliaSink (AbstractGangliaSink) parses metric servers using conf.getString (conf is SubsetConfiguration), which returns only the first

Seeing DataNode's Metrics

2016-08-12 Thread Atri Sharma
Hi All, I am adding some new metrics in Hadoop-common/client and I registered a new class with DefaultMetricsSystem with "rpc" as the tag. If I read correctly rpc metrics are tracked in DataNode. Can anyone please guide me as to the standard way of seeing DataNode/rpc metrics? Regards, Atri

metrics api

2016-08-11 Thread rammohan ganapavarapu
Hi, Is the metrics api available in hadoop 2.7.1? I am trying "curl -i http://localhost:50070/metrics" but it's giving an empty 200 response. Ram

Re: Filtering hadoop metrics

2015-04-23 Thread Akmal Abbasov
Hi, I was wondering if there is a way to disable some of the sources of metrics in Hadoop? In fact I do not want to receive the ugi, metricssystem and default metrics. Is there any way to achieve this? Thank you. > On 21 Apr 2015, at 19:49, Akmal Abbasov wrote: > > Hi Ted, > I’v

Re: Filtering hadoop metrics

2015-04-21 Thread Akmal Abbasov
Hi Ted, I’ve tried [Ugi]*Metrics[System]*, and in this case the output contains both of them. Thank you. > On 21 Apr 2015, at 19:39, Akmal Abbasov wrote: > > Hi Ted, > I am using hadoop-2.5.1 > Thank you. > >> On 21 Apr 2015, at 19:32, Ted Yu > <mailto:yuzhih...@

Re: Filtering hadoop metrics

2015-04-21 Thread Akmal Abbasov
Hi Ted, I am using hadoop-2.5.1 Thank you. > On 21 Apr 2015, at 19:32, Ted Yu wrote: > > What release of hadoop are you using ? > > Maybe try regex such as: > [Ugi]*Metrics[System]* > > Cheers > > On Tue, Apr 21, 2015 at 9:43 AM, Akmal Abbasov <mailto:akmal

Re: Filtering hadoop metrics

2015-04-21 Thread Ted Yu
What release of hadoop are you using ? Maybe try regex such as: [Ugi]*Metrics[System]* Cheers On Tue, Apr 21, 2015 at 9:43 AM, Akmal Abbasov wrote: > Hi, I am now working on hadoop cluster monitoring, and currently playing > with hadoop-metrics2.properties file. I would like to use filt

Filtering hadoop metrics

2015-04-21 Thread Akmal Abbasov
Hi, I am now working on hadoop cluster monitoring, and currently playing with hadoop-metrics2.properties file. I would like to use filters to exclude the records which I am not going to use. In fact I am trying to exclude “UgiMetrics” and “MetricsSystem” records, by using the following datanode
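A sketch of the filter configuration, based on the metrics2 package documentation (the exact property names and glob syntax should be verified against your Hadoop version):

```properties
# hadoop-metrics2.properties
# Use glob-style filters for all prefixes
*.source.filter.class=org.apache.hadoop.metrics2.filter.GlobFilter
# Exclude the UgiMetrics and MetricsSystem sources from the datanode's sinks
datanode.*.source.filter.exclude=UgiMetrics,MetricsSystem*
```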

Re: QueueMetrics.AppsKilled/Failed metrics and failure reasons

2015-02-04 Thread Suma Shivaprasad
Thanks for your inputs. The cluster Metrics API is giving correct numbers for the failed/killed apps and is matching with the RM audit logs and we are planning to use that instead. Suma On Wed, Feb 4, 2015 at 12:04 PM, Rohith Sharma K S < rohithsharm...@huawei.com> wrote: > There ar

RE: QueueMetrics.AppsKilled/Failed metrics and failure reasons

2015-02-03 Thread Rohith Sharma K S
Since the metrics values displayed in Ganglia are incorrect, I have these doubts: 1. Is Ganglia pointing to the correct RM cluster? 2. What method does Ganglia use to retrieve QueueMetrics? 3. Have you written any client program that retrieves the apps and calculates it? Thanks & Regar

Re: QueueMetrics.AppsKilled/Failed metrics and failure reasons

2015-02-03 Thread Suma Shivaprasad
Using hadoop 2.4.0. The number of applications running on average is small, ~40-60. The metrics in Ganglia show around 10-30 apps killed every 5 mins, which is very high wrt the apps running at any given time (40-60). The RM logs though show 0 failed apps in the audit logs during that hour. The RM UI

RE: QueueMetrics.AppsKilled/Failed metrics and failure reasons

2015-02-03 Thread Rohith Sharma K S
Hi Could you give more information, which version of hadoop are you using? >> QueueMetrics.AppsKilled/Failed metrics shows much higher numbers, i.e. ~100. >> However RMAuditLogger shows 1 or 2 apps as Killed/Failed in the logs. I suspect the logs might have been rolled over. Does more

QueueMetrics.AppsKilled/Failed metrics and failure reasons

2015-02-03 Thread Suma Shivaprasad
Hello, Was trying to debug reasons for Killed/Failed apps and was checking for the applications that were killed/failed in the RM logs - from RMAuditLogger. QueueMetrics.AppsKilled/Failed metrics show much higher numbers, i.e. ~100. However RMAuditLogger shows 1 or 2 apps as Killed/Failed in the logs. Is

Re: YARN cluster metrics period

2014-09-09 Thread Jian He
That's for the whole history of the active RM. Jian On Tue, Sep 9, 2014 at 3:46 PM, Neal Yin wrote: > In YARN resource manager webapp, there is a “Cluster Metrics” section. > What is the period for these metrics? E.g. is AppsSubmitted for the past day, > past week or past mont

YARN cluster metrics period

2014-09-09 Thread Neal Yin
In YARN resource manager webapp, there is a "Cluster Metrics" section. What is the period for these metrics? E.g. is AppsSubmitted for the past day, past week or past month? Thanks, -Neal

Spoofing Ganglia Metrics

2014-01-24 Thread Calvin Jia
Is there a way to configure hdfs/hbase/mapreduce to spoof the ganglia metrics being sent? This is because the machines are behind a NAT and the monitoring box is outside, so all the metrics are recognized as coming from the same machine. Thanks!

Re: Problem sending metrics to multiple targets

2013-11-22 Thread Ivan Tretyakov
We investigated the problem and found the root cause. The Metrics2 framework uses a different config parser from the first version (Metrics2 uses apache-commons, Metrics uses Hadoop's own). org.apache.hadoop.metrics2.sink.ganglia.AbstractGangliaSink uses commas as separators by default. So when we provide li
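Since apache-commons configuration treats an unescaped comma as a list delimiter (so getString() returns only the first element), one workaround is to escape the commas so the full server list reaches the sink, which splits on commas itself. A sketch; verify this against your Hadoop version:

```properties
# hadoop-metrics2.properties -- backslash-escape the commas so commons-configuration
# hands the whole string to the Ganglia sink, which does its own comma splitting
datanode.sink.ganglia.servers=192.168.1.111:8649\,192.168.1.113:8649\,192.168.1.115:8649
```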

issue about TotalLoad metrics on FSNamesystem

2013-11-18 Thread ch huang
hi, all: What does TotalLoad mean? I have 4 DNs, and the option "dfs.datanode.max.transfer.threads" is 4096, but when I check this metric the value is 4597. This number is greater than dfs.datanode.max.transfer.threads. Why?
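TotalLoad is the number of active transfer (xceiver) threads summed across all DataNodes, so its ceiling is the per-node limit times the number of nodes, not the per-node limit itself. The arithmetic for the cluster described above:

```python
datanodes = 4
per_node_limit = 4096            # dfs.datanode.max.transfer.threads
total_load = 4597                # observed FSNamesystem TotalLoad

# TotalLoad is cluster-wide, so it can legitimately exceed one node's limit.
cluster_capacity = datanodes * per_node_limit
print(cluster_capacity)                 # 16384
print(total_load <= cluster_capacity)   # True
```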

issue about FSNamesystem metrics

2013-11-18 Thread ch huang
hi, all: I checked the NN metrics and found that "TransactionsSinceLastCheckpoint" is a negative number. But why would it be negative? After the last checkpoint, if no transactions happened, the number should be zero; if transactions happened, it should be a positive n

Re: Hadoop Metrics Issue in ganglia.

2013-09-14 Thread Leonid Fedotov
> >> @Chris - I haven't tried this on ganglia forums because I thought this >> is something related to Hadoop. Though I'm able to see default metrics >> from every nodes. >> >> @Artem >> I'm not using any firewall, In addition my host entry

Re: Hadoop Metrics Issue in ganglia.

2013-09-14 Thread orahad bigdata
@Chris - I haven't tried this on ganglia forums because I thought this >> is something related to Hadoop. Though I'm able to see default metrics >> from every nodes. >> >> @Artem >> I'm not using any firewall, In addition my host entry looks like below. >&

Re: Hadoop Metrics Issue in ganglia.

2013-09-13 Thread orahad bigdata
Hi, Thanks for your reply. @Chris - I haven't tried this on ganglia forums because I thought this is something related to Hadoop. Though I'm able to see default metrics from every nodes. @Artem I'm not using any firewall, In addition my host entry looks like below. 192.168

Re: Hadoop Metrics Issue in ganglia.

2013-09-12 Thread Yusaku Sako
2013 at 11:55 AM, orahad bigdata wrote: > Hi, > > Thanks for your reply. > > @Chris - I haven't tried this on ganglia forums because I thought this > is something related to Hadoop. Though I'm able to see default metrics > from every nodes. > > @Artem > I&

Re: Hadoop Metrics Issue in ganglia.

2013-09-12 Thread Artem Ervits
Check firewall and /etc/hosts also make sure hosts lines up with result of hostname -f command. Both hostname -f and hosts entries should have fqdn names. I use ambari to install my cluster, including ganglia metrics and I had identical issue. Once I corrected that it started working. Artem

Re: Hadoop Metrics Issue in ganglia.

2013-09-11 Thread Chris Embree
Did you try ganglia forums/lists? On 9/11/13, orahad bigdata wrote: > Hi All, > > Can somebody help me please? > > Thanks > On 9/11/13, orahad bigdata wrote: >> Hi All, >> >> I'm facing an issue while showing Hadoop metrics in ganglia, Though I >

Re: Hadoop Metrics Issue in ganglia.

2013-09-11 Thread orahad bigdata
Hi All, Can somebody help me please? Thanks On 9/11/13, orahad bigdata wrote: > Hi All, > > I'm facing an issue while showing Hadoop metrics in ganglia, Though I > have installed ganglia on my master/slaves nodes and I'm able to see > all the default metrics on ganglia

Hadoop Metrics Issue in ganglia.

2013-09-10 Thread orahad bigdata
Hi All, I'm facing an issue while showing Hadoop metrics in ganglia, Though I have installed ganglia on my master/slaves nodes and I'm able to see all the default metrics on ganglia UI from all the nodes but I'm not able to see Hadoop metrics in metrics section. versions:- Hadoo

which metrics are important to monitor?

2013-08-15 Thread ch huang
hi, all: I am trying to set up some Nagios alerts for important Hadoop metrics, but I do not know which metrics are valuable to track. Does anyone have a good idea?

Re: MutableCounterLong metrics display in ganglia

2013-08-10 Thread lei liu
.period=10 > > *.sink.ganglia.supportsparse=true > > namenode.sink.ganglia.servers=10.232.98.74:8649 > > datanode.sink.ganglia.servers=10.232.98.74:8649 > > > > I write one programme that call FSDataOutputStream.hsync() method once > per > > second. > > > > There is &quo

Re: MutableCounterLong metrics display in ganglia

2013-08-09 Thread Harsh J
=10.232.98.74:8649 > > I write one programme that call FSDataOutputStream.hsync() method once per > second. > > There is "@Metric MutableCounterLong fsyncCount" metrics in DataNodeMetrics, > when FSDataOutputStream.hsync() method is called, the value of fsyncCount >

Re: MutableCounterLong and MutableCounterInt class difference in metrics v2

2013-08-09 Thread Jun Ping Du
Hi Lei, MutableCounterLong is a type of counter which can only be increased (the counts are often large compared with MutableCounterInt). It is used a lot in the Hadoop metrics system, e.g. DatanodeMetrics. You can find more details on metrics v2 in the Hadoop wiki link ( http://wiki.apache.org

MutableCounterLong and MutableCounterInt class difference in metrics v2

2013-08-08 Thread lei liu
I use hadoop-2.0.5; there are MutableCounterLong and MutableCounterInt classes in metrics v2. I am studying the metrics v2 code. What is the difference between the MutableCounterLong and MutableCounterInt classes? I find that MutableCounterLong is used to calculate throughput, is that right? How does the metrics

MutableCounterLong metrics display in ganglia

2013-08-07 Thread lei liu
=10.232.98.74:8649 I write one programme that call FSDataOutputStream.hsync() method once per second. There is "@Metric MutableCounterLong fsyncCount" metrics in DataNodeMetrics, when FSDataOutputStream.hsync() method is called, the value of fsyncCount is increased, dataNode send th

Re: throughput metrics in hadoop-2.0.5

2013-08-06 Thread lei liu
Is the value of the MutableCounterLong class set to zero every 10 seconds? 2013/8/6 lei liu > Is the value of the MutableCounterLong class set to zero every 10 seconds? > > > 2013/8/6 lei liu >> There is "@Metric MutableCounterLong fsyncCount" met

Re: throughput metrics in hadoop-2.0.5

2013-08-06 Thread lei liu
Is the value of the MutableCounterLong class set to zero every 10 seconds? 2013/8/6 lei liu > There is "@Metric MutableCounterLong fsyncCount" metrics in > DataNodeMetrics, the MutableCounterLong class continuously increases the > value, so I think the value in ganglia should

Re: throughput metrics in hadoop-2.0.5

2013-08-06 Thread lei liu
There is "@Metric MutableCounterLong fsyncCount" metrics in DataNodeMetrics; the MutableCounterLong class continuously increases the value, so I think the value in ganglia should be "10, 20, 30, 40" and so on, but the value is fsyncCount.value/10, that is in "
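This matches how Ganglia typically handles monotonic counters: with a positive slope it graphs the per-interval rate, (current - previous) / period, rather than the raw counter value. A sketch of that calculation for a counter incremented once per second and snapshotted every 10 s:

```python
# Counter snapshots taken every 10 s by the metrics system
# (the application calls hsync() once per second, so +10 per snapshot)
period_s = 10
snapshots = [10, 20, 30, 40]   # raw fsyncCount values at each snapshot

# Ganglia graphs the rate between consecutive snapshots, not the raw counter.
rates = [(b - a) / period_s for a, b in zip(snapshots, snapshots[1:])]
print(rates)  # 1.0 per second, matching the observed fsyncCount.value/10
```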

MutableRate metrics in hadoop-2.0.5

2013-08-05 Thread lei liu
There is code in MutableRate class: public synchronized void snapshot(MetricsRecordBuilder builder, boolean all) { if (all || changed()) { numSamples += intervalStat.numSamples(); builder.addCounter(numInfo, numSamples) .addGauge(avgInfo, lastStat().mean()); *i

throughput metrics in hadoop-2.0.5

2013-08-05 Thread lei liu
=10.232.98.74:8649 I write one programme that calls the FSDataOutputStream.hsync() method once per second. There is "@Metric MutableCounterLong fsyncCount" metrics in DataNodeMetrics; the MutableCounterLong class continuously increases the value, so I think the value in ganglia should be 10, 20,

Re: about monitor metrics in hadoop

2013-08-04 Thread 闫昆
I use cdh4.3; my hadoop-env.sh file is in the following directory: $HADOOP_HOME/etc/hadoop/ 2013/8/5 ch huang > hi,all: > i installed yarn ,no mapreducev1 install,and it have no > /etc/hadoop/conf/hadoop-env.sh file exist, i use openTSDB to monitor my > cluster,and need some option set on hadoop-e

about monitor metrics in hadoop

2013-08-04 Thread ch huang
hi, all: I installed YARN (no MapReduce v1 install) and there is no /etc/hadoop/conf/hadoop-env.sh file. I use openTSDB to monitor my cluster and need to set some options in hadoop-env.sh. What can I do now? Create a new hadoop-env.sh file?

metrics all use the same namespace

2013-01-17 Thread Ivan Tretyakov
Hi! There is a fixed issue in Hadoop: "jvm metrics all use the same namespace" - https://issues.apache.org/jira/browse/HADOOP-7507 I was able to apply this fix in our cluster using the following line in hadoop-metrics2.properties: datanode.sink.ganglia.tagsForPrefix.jvm=* So now I

Set dmax for metrics

2013-01-17 Thread Ivan Tretyakov
Hi! I'm trying to set the dmax value for the metrics hadoop sends to ganglia. Our HDFS version uses the metrics2 context, so I tried the approach from here: https://svn.apache.org/repos/asf/hadoop/common/branches/HDFS-1073/common/conf/hadoop-metrics2.properties But it didn't work for me. Also there ar

Problem sending metrics to multiple targets

2013-01-17 Thread Ivan Tretyakov
Hi! We have the following problem. There are three target hosts to send metrics to: 192.168.1.111:8649, 192.168.1.113:8649, 192.168.1.115:8649 (node01, node03, node05). But, for example, the datanode (using org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31) sends metrics only to the first target host and the

Re: Dynamic reconfiguration of hadoop metrics plugin

2013-01-03 Thread manishbh...@rocketmail.com
You can do this through the command line with the -D option. E.g. hadoop jar my.jar -D property=value Regards, Manish Bhoge sent by HTC device. Excuse typo. - Reply message - From: "Chitresh Deshpande" To: Subject: Dynamic reconfiguration of hadoop metrics plugin Date: Fri, Jan 4, 2013 6:02 AM

Re: Which metrics to track?

2012-09-10 Thread Gulfie
On Mon, Sep 10, 2012 at 08:16:29PM +, Jones, Robert wrote: > Hello all, I am a sysadmin and do not know that much about Hadoop. I run a > stats/metrics tracking system that logs stats over time so you can look at > historical and current data and perform some trend analysis. I k

Re: Which metrics to track?

2012-09-10 Thread Adam Faris
From an operations perspective, hadoop metrics are a bit different than watching hosts behind a load balancer, as one needs to start thinking in terms of distributed systems and not individual hosts. The reason being that the hadoop platform is fairly resilient against m

Which metrics to track?

2012-09-10 Thread Jones, Robert
Hello all, I am a sysadmin and do not know that much about Hadoop. I run a stats/metrics tracking system that logs stats over time so you can look at historical and current data and perform some trend analysis. I know I can access several hadoop metrics via jmx by going to http://localhost

Metrics ..

2012-08-29 Thread Mark Olimpiati
Hi, I enabled the "metrics.properties" to use FileContext, in which jvm metrics values are written to a file as follows: jvm.metrics: hostName= localhost, processName=MAP, sessionId=, gcCount=10, gcTimeMillis=130, logError=0, logFatal=0, logInfo=21, logWarn=0, memHeapCommitted

Re: Info required regarding JobTracker Job Details/Metrics

2012-08-23 Thread Sonal Goyal
Don't the completed job metrics in the job tracker (or bin/hadoop job -history) provide the information you seek? Best Regards, Sonal Crux: Reporting for HBase <https://github.com/sonalgoyal/crux> Nube Technologies <http://www.nubetech.co> <http://in.linkedin.com/in/sonalgoyal>

Re: Info required regarding JobTracker Job Details/Metrics

2012-08-23 Thread Gaurav Dasgupta
>>>> Launched map tasks=2 >>>> SLOTS_MILLIS_REDUCES=0 >>>> ** >>>> *FileSystemCounters* >>>> HDFS_BYTES_READ=158 >>>> FILE_BYTES_WRITTEN=97422 >>>> HDFS_BYTES_WRITTEN=1 >>>> *Map-Reduce Framework* >>

Re: Info required regarding JobTracker Job Details/Metrics

2012-08-23 Thread Sonal Goyal
> FILE_BYTES_WRITTEN=97422 >>> HDFS_BYTES_WRITTEN=1 >>> *Map-Reduce Framework* >>> Map input records=586006142 >>> Reduce shuffle bytes=53567298 >>> Spilled Records=108996063 >>> Map output bytes=468042247685 >>> CPU time spent (ms

Re: Info required regarding JobTracker Job Details/Metrics

2012-08-23 Thread Bejoy Ks
output bytes=468042247685 >> CPU time spent (ms)=91162220 >> Total committed heap usage (bytes)=981605744640 >> Combine input records=32046224559 >> SPLIT_RAW_BYTES=382500 >> Reduce input records=96063 >> Reduce input groups=1000 >> Combine output records=

Re: Info required regarding JobTracker Job Details/Metrics

2012-08-23 Thread Gaurav Dasgupta
81605744640 > Combine input records=32046224559 > SPLIT_RAW_BYTES=382500 > Reduce input records=96063 > Reduce input groups=1000 > Combine output records=108902950 > Physical memory (bytes) snapshot=1147705057280 > Reduce output records=1000 > Virtual memory (bytes) snapshot=32

Info required regarding JobTracker Job Details/Metrics

2012-08-23 Thread Gaurav Dasgupta
records=1000 Virtual memory (bytes) snapshot=3221902118912 Map output records=31937417672 Can someone explain all these metrics above? I mainly want to know the "total shuffled bytes" of the jobs. Is it "Reduce shuffle bytes"? Also, how can I calculate the "total shuffle