We have tested the embedded mode with clusters of up to 400 nodes and multiple 
services running on them.

You can change hbase.rootdir in ams-hbase-site so that it writes to a 
partition on a separate disk mount, and then copy the data over from the 
existing location.

It would be good to know the size of the data written to hbase.rootdir, to get 
an idea of what kind of write volume we are looking at.
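
One way to get that number, as a minimal sketch in Python: walk the local 
directory that hbase.rootdir points to and add up the file sizes. The function 
name and the directory you pass in are assumptions for illustration; nothing 
here is specific to AMS.

```python
import os


def dir_size_bytes(root):
    """Sum the sizes of all regular files under root (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            try:
                total += os.path.getsize(path)
            except OSError:
                # HBase may delete or rename files while we walk; skip those.
                pass
    return total
```

Calling it twice a few minutes apart on the same directory and comparing the 
two results gives a rough idea of the write volume.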

Sid





On Wed, May 6, 2015 at 8:52 PM -0700, "Jayesh Thakrar" <[email protected]> wrote:

We have a 30-node cluster.
Unfortunately, this is also our production cluster and there's no HDFS as it is 
a dedicated Flume cluster.
We have installed Ambari + Storm + Kafka (HDP) on a cluster on which we have 
production data being flumed.
The flume data is being sent to an HDFS cluster which is a little overloaded, 
so we want to send flume data to Kafka and then "throttle" the data being 
loaded into the HDFS cluster.

But you have given me an idea - maybe I can set up a new HBase file location so 
that I can do away with HBase data corruption, if any.

It will take me some time to do that; I will let you know once I have tried it 
out.

Thanks,
jayesh


________________________________
From: Siddharth Wagle <[email protected]>
To: "[email protected]" <[email protected]>; Jayesh Thakrar 
<[email protected]>
Sent: Wednesday, May 6, 2015 10:42 PM
Subject: Re: Kafka broker metrics not appearing in REST API

How big is your cluster in terms of number of nodes?
You can tune settings for HBase based on cluster size.

The following settings write metrics to HDFS instead of the local FS.

ams-site:::
timeline.metrics.service.operation.mode = distributed

ams-hbase-site:::
hbase.rootdir = hdfs://<namenode-host>:8020/amshbase
hbase.cluster.distributed = true
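
The three settings above have to agree with each other. A minimal Python 
sketch of a consistency check, assuming the two config sections have already 
been loaded into plain dicts (the function name and the dict inputs are 
assumptions for illustration, not part of Ambari):

```python
def check_distributed_settings(ams_site, ams_hbase_site):
    """Return a list of problems; an empty list means the three settings agree."""
    problems = []
    if ams_site.get("timeline.metrics.service.operation.mode") != "distributed":
        problems.append("timeline.metrics.service.operation.mode is not 'distributed'")
    if not str(ams_hbase_site.get("hbase.rootdir", "")).startswith("hdfs://"):
        problems.append("hbase.rootdir does not point at an hdfs:// URI")
    if str(ams_hbase_site.get("hbase.cluster.distributed", "")).lower() != "true":
        problems.append("hbase.cluster.distributed is not true")
    return problems
```

An empty result only says the values are consistent with the instructions 
above; it does not check that the NameNode is reachable.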

-Sid



________________________________
From: Jayesh Thakrar <[email protected]>
Sent: Wednesday, May 06, 2015 8:30 PM
To: [email protected]; Siddharth Wagle; Jayesh Thakrar
Subject: Re: Kafka broker metrics not appearing in REST API

More info....

I was doing some "stress-testing" and interestingly, the Metrics Collector 
crashed 2 times and I had to restart it (don't like a file-based HBase for the 
metrics collector, but not very confident of configuring the system to point to 
an existing HBase cluster).

Also, after this email thread, I looked at the Metrics Collector logs and see 
errors like this:

METRIC_RECORD' at 
region=METRIC_RECORD,,1429966316307.947cfa22f884d035c09fe804b1f5402c., 
hostname=dtord01flm03p.dc.dotomi.net,60455,1429737430103, seqNum=243930
13:09:37,619  INFO [phoenix-1-thread-349921] RpcRetryingCaller:129 - Call 
exception, tries=11, retries=35, started=835564 ms ago, cancelled=false, 
msg=row 
'kafka.network.RequestMetrics.Metadata-RequestsPerSec.1MinuteRate^@dtord01flm27p.dc.dotomi.net^@^@^@^AL��:�kafka_broker'
 on table 'METRIC_RECORD' at 
region=METRIC_RECORD,kafkark.RequestMetrics.Metadata-RequestsPerSec.1MinuteRate\x00dtord01flm27p.dc.dotomi.net\x00\x00\x00\x01L\xED\xED:\xE5kafka_broker,1429966316307.d488f5e58d54c3251cb81fdfa475dd45.,
 hostname=dtord01flm03p.dc.dotomi.net,60455,1429737430103, seqNum=243931
13:10:58,082  INFO [phoenix-1-thread-349920] RpcRetryingCaller:129 - Call 
exception, tries=12, retries=35, started=916027 ms ago, cancelled=false, 
msg=row '' on table 'METRIC_RECORD' at 
region=METRIC_RECORD,,1429966316307.947cfa22f884d035c09fe804b1f5402c., 
hostname=dtord01flm03p.dc.dotomi.net,60455,1429737430103, seqNum=243930
13:10:58,082  INFO [phoenix-1-thread-349921] RpcRetryingCaller:129 - Call 
exception, tries=12, retries=35, started=916027 ms ago, cancelled=false, 
msg=row 
'kafka.network.RequestMetrics.Metadata-RequestsPerSec.1MinuteRate^@dtord01flm27p.dc.dotomi.net^@^@^@^AL��:�kafka_broker'
 on table 'METRIC_RECORD' at 
region=METRIC_RECORD,kafkark.RequestMetrics.Metadata-RequestsPerSec.1MinuteRate\x00dtord01flm27p.dc.dotomi.net\x00\x00\x00\x01L\xED\xED:\xE5kafka_broker,1429966316307.d488f5e58d54c3251cb81fdfa475dd45.,
 hostname=dtord01flm03p.dc.dotomi.net,60455,1429737430103, seqNum=243931
13:10:58,112 ERROR [Thread-25] TimelineMetricAggregator:221 - Exception during 
aggregating metrics.
org.apache.phoenix.exception.PhoenixIOException: 
org.apache.phoenix.exception.PhoenixIOException: Failed after attempts=36, 
exceptions:
Sat Apr 25 13:10:58 UTC 2015, null, java.net.SocketTimeoutException: 
callTimeout=900000, callDuration=938097: row '' on table 'METRIC_RECORD' at 
region=METRIC_RECORD,,1429966316307.947cfa22f884d035c09fe804b1f5402c., 
hostname=dtord01flm03p.dc.dotomi.net,60455,1429737430103, seqNum=243930

        at 
org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:107)
        at 
org.apache.phoenix.iterate.ParallelIterators.getIterators(ParallelIterators.java:527)
        at 
org.apache.phoenix.iterate.MergeSortResultIterator.getIterators(MergeSortResultIterator.java:48)
        at 
org.apache.phoenix.iterate.MergeSortResultIterator.minIterator(MergeSortResultIterator.java:63)
        at 
org.apache.phoenix.iterate.MergeSortResultIterator.next(MergeSortResultIterator.java:90)
        at 
org.apache.phoenix.iterate.MergeSortTopNResultIterator.next(MergeSortTopNResultIterator.java:87)
        at 
org.apache.phoenix.jdbc.PhoenixResultSet.next(PhoenixResultSet.java:739)
        at 
org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.TimelineMetricAggregator.aggregateMetricsFromResultSet(TimelineMetricAggregator.java:104)
        at 
org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.TimelineMetricAggregator.aggregate(TimelineMetricAggregator.java:72)
        at 
org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.AbstractTimelineAggregator.doWork(AbstractTimelineAggregator.java:217)
        at 
org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.AbstractTimelineAggregator.runOnce(AbstractTimelineAggregator.java:94)
        at 
org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.AbstractTimelineAggregator.run(AbstractTimelineAggregator.java:70)
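
The retry messages above show the same call being retried for roughly 15 
minutes (started=916027 ms ago) before it fails at callTimeout=900000. A 
minimal Python sketch for pulling those counters out of a collector log, 
assuming the message format matches the lines quoted above (the function name 
is hypothetical):

```python
import re

RETRY_RE = re.compile(r"tries=(\d+),\s*retries=(\d+),\s*started=(\d+)\s*ms ago")


def retry_progress(log_text):
    """Return (tries, retries, elapsed_ms) for each retry message found."""
    # Collapse whitespace first so records wrapped across lines still match.
    flat = " ".join(log_text.split())
    return [(int(t), int(r), int(ms)) for t, r, ms in RETRY_RE.findall(flat)]
```

Watching whether elapsed_ms keeps growing toward the call timeout is a quick 
way to tell a slow region from one that is not responding at all.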



________________________________
From: Jayesh Thakrar <[email protected]>
To: Siddharth Wagle <[email protected]>; "[email protected]" 
<[email protected]>
Sent: Wednesday, May 6, 2015 10:07 PM
Subject: Re: Kafka broker metrics not appearing in REST API

Hi Siddharth,

Yes, I am using Ambari 2.0 with Ambari Metrics service.
The interesting thing is that I got them for some time and not anymore.
And I also know that the metrics are being collected since I can see them on 
the dashboard.
Any pointers for troubleshooting?

And btw, it would be nice to have a count of messages received and not a 
computed metric count / min.
TSDB does a good job of giving me cumulative and rate-per-sec graphs and 
numbers.

Thanks in advance,
Jayesh



________________________________
From: Siddharth Wagle <[email protected]>
To: "[email protected]" <[email protected]>; Jayesh Thakrar 
<[email protected]>
Sent: Wednesday, May 6, 2015 10:03 PM
Subject: Re: Kafka broker metrics not appearing in REST API

Hi Jayesh,

Are you using Ambari 2.0 with Ambari Metrics service?

BR,
Sid


________________________________
From: Jayesh Thakrar <[email protected]>
Sent: Wednesday, May 06, 2015 7:53 PM
To: [email protected]
Subject: Kafka broker metrics not appearing in REST API

Hi,

I have installed 2 clusters with Ambari and Storm and Kafka.
After the install, I was able to get metrics for both Storm and Kafka via REST 
API.
This worked fine for a week, but for the past 2 days I have not been getting 
Kafka metrics.

I need the metrics to push to an OpenTSDB cluster.
I do get host metrics and Nimbus metrics but not KAFKA_BROKER metrics.

I did have maintenance turned on for some time, but maintenance is turned off 
now.

[jthakrar@dtord01hdp0101d ~]$ curl --user admin:admin 'http://dtord01flm01p:8080/api/v1/clusters/ord_flume_kafka_prod/components/NIMBUS?fields=metrics'
{
  "href" : "http://dtord01flm01p:8080/api/v1/clusters/ord_flume_kafka_prod/components/NIMBUS?fields=metrics",
  "ServiceComponentInfo" : {
    "cluster_name" : "ord_flume_kafka_prod",
    "component_name" : "NIMBUS",
    "service_name" : "STORM"
  },
  "metrics" : {
    "storm" : {
      "nimbus" : {
        "freeslots" : 54.0,
        "supervisors" : 27.0,
        "topologies" : 0.0,
        "totalexecutors" : 0.0,
        "totalslots" : 54.0,
        "totaltasks" : 0.0,
        "usedslots" : 0.0
      }
    }
  }
}

[jthakrar@dtord01hdp0101d ~]$ curl --user admin:admin 'http://dtord01flm01p:8080/api/v1/clusters/ord_flume_kafka_prod/components/KAFKA_BROKER?fields=metrics'
{
  "href" : "http://dtord01flm01p:8080/api/v1/clusters/ord_flume_kafka_prod/components/KAFKA_BROKER?fields=metrics",
  "ServiceComponentInfo" : {
    "cluster_name" : "ord_flume_kafka_prod",
    "component_name" : "KAFKA_BROKER",
    "service_name" : "KAFKA"
  }
}
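
In the responses above, NIMBUS returns a "metrics" object while KAFKA_BROKER 
returns only ServiceComponentInfo. A minimal Python sketch that makes the same 
request as the curl commands and reports whether the "metrics" key came back; 
the function names are assumptions for illustration, and the URL layout is 
copied from the commands above:

```python
import base64
import json
import urllib.request


def fetch_component(base_url, cluster, component, user, password):
    """GET .../clusters/<cluster>/components/<component>?fields=metrics as a dict."""
    url = f"{base_url}/api/v1/clusters/{cluster}/components/{component}?fields=metrics"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def has_metrics(payload):
    """True when the response carries a non-empty "metrics" object."""
    return bool(payload.get("metrics"))
```

Looping over NIMBUS, SUPERVISOR and KAFKA_BROKER with has_metrics(fetch_component(...)) 
gives a quick per-component check that can be run on a schedule.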

[jthakrar@dtord01hdp0101d ~]$ curl --user admin:admin 'http://dtord01flm01p:8080/api/v1/clusters/ord_flume_kafka_prod/components/SUPERVISOR?fields=metrics'
{
  "href" : "http://dtord01flm01p:8080/api/v1/clusters/ord_flume_kafka_prod/components/SUPERVISOR?fields=metrics",
  "ServiceComponentInfo" : {
    "cluster_name" : "ord_flume_kafka_prod",
    "component_name" : "SUPERVISOR",
    "service_name" : "STORM"
  }
}






