Re: Trace HBase/HDFS with HTrace

2015-02-24 Thread Masatake Iwasaki

Hi,

Thanks for trying this. Sorry for the late reply.

I tried this today
with hbase-1.0.1-SNAPSHOT built with {{-Dhadoop-two.version=2.7.0-SNAPSHOT}}
in a pseudo-distributed cluster,
but failed to get an end-to-end trace.

I checked that
* tracing works for both HBase and HDFS individually,
* HBase runs with the 2.7.0-SNAPSHOT jar of Hadoop.

When I did a put with tracing on,
I saw a span named FSHLog.sync with annotations such as
"syncing writer" and "writer synced".
So the tracing code in FSHLog works, at least.

I'm still looking into this.
If it turns out that trace spans do not reach the
actual HDFS writer thread in HBase, I will file a JIRA.
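
To illustrate the kind of problem suspected here, below is a minimal sketch of how a span kept in a thread-local slot is invisible to a separate writer thread unless it is handed over explicitly. This is a toy model and not the HTrace API: the class SpanHandoffSketch, the CURRENT_SPAN slot and the wrap() helper are all hypothetical names, and whether HBase actually loses the span this way is exactly what is still being investigated.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Toy model of a thread-local trace context. NOT the HTrace API. */
public class SpanHandoffSketch {
    // Hypothetical stand-in for a tracer's per-thread "current span" slot.
    static final ThreadLocal<String> CURRENT_SPAN = new ThreadLocal<>();

    /** Wraps a task so that it sees the submitting thread's span while it runs. */
    static Runnable wrap(Runnable task) {
        final String parent = CURRENT_SPAN.get(); // captured on the caller's thread
        return () -> {
            String previous = CURRENT_SPAN.get();
            CURRENT_SPAN.set(parent);             // install on the worker thread
            try {
                task.run();
            } finally {
                CURRENT_SPAN.set(previous);       // restore the worker's own state
            }
        };
    }

    /** Returns the span name observed on the worker thread, or "none". */
    static String spanSeenByWorker(boolean handOff) throws Exception {
        ExecutorService writer = Executors.newSingleThreadExecutor();
        try {
            CURRENT_SPAN.set("FSHLog.sync");      // span active on the calling thread
            final String[] seen = new String[1];
            Runnable task = () -> {
                String s = CURRENT_SPAN.get();
                seen[0] = (s == null) ? "none" : s;
            };
            // Future.get() waits for the task and makes seen[0] visible here.
            writer.submit(handOff ? wrap(task) : task).get();
            return seen[0];
        } finally {
            CURRENT_SPAN.remove();
            writer.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("without hand-off: " + spanSeenByWorker(false)); // none
        System.out.println("with hand-off:    " + spanSeenByWorker(true));  // FSHLog.sync
    }
}
```

In this model, any work done on the writer thread without the explicit hand-off would start outside the caller's trace, so spans created there would not be attached to the request.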

# We need hadoop-2.6.0 or higher in order to trace HDFS.
# Building HBase from source with {{-Dhadoop-two.version=2.6.0}}
# is a straightforward way to do this,
# because the binary release of hbase-1.0.0 bundles hadoop-2.5.1 jars.

Masatake

On 2/11/15 08:56, Nick Dimiduk wrote:

Hi Joshua,

In theory there's nothing special for you to do. Just issue your query to
HBase with tracing enabled. The active span will go through HBase, down
into HDFS, and back again. You'll need both systems collecting spans into
the same place so that you can report on the complete trace tree.
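
As a rough sketch of why both systems need to report into the same place, the snippet below merges span records from two sources and rebuilds one trace tree by following parent IDs. It is only an illustration: the Span class, its fields, the use of parent ID 0 for a root, and the sample span names other than FSHLog.sync are assumptions made for this example, not HTrace's data model or output format.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy reconstruction of a trace tree from spans reported by two systems. */
public class TraceTreeSketch {
    /** Hypothetical minimal span record; field names are illustrative only. */
    static final class Span {
        final long traceId, spanId, parentId; // parentId 0 marks a root in this sketch
        final String source, description;

        Span(long traceId, long spanId, long parentId, String source, String description) {
            this.traceId = traceId;
            this.spanId = spanId;
            this.parentId = parentId;
            this.source = source;
            this.description = description;
        }
    }

    /** Renders one trace as indented lines, mixing spans from both sources. */
    static List<String> render(long traceId, List<Span> hbaseSpans, List<Span> hdfsSpans) {
        List<Span> all = new ArrayList<>(hbaseSpans);
        all.addAll(hdfsSpans);
        // Index the spans of the requested trace by their parent ID.
        Map<Long, List<Span>> children = new HashMap<>();
        for (Span s : all) {
            if (s.traceId == traceId) {
                children.computeIfAbsent(s.parentId, k -> new ArrayList<>()).add(s);
            }
        }
        List<String> out = new ArrayList<>();
        walk(0L, 0, children, out);
        return out;
    }

    private static void walk(long parentId, int depth,
                             Map<Long, List<Span>> children, List<String> out) {
        for (Span s : children.getOrDefault(parentId, new ArrayList<>())) {
            StringBuilder line = new StringBuilder();
            for (int i = 0; i < depth; i++) {
                line.append("  ");
            }
            out.add(line.append(s.source).append(": ").append(s.description).toString());
            walk(s.spanId, depth + 1, children, out);
        }
    }

    public static void main(String[] args) {
        List<Span> hbase = new ArrayList<>();
        hbase.add(new Span(7L, 1L, 0L, "hbase", "put"));          // made-up root span
        hbase.add(new Span(7L, 2L, 1L, "hbase", "FSHLog.sync"));
        List<Span> hdfs = new ArrayList<>();
        hdfs.add(new Span(7L, 3L, 2L, "hdfs", "write"));          // made-up HDFS span
        for (String line : render(7L, hbase, hdfs)) {
            System.out.println(line);
        }
    }
}
```

If the HDFS list is empty, or its spans carry a different trace ID, only the HBase part of the tree comes out.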

I've not recently tested this end-to-end, but I believe it's all there. If
not, it's a bug -- this is an intended use case. Can you give it a try
and let us know how it goes?

FYI, 0.99.x are preview releases of HBase and not for production use. Just
so you know :)

-n

On Wednesday, February 11, 2015, Chunxu Tang chunxut...@gmail.com wrote:


Hi all,

I'm currently using HTrace to trace request-level data flows in HBase and
HDFS. I have successfully traced HBase and HDFS separately using HTrace.

Next, I combined HBase and HDFS, and I want to send a single
PUT/GET request to HBase and trace the whole data flow through both HBase
and HDFS. My understanding is that when I send a request such as a Get to
HBase, it will eventually read blocks on HDFS, so I should be able to
construct a complete data flow trace through HBase and HDFS. In practice,
however, I only get tracing data from HBase, with none from HDFS.

Could you give me any suggestions on how to trace the data flow in both
HBase and HDFS? Does anyone have similar experience? Do I need to modify
the source code? If so, which part(s) should I touch? If I need to
modify the code, I will try to create a patch for that.

Thank you.

My configuration:
Hadoop version: 2.6.0
HBase version: 0.99.2
HTrace version: htrace-master
OS: Ubuntu 12.04


Joshua





Re: Trace HBase/HDFS with HTrace

2015-02-24 Thread Colin P. McCabe

Thanks for trying this, Masatake.  I've got HDFS working on my cluster
with tracing and LocalFileSpanReceiver.  Did you try using HBase +
HDFS with LocalFileSpanReceiver?  Be sure to use a build including
HTRACE-112 since LFSR was kind of busted prior to that.

I'm going to do a longer writeup about getting HDFS + HBase working
with other span receivers just as soon as I finish stomping a few more
bugs.

best,
Colin
