I don't recall the hadoop release repo restriction being a problem, but I
haven't tested it lately. See if you can just specify the release version
with -Dhadoop.version or -Dhadoop-two.version.
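
For example, something like this against a stock release (the property
name and version here are just an illustration, not a tested command):

mvn clean test -Dhadoop.profile=2.0 -Dhadoop-two.version=2.6.0  # 2.6.0 is just an example release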

I would go against branch-1.0, as this will be the imminent 1.0.0 release
and has HTrace 3.1.0-incubating.
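
For example, to confirm that from a checkout (GitHub mirror assumed; any
Apache git remote works):

git clone https://github.com/apache/hbase.git
cd hbase && git checkout branch-1.0
grep htrace pom.xml   # should show 3.1.0-incubating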

-n

On Wed, Feb 11, 2015 at 11:13 AM, Colin P. McCabe <cmcc...@apache.org>
wrote:

> Thanks for trying stuff out!  Sorry that this is a little difficult at
> the moment.
>
> To really do this right, you would want to be using Hadoop with HTrace
> 3.1.0, and HBase with HTrace 3.1.0.  Unfortunately, there hasn't been
> a new release of Hadoop with HTrace 3.1.0.  The only existing releases
> of Hadoop use an older version of the HTrace library.  So you will
> have to build from source.
>
> If you check out Hadoop's "branch-2" branch (currently, this branch
> represents what will be in the 2.7 release, when it is cut), and build
> that, you will get the latest.  Then you have to build a version of
> HBase against the version of Hadoop you have built.
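>
> A sketch of that checkout step (the mirror URL is an assumption; the
> build itself happens via mvn deploy below):
>
> git clone https://github.com/apache/hadoop.git   # GitHub mirror assumed
> cd hadoop && git checkout branch-2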
>
> By default, HBase's Maven build will only build against upstream release
> versions of Hadoop.  So just setting
> -Dhadoop.version=2.7.0-SNAPSHOT is not enough, since Maven won't know
> where to find the jars.  To get around this problem, you can create
> your own local maven repo.  Here's how.
>
> In hadoop/pom.xml, add these lines to the distributionManagement stanza:
>
> +    <repository>
> +      <id>localdump</id>
> +      <url>file:///home/cmccabe/localdump/releases</url>
> +    </repository>
> +    <snapshotRepository>
> +      <id>localdump</id>
> +      <url>file:///home/cmccabe/localdump/snapshots</url>
> +    </snapshotRepository>
>
> Comment out the repositories that are already there.
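>
> That is, wrap each existing entry in an XML comment, e.g.:
>
> <!--
>   <repository>
>     ...
>   </repository>
> -->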
>
> Now run mkdir /home/cmccabe/localdump.
>
> Then, in your hadoop tree, run mvn deploy -DskipTests.
>
> You should get a localdump directory that has files kind of like this:
>
> ...
> /home/cmccabe/localdump/snapshots/org/apache/hadoop
> /home/cmccabe/localdump/snapshots/org/apache/hadoop/hadoop-mapreduce
> /home/cmccabe/localdump/snapshots/org/apache/hadoop/hadoop-mapreduce/maven-metadata.xml.md5
> /home/cmccabe/localdump/snapshots/org/apache/hadoop/hadoop-mapreduce/2.7.0-SNAPSHOT
> /home/cmccabe/localdump/snapshots/org/apache/hadoop/hadoop-mapreduce/2.7.0-SNAPSHOT/maven-metadata.xml.md5
> /home/cmccabe/localdump/snapshots/org/apache/hadoop/hadoop-mapreduce/2.7.0-SNAPSHOT/hadoop-mapreduce-2.7.0-20121120.230341-1.pom.sha1
> /home/cmccabe/localdump/snapshots/org/apache/hadoop/hadoop-mapreduce/2.7.0-SNAPSHOT/maven-metadata.xml
> ...
>
> Now, add the following lines to your HBase pom.xml:
>
>    <repositories>
>      <repository>
> +      <id>localdump</id>
> +      <url>file:///home/cmccabe/localdump</url>
> +      <name>Local Dump</name>
> +      <snapshots>
> +        <enabled>true</enabled>
> +      </snapshots>
> +      <releases>
> +        <enabled>true</enabled>
> +      </releases>
> +    </repository>
> +    <repository>
>
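> Note: depending on how Maven treats this file:// repository layout, you
> may need the <url> to point at the snapshots subdirectory that mvn
> deploy created, e.g.:
>
> +      <!-- assumption: only needed if SNAPSHOT artifacts fail to resolve -->
> +      <url>file:///home/cmccabe/localdump/snapshots</url>
>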
> This will allow you to run something like:
> mvn test -Dtest=TestMiniClusterLoadSequential -PlocalTests \
>   -DredirectTestOutputToFile=true -Dhadoop.profile=2.0 \
>   -Dhadoop.version=2.7.0-SNAPSHOT -Dcdh.hadoop.version=2.7.0-SNAPSHOT
>
> Once we do a new release of Hadoop with HTrace 3.1.0, this will get a
> lot easier.
>
> Related: does anyone know which HBase git branch would be best to build
> from for this kind of testing?  I've been meaning to do some end-to-end
> testing (it's been on my TODO list for a while).
>
> best,
> Colin
>
> On Wed, Feb 11, 2015 at 7:55 AM, Chunxu Tang <chunxut...@gmail.com> wrote:
> > Hi all,
> >
> > I'm currently using HTrace to trace request-level data flows in HBase
> > and HDFS. I have successfully traced each of HBase and HDFS
> > individually with HTrace.
> >
> > After that, I combined HBase and HDFS, and I want to send just a
> > PUT/GET request to HBase but trace the whole data flow through both
> > HBase and HDFS. My understanding is that when I send a request such as
> > a Get to HBase, it will eventually read the blocks from HDFS, so I
> > should be able to construct a complete data-flow trace spanning HBase
> > and HDFS. In practice, however, I only get tracing data from HBase,
> > with no data from HDFS.
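> >
> > For reference, the span I create on the client side looks roughly like
> > this (a sketch, not my exact code; it assumes the org.apache.htrace
> > package names from HTrace 3.1.0-incubating and the HBase 1.0-style
> > client API):
> >
> > import org.apache.hadoop.conf.Configuration;
> > import org.apache.hadoop.hbase.HBaseConfiguration;
> > import org.apache.hadoop.hbase.TableName;
> > import org.apache.hadoop.hbase.client.Connection;
> > import org.apache.hadoop.hbase.client.ConnectionFactory;
> > import org.apache.hadoop.hbase.client.Get;
> > import org.apache.hadoop.hbase.client.Result;
> > import org.apache.hadoop.hbase.client.Table;
> > import org.apache.hadoop.hbase.util.Bytes;
> > import org.apache.htrace.Sampler;
> > import org.apache.htrace.Trace;
> > import org.apache.htrace.TraceScope;
> >
> > public class TracedGet {
> >   public static void main(String[] args) throws Exception {
> >     Configuration conf = HBaseConfiguration.create();
> >     try (Connection conn = ConnectionFactory.createConnection(conf);
> >          Table table = conn.getTable(TableName.valueOf("t1"))) {
> >       // Open a span around the Get; with a SpanReceiver configured on
> >       // the client, the RegionServers, and (ideally) the DataNodes,
> >       // everything under this span should be collected.
> >       // ("t1"/"r1" are placeholder table and row names.)
> >       TraceScope scope = Trace.startSpan("get-row", Sampler.ALWAYS);
> >       try {
> >         Result result = table.get(new Get(Bytes.toBytes("r1")));
> >         System.out.println(result);
> >       } finally {
> >         scope.close();
> >       }
> >     }
> >   }
> > }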
> >
> > Could you give me any suggestions on how to trace the data flow through
> > both HBase and HDFS? Does anyone have similar experience? Do I need to
> > modify the source code, and if so, which part(s) should I touch? If I
> > do, I will try to create a patch for that.
> >
> > Thank you.
> >
> > My Configurations:
> > Hadoop version: 2.6.0
> > HBase version: 0.99.2
> > HTrace version: htrace-master
> > OS: Ubuntu 12.04
> >
> >
> > Joshua
>
