Sergey, thanks for those tips. I didn't know CopyTable and ExportSnapshot
had a dependency on YARN and MR. Following your recommendations, I set up
both YARN and MR, and I can now initiate jobs like CopyTable or
ExportSnapshot. NodeManagers start on all my data nodes with start-yarn.sh,
and there seems to be a quorum within ZooKeeper to elect a master. All
seems fine. After jobs get submitted and picked up, though, they all fail
with:
https://pastebin.com/dnuFLy8t

I assume this is a directory created by and for the job; it gets 710 as its
permissions and is owned by root.

Here's my yarn-site.xml: https://pastebin.com/aGjNjJbU
My mapred-site.xml only contains the setting you indicated. Does anything
else jump out as obviously wrong here?
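For reference, the framework setting you suggested is the only thing in my
mapred-site.xml (sketching it here in case something about it is off):

```xml
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
```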

Thank you so much for the help so far.


On Tue, Apr 25, 2017 at 7:49 PM, Sergey Soldatov <sergeysolda...@gmail.com>
wrote:

> Actually, there is enough information to identify the actual issue.
> According to this line:
>
> 2017-04-25 09:33:40,702 DEBUG [main] mapreduce.JobSubmitter: Configuring
> job job_local18236289_0001 with
> file:/tmp/hadoop-root/mapred/staging/root18236289/.staging/job_local18236289_0001
> as the submit dir
>
> it looks like MR is configured (or, according to your comment, not
> configured at all, so using the default values) for local execution. You
> need to configure the MapReduce framework (mapred-site.xml). For YARN it
> should be something like this:
>
>     <property>
>       <name>mapreduce.framework.name</name>
>       <value>yarn</value>
>     </property>
>
> You will also need to configure YARN itself.
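> A minimal sketch of what that might include (these are standard YARN
> property names; the hostname value is a placeholder for your
> ResourceManager node, and your cluster may need more than this):
>
> ```xml
> <property>
>   <name>yarn.resourcemanager.hostname</name>
>   <value>your-rm-host</value>
> </property>
> <property>
>   <name>yarn.nodemanager.aux-services</name>
>   <value>mapreduce_shuffle</value>
> </property>
> ```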
>
> Thanks,
> Sergey
>
>
> On Tue, Apr 25, 2017 at 7:22 AM, Vasco Pinho <va...@hotjar.com> wrote:
>
> > Hi Sergey,
> >
> > Thanks for the tip. I've added the DEBUG line to the log properties and
> > this is what I got: https://pastebin.com/0k1tgw3n
> > Unfortunately, it looks like other than lots of S3 chatter there's not
> > much more info regarding the actual issue. As for MR/YARN, we didn't
> > tune any specific settings, as we're just running HBase on top of
> > Hadoop. Are there any specific things related to that I might check
> > first?
> >
> > Thanks,
> > Vasco
> >
> > On Tue, Apr 25, 2017 at 1:56 AM, Sergey Soldatov <sergeysolda...@gmail.com> wrote:
> >
> > > It rather looks like a problem with the MR/YARN client or
> > > configuration. When the MR job starts, it creates copies of all jars
> > > in the staging directory, and ClientDistributedCacheManager is
> > > supposed to just check whether all files exist in the cache. To
> > > better understand why it works incorrectly, could you please add to
> > > HBase's log4j.properties:
> > >
> > > log4j.logger.org.apache.hadoop=DEBUG
> > >
> > > and collect the execution log.
> > >
> > > Thanks,
> > > Sergey
> > >
> > > On Mon, Apr 24, 2017 at 8:58 AM, Vasco Pinho <va...@hotjar.com> wrote:
> > >
> > > > For additional information, CopyTable also fails with the same
> > > > missing file:
> > > >
> > > > Exception in thread "main" java.io.FileNotFoundException: File does
> > > > not exist:
> > > > hdfs://hjcluster/home/ubuntu/hbase-1.2.4/hbase-prefix-tree/target/hbase-prefix-tree-1.2.4.jar
> > > >
> > > >
> > > > > On Wed, Apr 19, 2017 at 10:27 AM, Vasco Pinho <va...@hotjar.com> wrote:
> > > >
> > > > > Anyone have any ideas what might be causing this? Wrong
> > > > > classpath? Some package missing?
> > > > >
> > > > > On Sun, Apr 16, 2017 at 9:25 AM, Vasco Pinho <va...@hotjar.com> wrote:
> > > > >
> > > > >> Running Hadoop 2.7.2, HBase 1.2.4.
> > > > >>
> > > > >>> On Fri, Apr 14, 2017 at 6:47 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> > > > >>
> > > > >>> I can debug this next week when I have access to an HA cluster.
> > > > >>>
> > > > >>> Which release of Hadoop are you using ?
> > > > >>>
> > > > >>> Cheers
> > > > >>>
> > > > >>> > On Apr 14, 2017, at 9:21 AM, Vasco Pinho <va...@hotjar.com> wrote:
> > > > >>> >
> > > > >>> > I'm not sure I follow. This file already exists under the
> > > > >>> > install dir at:
> > > > >>> >
> > > > >>> > /home/ubuntu/hbase-1.2.4/hbase-prefix-tree/target/hbase-prefix-tree-1.2.4.jar
> > > > >>> >
> > > > >>> > The class is also present at:
> > > > >>> >
> > > > >>> > /home/ubuntu/hbase-1.2.4/hbase-prefix-tree/target/classes/org/apache/hadoop/hbase/codec/prefixtree/PrefixTreeCodec.class
> > > > >>> >
> > > > >>> > Why would I rename it? Is my classpath wrong for some reason?
> > > > >>> > In any case, renaming it to something like
> > > > >>> > "hbase-prefix-tree/target/hbase-prefix-tree-1.2.4.jar.bak"
> > > > >>> > yields another exception of the same sort (full trace:
> > > > >>> > https://pastebin.com/mzcPthZR):
> > > > >>> >
> > > > >>> > java.io.FileNotFoundException: File does not exist:
> > > > >>> > hdfs://clusterID/root/.m2/repository/org/apache/htrace/htrace-core/3.1.0-incubating/htrace-core-3.1.0-incubating.jar
> > > > >>> >
> > > > >>> > So it looks like something else is wrong with the
> > > > >>> > installation, or there is a bug that makes these lookups go to
> > > > >>> > "hdfs://" instead of the local filesystem?
> > > > >>> >
> > > > >>> > Thanks,
> > > > >>> > Vasco Pinho
> > > > >>> >
> > > > >>> >> On Fri, Apr 14, 2017 at 4:32 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> > > > >>> >>
> > > > >>> >> Here is how the hbase-prefix-tree dependency is detected:
> > > > >>> >>
> > > > >>> >>    try {
> > > > >>> >>      prefixTreeCodecClass = Class.forName(
> > > > >>> >>          "org.apache.hadoop.hbase.codec.prefixtree.PrefixTreeCodec");
> > > > >>> >>    } catch (ClassNotFoundException e) {
> > > > >>> >>      // this will show up in unit tests but should not show in real deployments
> > > > >>> >>      LOG.warn("The hbase-prefix-tree module jar containing PrefixTreeCodec is not present." +
> > > > >>> >>          "  Continuing without it.");
> > > > >>> >>    }
> > > > >>> >>
> > > > >>> >> As a workaround, consider temporarily renaming the
> > > > >>> >> hbase-prefix-tree jar on the node where ExportSnapshot is run.
> > > > >>> >>
> > > > >>> >> Remember to rename the hbase-prefix-tree jar back after the
> > > > >>> >> job.
> > > > >>> >>
> > > > >>> >> FYI
> > > > >>> >>
> > > > >>> >>> On Fri, Apr 14, 2017 at 5:02 AM, Vasco Pinho <va...@hotjar.com> wrote:
> > > > >>> >>>
> > > > >>> >>> Sure Ted,
> > > > >>> >>>
> > > > >>> >>> Here's the full trace: https://pastebin.com/AGimZ8wZ
> > > > >>> >>> And here's the hbase-site.xml: https://pastebin.com/yhRWCsaU
> > > > >>> >>>
> > > > >>> >>> This is a working HBase HA setup which has been happily
> > > > >>> >>> chugging along: a 3-node HDFS JournalNode quorum, 3-node ZK,
> > > > >>> >>> two of them NN + backup NN, two of them HMaster + backup
> > > > >>> >>> HMaster, and then several data nodes.
> > > > >>> >>>
> > > > >>> >>> Thanks for taking a look!
> > > > >>> >>>
> > > > >>> >>> Vasco
> > > > >>> >>>
> > > > >>> >>>> On Fri, Apr 14, 2017 at 1:41 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> > > > >>> >>>>
> > > > >>> >>>> Can you show the complete stack trace ?
> > > > >>> >>>>
> > > > >>> >>>> Please pastebin the contents of hbase-site.xml
> > > > >>> >>>>
> > > > >>> >>>> Thanks
> > > > >>> >>>>
> > > > >>> >>>>> On Apr 14, 2017, at 3:51 AM, Vasco Pinho <va...@hotjar.com> wrote:
> > > > >>> >>>>>
> > > > >>> >>>>> When running:
> > > > >>> >>>>>
> > > > >>> >>>>> bin/hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot
> > > > >>> >>>>>     -Dfs.s3a.buffer.dir=/tmp/hbase_snap_tmp
> > > > >>> >>>>>     -snapshot TestTable-20170413-143020
> > > > >>> >>>>>     -copy-to s3a://bucket-backup/hbase/snapshots/
> > > > >>> >>>>>
> > > > >>> >>>>>
> > > > >>> >>>>> The operation fails with:
> > > > >>> >>>>>
> > > > >>> >>>>>
> > > > >>> >>>>> 2017-04-13 15:03:24,947 ERROR [main] snapshot.ExportSnapshot: Snapshot export failed
> > > > >>> >>>>> java.io.FileNotFoundException: File does not exist:
> > > > >>> >>>>> hdfs://clusterID/home/ubuntu/hbase-1.2.4/hbase-prefix-tree/target/hbase-prefix-tree-1.2.4.jar
> > > > >>> >>>>>       at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1072)
> > > > >>> >>>>>
> > > > >>> >>>>>
> > > > >>> >>>>> I assume it's looking for
> > > > >>> >>>>> "/home/ubuntu/hbase-1.2.4/hbase-prefix-tree/target/hbase-prefix-tree-1.2.4.jar",
> > > > >>> >>>>> which does exist, but obviously not under hdfs://. I don't
> > > > >>> >>>>> know why this file is needed or why it's being looked for
> > > > >>> >>>>> in hdfs:// instead of the local filesystem, where it does
> > > > >>> >>>>> exist. I confirmed that it looks under "fs.defaultFS" +
> > > > >>> >>>>> <hbase install path> + <relative path to the
> > > > >>> >>>>> hbase-prefix-tree jar>, although I have no idea how to fix
> > > > >>> >>>>> this. Any ideas?
> > > > >>> >>>>>
> > > > >>> >>>>>
> > > > >>> >>>>> Thanks,
> > > > >>> >>>>> Vasco Pinho
> > > > >>> >>
> > > > >>>
> > > > >>
> > > > >>
> > > > >>
> > > > >> --
> > > > >> Vasco Pinho
> > > > >> DevOps Engineer
> > > > >> www.hotjar.com
> > > > >> Connect with me on LinkedIn
> > > > >> <https://pt.linkedin.com/in/vasco-pinho-58770534>
> > > > >> IMPORTANT CONFIDENTIALITY NOTICE: This message is confidential and
> > > > >> intended for the use only of the person to whom this message is
> > > > addressed.
> > > > >> If you are not the intended recipient you are strictly prohibited
> > from
> > > > >> reading, disseminating, copying or using this message, or its
> > contents
> > > > in
> > > > >> any way and must contact the sender immediately.
> > > > >>
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > >
> > > >
> > >
> >
>
