Forgot this part of accumulo-env.sh:

export HADOOP_PREFIX="$HADOOP_HOME"
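For reference, a minimal sketch of how that line fits with the other exports (the test -z fallback is my addition, assuming the CDH parcel layout used below, where HADOOP_HOME is typically /opt/cloudera/parcels/CDH/lib/hadoop):

# accumulo-env.sh (excerpt)
export HADOOP_PREFIX="$HADOOP_HOME"
test -z "$HADOOP_PREFIX" && export HADOOP_PREFIX=/opt/cloudera/parcels/CDH/lib/hadoop
test -z "$HADOOP_CONF_DIR" && export HADOOP_CONF_DIR="$HADOOP_PREFIX/etc/hadoop"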
Also, in the example below I make the folder to log to in /var/log/accumulo; however, in the accumulo-env.sh example I have it pointing to $ACCUMULO_HOME/logs. Just change that to wherever your logs are stored.

Also, you'll need to run ssh-keygen on the master before pushing the ssh keys to your nodes (ssh-copy-id). And the hdfs user will probably have read-only access to its home folder; just make a .ssh folder in the hdfs user's home directory and give r/w/x permissions to hdfs on that folder (.ssh) only. Otherwise ssh-copy-id won't be able to update that folder.

In regard to optimization and write-ahead logs, that stuff is usually environment/application specific.

If anyone has questions or comments on this installation plan let me know; I would love to know what you're doing differently, and why.

From: user-return-3599-CHARLES.H.OTT=leidos....@accumulo.apache.org On Behalf Of Ott, Charles H.
Sent: Thursday, January 16, 2014 2:50 PM
To: user@accumulo.apache.org
Subject: RE: accumulo startup issue: Accumulo not initialized, there is no instance id at /accumulo/instance_id

Disclaimer: not advocating this as the best approach, just what I'm currently doing. Put this together pretty quick, but it should be mostly complete for setting up Accumulo on CDH HDFS/ZK.

I always do something like this first on CentOS:

$ yum install -y ntp openssh-clients unzip
# set up ssh and ntpd as needed
# install the JDK RPM

# bash this to set up OS specifics
echo "Disabling SELinux for optimal CDH compatibility..."
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config

echo "Increasing ulimit (file descriptors) for all users..."
echo "# Adding support for CDH" >> /etc/security/limits.conf
echo "* - nofile 65536" >> /etc/security/limits.conf

echo "Disabling IPv6..."
echo "# Disable ipv6" >> /etc/sysctl.conf
echo "net.ipv6.conf.all.disable_ipv6 = 1" >> /etc/sysctl.conf
echo "net.ipv6.conf.default.disable_ipv6 = 1" >> /etc/sysctl.conf

echo "Lowering swappiness to limit use of swap space..."
echo "# swappiness for accumulo" >> /etc/sysctl.conf
echo "vm.swappiness = 10" >> /etc/sysctl.conf

Reboot and test OS/services/JDK version... then I usually extract Accumulo to /opt/accumulo/accumulo-1.5.0 and make a symlink: /opt/accumulo/accumulo-current -> ./accumulo-1.5.0
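Spelled out, that extract-and-symlink step might look like this (the tarball name accumulo-1.5.0-bin.tar.gz is just an assumption, use whatever you downloaded):

mkdir -p /opt/accumulo
tar xzf accumulo-1.5.0-bin.tar.gz -C /opt/accumulo
ln -s /opt/accumulo/accumulo-1.5.0 /opt/accumulo/accumulo-current

That way your env vars and conf paths can point at accumulo-current, and an upgrade just means re-pointing one link.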
# make dirs for Accumulo logs, wherever you want them
mkdir /var/log/accumulo

# let hdfs own all your Accumulo folders
chown -R hdfs:hdfs /opt/accumulo
chown -R hdfs:hdfs /var/log/accumulo

# update the hdfs password for the next step (as root)
passwd hdfs

# set up passwordless ssh (test using hdfs afterwards; you should be able to ssh <node> w/o entering credentials)
su - hdfs
ssh-copy-id <for all tablet server nodes>

# update your iptables

# env vars
ACCUMULO_HOME=/opt/accumulo/accumulo-1.5.0
JAVA_HOME=/usr/java/default (JDK 7 in my last install worked fine)

Settings for accumulo-env.sh in /conf:

# cdh4
export HADOOP_HDFS_HOME=/opt/cloudera/parcels/CDH/lib/hadoop-hdfs
export HADOOP_MAPREDUCE_HOME=/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce
test -z "$HADOOP_CONF_DIR" && export HADOOP_CONF_DIR="$HADOOP_PREFIX/etc/hadoop"
test -z "$JAVA_HOME" && export JAVA_HOME=/usr/java/default
test -z "$ZOOKEEPER_HOME" && export ZOOKEEPER_HOME=/opt/cloudera/parcels/CDH/lib/zookeeper
test -z "$ACCUMULO_LOG_DIR" && export ACCUMULO_LOG_DIR=$ACCUMULO_HOME/logs

# update all files as appropriate in /opt/accumulo/accumulo-current/conf/*:
# masters, monitor, slaves, tracers, gc, accumulo-site.xml, accumulo-env.sh

# accumulo-site.xml
<property>
  <name>general.classpaths</name>
  <value>
    $ACCUMULO_HOME/server/target/classes/,
    $ACCUMULO_HOME/lib/accumulo-server.jar,
    $ACCUMULO_HOME/core/target/classes/,
    $ACCUMULO_HOME/lib/accumulo-core.jar,
    $ACCUMULO_HOME/start/target/classes/,
    $ACCUMULO_HOME/lib/accumulo-start.jar,
    $ACCUMULO_HOME/fate/target/classes/,
    $ACCUMULO_HOME/lib/accumulo-fate.jar,
    $ACCUMULO_HOME/proxy/target/classes/,
    $ACCUMULO_HOME/lib/accumulo-proxy.jar,
    $ACCUMULO_HOME/lib/[^.].*.jar,
    $ZOOKEEPER_HOME/zookeeper[^.].*.jar,
    $HADOOP_CONF_DIR,
    $HADOOP_PREFIX/[^.].*.jar,
    $HADOOP_PREFIX/lib/[^.].*.jar,
    $HADOOP_HDFS_HOME/.*.jar,
    $HADOOP_HDFS_HOME/lib/.*.jar,
    $HADOOP_MAPREDUCE_HOME/.*.jar,
    $HADOOP_MAPREDUCE_HOME/lib/.*.jar
  </value>
  <description>Classpaths that accumulo checks for updates and class files.
    When using the Security Manager, please remove the ".../target/classes/" values.
  </description>
</property>

Then of course, always run your Accumulo binaries/scripts using the hdfs account. I'm sure I'm missing a few steps here and there...

$ACCUMULO_HOME/bin/accumulo init
...
$ACCUMULO_HOME/bin/start-all.sh

From: user-return-3597-CHARLES.H.OTT=leidos....@accumulo.apache.org On Behalf Of Sean Busbey
Sent: Thursday, January 16, 2014 2:20 PM
To: Accumulo User List
Subject: Re: accumulo startup issue: Accumulo not initialized, there is no instance id at /accumulo/instance_id

On Thu, Jan 16, 2014 at 1:14 PM, Kesten Broughton <kbrough...@21ct.com> wrote:

> "You should make sure to correct the maximum number of open files for the user that is running Accumulo."
>
> I have the following in /etc/security/limits.conf on all nodes in my accumulo cluster:
>
> hdfs soft nofile 65536
> hdfs hard nofile 65536
>
> However, I see this for all nodes:
>
> WARN : Max files open on 10.0.11.208 is 32768, recommend 65536
>
> Should it be a different user or something? 'the user that is running Accumulo'
>
> sudo hdfs
> hdfs$ bin/accumulo -u root
>
> so is hdfs or root the accumulo user?

The user in question here is the one who starts the Accumulo server processes. In production environments this should be a user dedicated to Accumulo. FWIW, I usually name this user "accumulo".

How do you start up Accumulo? A service script? Running $ACCUMULO_HOME/bin/start-all.sh? Something else?
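A quick way to check what limit is actually in effect for the account that launches the daemons (assuming that account is hdfs, as earlier in the thread) is to open a fresh login shell. limits.conf is applied through PAM at login, so processes started from an init script or a stale session can still be running with the old 32768:

su - hdfs -c 'ulimit -Sn; ulimit -Hn'
# both should print 65536 once limits.conf has taken effect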