Hi Krish,
I got similar errors when I changed the `hbase.rootdir` value to an HDFS
filesystem such as `hdfs://xxxx/tmp/hbase-root`.
It was solved when I did the following:
1. Configure Atlas authentication; see more detail at
https://atlas.apache.org/#/Authentication
My HBase requires Kerberos at my company, so I added Kerberos authentication:
atlas.authentication.method.kerberos=true
atlas.authentication.method.kerberos.principal=[email protected]
atlas.authentication.method.kerberos.keytab=/home/atlas/atlas.keytab
atlas.authentication.method.kerberos.name.rules=RULE:[2:$1@$0]([email protected])s/.*/atlas/
atlas.authentication.method.kerberos.token.validity=3600
atlas.authentication.principal=[email protected]
atlas.authentication.keytab=/home/atlas/atlas.keytab
2. For connecting to HBase, sometimes you need to add
hdfs-site.xml, core-site.xml, mapred-site.xml, and yarn-site.xml to the Atlas
classpath, e.g. the directory pointed to by 'HBASE_CONF_DIR' or
'{ATLAS_HOME}/conf', as sketched below.
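A rough sketch of what that can look like (a sketch only; /etc/hadoop/conf and
/etc/hbase/conf are example paths, adjust them to your layout):
```
# Option A: copy the Hadoop client configs into the Atlas conf dir
cp /etc/hadoop/conf/core-site.xml /etc/hadoop/conf/hdfs-site.xml \
   /etc/hadoop/conf/mapred-site.xml /etc/hadoop/conf/yarn-site.xml \
   ${ATLAS_HOME}/conf/

# Option B: point Atlas at the HBase conf dir via ${ATLAS_HOME}/conf/atlas-env.sh
export HBASE_CONF_DIR=/etc/hbase/conf
```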
[email protected]
From: Krish I
Date: 2020-03-27 16:26
To: user
Subject: Atlas with JanusGraph on HBase backed with S3 giving errors
Hi,
I am trying to set up Atlas on a K8s cluster in AWS with HBase backed by S3.
Everything works fine when I point the `hbase.rootdir` value to the
local filesystem as `file:///tmp/hbase-root`.
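For reference, the only configuration change between the working and failing
runs is the `hbase.rootdir` property in hbase-site.xml; a minimal sketch (the
bucket and path below are placeholders):
```
<!-- hbase-site.xml (sketch) -->
<property>
  <name>hbase.rootdir</name>
  <!-- working local run: file:///tmp/hbase-root -->
  <value>s3://bucket/path</value>
</property>
```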
When I change this to an S3 URI such as `s3://bucket/path`, Atlas fails at
startup with a very generic error:
```
2020-03-27 08:20:55,293 INFO - [main:] ~ Loading atlas-application.properties
from file:/apache-atlas-2.0.0/conf/atlas-application.properties
(ApplicationProperties:123)
2020-03-27 08:20:55,299 INFO - [main:] ~ Using graphdb backend
'org.apache.atlas.repository.graphdb.janus.AtlasJanusGraphDatabase'
(ApplicationProperties:273)
2020-03-27 08:20:55,299 INFO - [main:] ~ Using storage backend 'hbase2'
(ApplicationProperties:284)
2020-03-27 08:20:55,299 INFO - [main:] ~ Using index backend 'solr'
(ApplicationProperties:295)
2020-03-27 08:20:55,300 INFO - [main:] ~ Setting solr-wait-searcher property
'true' (ApplicationProperties:301)
2020-03-27 08:20:55,300 INFO - [main:] ~ Setting index.search.map-name
property 'false' (ApplicationProperties:305)
2020-03-27 08:20:55,301 INFO - [main:] ~ Property (set to default)
atlas.graph.cache.db-cache = true (ApplicationProperties:318)
2020-03-27 08:20:55,301 INFO - [main:] ~ Property (set to default)
atlas.graph.cache.db-cache-clean-wait = 20 (ApplicationProperties:318)
2020-03-27 08:20:55,301 INFO - [main:] ~ Property (set to default)
atlas.graph.cache.db-cache-size = 0.5 (ApplicationProperties:318)
2020-03-27 08:20:55,301 INFO - [main:] ~ Property (set to default)
atlas.graph.cache.tx-cache-size = 15000 (ApplicationProperties:318)
2020-03-27 08:20:55,301 INFO - [main:] ~ Property (set to default)
atlas.graph.cache.tx-dirty-size = 120 (ApplicationProperties:318)
2020-03-27 08:20:55,316 INFO - [main:] ~
########################################################################################
Atlas Server (STARTUP)
project.name: apache-atlas
project.description: Metadata Management and Data Governance
Platform over Hadoop
build.user: root
build.epoch: 1585085591537
project.version: 2.0.0
build.version: 2.0.0
vc.revision: release
vc.source.url: scm:git:git://git.apache.org/atlas.git/atlas-webapp
########################################################################################
(Atlas:215)
2020-03-27 08:20:55,316 INFO - [main:] ~ >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
(Atlas:216)
2020-03-27 08:20:55,316 INFO - [main:] ~ Server starting with TLS ? false on
port 21000 (Atlas:217)
2020-03-27 08:20:55,316 INFO - [main:] ~ <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
(Atlas:218)
2020-03-27 08:20:55,961 INFO - [main:] ~ No authentication method configured.
Defaulting to simple authentication (LoginProcessor:102)
2020-03-27 08:20:56,078 WARN - [main:] ~ Unable to load native-hadoop library
for your platform... using builtin-java classes where applicable
(NativeCodeLoader:60)
2020-03-27 08:20:56,100 INFO - [main:] ~ Logged in user root (auth:SIMPLE)
(LoginProcessor:77)
2020-03-27 08:20:56,703 INFO - [main:] ~ Not running setup per configuration
atlas.server.run.setup.on.start. (SetupSteps$SetupRequired:189)
2020-03-27 08:20:58,679 WARN - [main:] ~ Cannot locate configuration: tried
hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties
(MetricsConfig:134)
2020-03-27 08:23:08,702 WARN - [main:] ~ Unexpected exception during
getDeployment() (HBaseStoreManager:399)
java.lang.RuntimeException:
org.janusgraph.diskstorage.TemporaryBackendException: Temporary failure in
storage backend
at
org.janusgraph.diskstorage.hbase2.HBaseStoreManager.getDeployment(HBaseStoreManager.java:358)
at
org.janusgraph.diskstorage.hbase2.HBaseStoreManager.getFeatures(HBaseStoreManager.java:397)
at
org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.<init>(GraphDatabaseConfiguration.java:1256)
at
org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:160)
at
org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:131)
at
org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:111)
at
org.apache.atlas.repository.graphdb.janus.AtlasJanusGraphDatabase.getGraphInstance(AtlasJanusGraphDatabase.java:165)
at
org.apache.atlas.repository.graphdb.janus.AtlasJanusGraphDatabase.getGraph(AtlasJanusGraphDatabase.java:263)
at
org.apache.atlas.repository.graph.AtlasGraphProvider.getGraphInstance(AtlasGraphProvider.java:52)
at
org.apache.atlas.repository.graph.AtlasGraphProvider.get(AtlasGraphProvider.java:98)
at
org.apache.atlas.repository.graph.AtlasGraphProvider$$EnhancerBySpringCGLIB$$b936b499.CGLIB$get$1(<generated>)
at
org.apache.atlas.repository.graph.AtlasGraphProvider$$EnhancerBySpringCGLIB$$b936b499$$FastClassBySpringCGLIB$$fd3f07c6.invoke(<generated>)
at
org.springframework.cglib.proxy.MethodProxy.invokeSuper(MethodProxy.java:228)
...
...
...
at
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
at
org.apache.atlas.web.service.EmbeddedServer.start(EmbeddedServer.java:98)
at org.apache.atlas.Atlas.main(Atlas.java:133)
Caused by: org.janusgraph.diskstorage.TemporaryBackendException: Temporary
failure in storage backend
at
org.janusgraph.diskstorage.hbase2.HBaseStoreManager.ensureTableExists(HBaseStoreManager.java:732)
at
org.janusgraph.diskstorage.hbase2.HBaseStoreManager.getLocalKeyPartition(HBaseStoreManager.java:518)
at
org.janusgraph.diskstorage.hbase2.HBaseStoreManager.getDeployment(HBaseStoreManager.java:355)
... 92 more
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed
after attempts=16, exceptions:
Fri Mar 27 08:20:59 UTC 2020, RpcRetryingCaller{globalStartTime=1585297258222,
pause=100, maxAttempts=16}, org.apache.hadoop.hbase.PleaseHoldException:
org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
...
...
...
```
The main bone of contention, I think, is this line: `2020-03-27 08:20:58,679 WARN
- [main:] ~ Cannot locate configuration: tried
hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties
(MetricsConfig:134)`, which I do not see when running with the local rootdir.
Please note that I am not running Hadoop/HDFS for now and do not intend to; we
will guarantee consistency to S3 in the future using other tech, or move to
DynamoDB or something else.
Please also note that I restarted the ZK, HBase, and Atlas instances numerous
times for testing during this setup, so the only persistence I have is the
HBase initialization data stored in S3.
However, the HBase master and regionserver logs themselves show no errors at all
when running with S3 as the rootdir. I have also dropped into an hbase shell and
checked the corresponding HBase web interfaces to confirm that HBase with S3 is
working fine; a rough sketch of those checks is below.
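The shell checks were along these lines (a sketch; output elided, and `t1` is
just a throwaway table name):
```
# from the HBase master pod
hbase shell
status                     # master and regionserver liveness
list                       # tables visible under the S3 rootdir
create 't1', 'cf'          # simple write/read sanity check
put 't1', 'r1', 'cf:a', 'v1'
scan 't1'
```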
This makes me believe that there is an issue in the JanusGraph <-> HBase
interaction, which I am not sure how to debug. I have soft-linked the
`/atlas/conf/hbase` directory to the actual HBase conf dir at `/hbase/conf`,
roughly as below.
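(For completeness, the link is simply this; the paths are the ones from my setup:)
```
ln -s /hbase/conf /atlas/conf/hbase
```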
Any pointers will be helpful here, as I think I am going in blind over here. :)
Best,
Krish