Hey Guys,
I have been trying to configure a GAAS-based setup and have the following
observations:
1) I am using gobblin-service.sh start to start the Orchestrator
application, but I don't see it getting started. I see the following in
the master.out log:
WARN [NativeCodeLoader] Unable to load native-hadoop library for your
platform... using builtin-java classes where applicable
ERROR [GobblinServiceManager] Callback error:
gobblin.util.callbacks.CallbacksDispatcher$CallbackCallable@1f51431 on
gobblin.service.modules.orchestration.Orchestrator@66f659e6
:java.lang.UnsupportedOperationException
I can see that gobblin-service.sh expects HADOOP_HOME to be set, in the
following line:
CLASSPATH=${FWDIR_CONF}:${GOBBLIN_JARS}:${SERVICE_CONF_DIR}:${HADOOP_HOME}/lib
I am not sure what needs to be set here. I did set HADOOP_HOME to the
hadoop-2.7.2 directory, but I still see the same issue.
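As a sanity check for the HADOOP_HOME part, here is a minimal sketch that reports whether a classpath directory entry exists and how many jars sit directly in it. The helper name check_cp_dir is hypothetical and not part of Gobblin. One thing worth keeping in mind: on a Java classpath, a plain directory entry such as ${HADOOP_HOME}/lib only picks up .class files and resources in that directory, not the jars inside it; jars need to be listed individually or through a wildcard entry such as lib/*. Whether that is related to the error above is only a guess.

```shell
# Hypothetical helper, not part of Gobblin: inspect one classpath directory entry.
check_cp_dir() {
  dir="$1"
  if [ ! -d "$dir" ]; then
    echo "missing: $dir"
    return 1
  fi
  # Count jar files directly inside the directory (not recursive).
  count=$(find "$dir" -maxdepth 1 -name '*.jar' -type f | wc -l | tr -d ' ')
  echo "ok: $dir ($count jars)"
}
```

Usage would be something like: check_cp_dir "${HADOOP_HOME}/lib"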
2) Running $GOBBLIN_HOME/bin/gobblin-service.sh start does start the
GobblinServiceManager process. I found the port the process listens on using:
vicky@vicky-Latitude-E5570:~/git/bpuholdings/dip/dip-dev/runtime/gobblin-dist/bin$ netstat -a -p | grep 1460
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp6       0      0 [::]:48039      [::]:*      LISTEN      1460/java
unix  2      [ ]         STREAM     CONNECTED     2557156  1460/java
unix  2      [ ]         STREAM     CONNECTED     2549041  1460/java
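The same lookup can be scripted. This is a minimal sketch, assuming "netstat -ltnp" style output (one socket per line, with the state in column 6 and PID/program in column 7) on stdin; listen_ports is a hypothetical helper name.

```shell
# Hypothetical helper: print the TCP ports a given PID is listening on,
# reading `netstat -ltnp` style lines from stdin.
listen_ports() {
  pid="$1"
  awk -v pid="$pid" '
    $6 == "LISTEN" && $7 ~ ("^" pid "/") {
      # Local address looks like [::]:48039, :::48039 or 0.0.0.0:48039;
      # the port is whatever follows the last colon.
      n = split($4, parts, ":")
      print parts[n]
    }'
}
```

Usage would be something like: netstat -ltnp 2>/dev/null | listen_ports 1460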
The server is listening on port 48039 but does not list the JobFlows
(http://localhost:48039/flowconfigs). Since I am starting it for the first
time there are no JobFlows yet, so I would expect an empty list. I looked at
the implementation of FlowConfigsResource, which does not implement
listing of the FlowConfigs by default.
I would propose adding a GET call for listing all the Flow Specs. There
should also be a way to configure the default port for the Orchestrator
(i.e. GobblinServiceManager).
3) I tweaked log4j-cluster.properties in GOBBLIN_HOME/conf/service, but
the changes do not seem to be picked up from there. I am unable to find out
how the log4j properties are loaded from the log4j-cluster.properties file;
I would expect them to live in a file named log4j.properties. I tried
renaming log4j-cluster.properties to log4j.properties, but did not get the
expected result. Can someone explain how this file is used?
4) I did not configure the properties in
GOBBLIN_HOME/conf/service/application.conf and tried to bring up the
Gobblin cluster by calling "./gobblin-cluster-master.sh start" and
"./gobblin-cluster-worker.sh start". This did not work; both produce errors.
When I call gobblin-cluster-master.sh I see the following error in
master.out:
Exception in thread "main" java.io.FileNotFoundException: log4j-cluster.properties (No such file or directory)
at java.io.FileInputStream.open0(Native Method)
at java.io.FileInputStream.open(FileInputStream.java:195)
at java.io.FileInputStream.<init>(FileInputStream.java:138)
at java.io.FileInputStream.<init>(FileInputStream.java:93)
at gobblin.util.logs.Log4jConfigurationHelper.updateLog4jConfiguration(Log4jConfigurationHelper.java:51)
at gobblin.cluster.GobblinClusterManager.main(GobblinClusterManager.java:723)
So in this case log4j-cluster.properties is the file that is being read.
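Since the exception names a bare relative file name, and java.io.FileInputStream resolves relative paths against the JVM's working directory, one thing to check is which directory the file is actually visible from. A minimal sketch, with resolves_from as a hypothetical helper name; which working directory the start script really uses is an assumption I have not verified.

```shell
# Hypothetical helper: report whether a relative file name resolves
# from a given working directory.
resolves_from() {
  workdir="$1"
  name="$2"
  if [ -f "$workdir/$name" ]; then
    echo "found: $workdir/$name"
  else
    echo "not found from: $workdir"
  fi
}
```

For example, comparing resolves_from "$PWD" log4j-cluster.properties with resolves_from "$GOBBLIN_HOME/conf/service" log4j-cluster.properties would show whether the file is only visible from the conf directory.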
Can someone jot down the steps to get GAAS up?
Here are the steps that I think are needed:
1) Bring the Orchestrator up by invoking
$GOBBLIN_HOME/bin/gobblin-service.sh start
2) Bring the Gobblin Cluster up by invoking
$GOBBLIN_HOME/bin/gobblin-cluster-master.sh start
3) Bring the Gobblin Cluster workers up by invoking
$GOBBLIN_HOME/bin/gobblin-cluster-worker.sh start
For all of the above steps we also need to configure Helix, Kafka, etc.
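The order I have in mind can be sketched as a small wrapper. This is only my assumed sequence, not a confirmed procedure; start_gaas and the DRY_RUN variable are hypothetical names, and the Helix/Kafka configuration is not covered here.

```shell
# Sketch of the proposed start order; script names are the ones listed above.
# With DRY_RUN=1 the commands are only printed, not executed.
start_gaas() {
  home="$1"
  for script in gobblin-service.sh gobblin-cluster-master.sh gobblin-cluster-worker.sh; do
    cmd="$home/bin/$script"
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "$cmd start"
    else
      # Stop at the first script that fails so later steps do not hide the error.
      "$cmd" start || { echo "failed: $cmd" >&2; return 1; }
    fi
  done
}
```

Usage would be something like: DRY_RUN=1 start_gaas "$GOBBLIN_HOME" to preview the commands first.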
Regards,
Vicky