We definitely need to move to ZooKeeper (or, even better, an abstraction
over it that uses ZooKeeper as one implementation but can be swapped for
other cluster state managers as well).
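A minimal sketch of what such an abstraction could look like (all names here
are hypothetical; nothing like this exists in the codebase yet):

    import java.util.List;

    // Hypothetical abstraction Managix would program against; ZooKeeper would
    // be just one implementation behind it.
    public interface IClusterStateManager {

        enum NodeType { CC, NC }

        // A CC/NC announces itself on startup (e.g. by creating an ephemeral znode).
        void registerNode(String instanceName, String nodeId, NodeType type) throws Exception;

        // Explicit removal on clean shutdown; an ephemeral registration would also
        // disappear automatically if the process dies.
        void deregisterNode(String instanceName, String nodeId) throws Exception;

        // What Managix queries instead of parsing ps output on the remote machines.
        List<String> getLiveNodes(String instanceName) throws Exception;
    }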
On 3/4/16 2:24 PM, Raman Grover wrote:
Managix attempts to gather information on the daemons (CC/NC) to validate
that the processes started successfully. It does so by extracting the process
IDs. However, there is a bug wherein the extraction of process IDs (based
on grep) fails, which leads to a false alarm.
Managix is then also unable to shut down the AsterixDB instance, since it
could not extract the process IDs (it issues a kill -9 to shut down the
daemons associated with an instance).
The mechanism is vulnerable to the layout of the ps command's output. We
need a more robust way of collecting the process IDs and determining the
status of the remote processes (CC/NC). One option is to have the NCs and
the CC follow a cluster membership protocol by registering themselves as
znodes with the existing ZooKeeper instances; Managix could then query
ZooKeeper to extract the required information about the launched processes.
Other suggestions are welcome.
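As a rough illustration of that idea (a sketch only, using the plain
org.apache.zookeeper client; the path layout and the ClusterMembership class
are made up, not existing Managix code): each daemon creates an ephemeral
znode under a per-instance path when it starts, and Managix lists that path
to see which daemons are alive.

    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.ZooDefs;
    import org.apache.zookeeper.ZooKeeper;
    import java.util.List;

    // Sketch only: path layout and class name are hypothetical.
    public class ClusterMembership {
        private final ZooKeeper zk;

        public ClusterMembership(String zkConnectString) throws Exception {
            // 30s session timeout; no watcher logic needed for this sketch.
            this.zk = new ZooKeeper(zkConnectString, 30000, event -> {});
        }

        // Called by a CC or NC right after startup. Assumes the parent path
        // (e.g. /asterix/<instance>/live) was created as persistent znodes
        // when the instance was created. The ephemeral node vanishes
        // automatically if the process dies, so liveness comes for free.
        public void register(String instanceName, String nodeId) throws Exception {
            String path = "/asterix/" + instanceName + "/live/" + nodeId;
            zk.create(path, new byte[0], ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
        }

        // What Managix would call instead of grepping ps output over SSH.
        public List<String> liveNodes(String instanceName) throws Exception {
            return zk.getChildren("/asterix/" + instanceName + "/live", false);
        }
    }

With something like this, "managix describe" and "managix stop" would no
longer depend on the formatting of ps output at all.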
Regards,
Raman
On Fri, Mar 4, 2016 at 2:13 PM, Yingyi Bu <[email protected]> wrote:
I always get this warning on my Mac when I do "managix start".
Everything works except that "managix stop" can't kill the CC.
Best,
Yingyi
On Fri, Mar 4, 2016 at 2:09 PM, Ian Maxon <[email protected]> wrote:
Yes, it might be a false alarm as Wail noted.
What is the content of the 3 log files, though? Are they just empty?
logs/execute.log should show you exactly which commands Managix executed;
you can try some of them by hand to see where things may be going awry in
the startup process.
On Fri, Mar 4, 2016 at 5:21 AM, Wail Alkowaileet <[email protected]>
wrote:
The message can be misleading.
Can you open http://127.0.0.1:19001 and try some queries?
On Mar 4, 2016 14:05, "Veeral Shah" <[email protected]> wrote:
A newbie issue. I deployed the GitHub master branch.
After a successful build, I configured AsterixDB to run a single instance
(using Managix), as documented at
https://asterixdb.ics.uci.edu/documentation/install.html#Section1SingleMachineAsterixDBInstallation
But the cluster controller fails to start, and the logs don't reveal anything
about the error. It appears to be some obvious mistake, but in the absence of
error messages I'm finding it tough to triage. I see 3 log files (I don't
understand why it starts 2 NCs and a CC when I am running a single-instance
AsterixDB).
root@ubuntu205:/home/veerals/work/installer# managix configure
root@ubuntu205:/home/veerals/work/installer# managix validate
INFO: Environment [OK]
INFO: Managix Configuration [OK]
root@ubuntu205:/home/veerals/work/installer# $MANAGIX_HOME/bin/managix create -n try1 -c /home/veerals/work/installer/clusters/local/local.xml
INFO: Name:try1
Created:Fri Mar 04 15:22:22 IST 2016
Web-Url:http://127.0.0.1:19001
State:UNUSABLE
WARNING!:Cluster Controller not running at master
root@ubuntu205:/home/veerals/work/installer# managix describe -n try1 -admin
INFO: Name:try1
Created:Fri Mar 04 15:22:22 IST 2016
Web-Url:http://127.0.0.1:19001
State:UNUSABLE
WARNING!:Cluster Controller not running at master
Master node:master:127.0.0.1
nc1:127.0.0.1
nc2:127.0.0.1
Asterix version:0.8.8-SNAPSHOT
Metadata Node:nc1
Processes
NC at nc1 [ 16781 ]
NC at nc2 [ 16777 ]
Asterix Configuration
nc.java.opts :-Xmx3096m
cc.java.opts :-Xmx1024m
max.wait.active.cluster :60
storage.buffercache.pagesize :131072
storage.buffercache.size :536870912
storage.buffercache.maxopenfiles :214748364
storage.memorycomponent.pagesize :131072
storage.memorycomponent.numpages :256
storage.metadata.memorycomponent.numpages:64
storage.memorycomponent.numcomponents :2
storage.memorycomponent.globalbudget :1073741824
storage.lsm.bloomfilter.falsepositiverate:0.01
txn.log.buffer.numpages :8
txn.log.buffer.pagesize :524288
txn.log.partitionsize :2147483648
txn.log.checkpoint.lsnthreshold :67108864
txn.log.checkpoint.pollfrequency :120
txn.log.checkpoint.history :0
txn.lock.escalationthreshold :1000
txn.lock.shrinktimer :5000
txn.lock.timeout.waitthreshold :60000
txn.lock.timeout.sweepthreshold :10000
compiler.sortmemory :33554432
compiler.joinmemory :33554432
compiler.framesize :131072
compiler.pregelix.home :~/pregelix
web.port :19001
api.port :19002
log.level :INFO
plot.activate :false
The log files don't reveal much:
Thanks and regards
Veeral Shah