Hello,

I have developed a set of procedures that simplify the operation and
maintenance of DS (DolphinScheduler).


I maintain a TiDB cluster and a CM/CDH cluster day to day, and that work gave
me some inspiration.


The program is developed with Ansible.


My current idea is based on DevOps practice and does not make any adjustments
to the architecture.


Features:
Fully automatic installation of all roles (JDK, DB, ZK, big data related
  components, DS related components, log collection).
Online and offline installation. (For offline installation, manually download
  the package I prepared and put it in the specified directory; no
  verification is needed.)
Start and stop of individual roles.
Remote modification of role configuration.
Integration with big data components.
Rolling upgrade of the cluster and upgrade of a single role.
Cluster expansion and reduction (for example, switching from a cluster to a
  single node).
Log collection based on Graylog.
Cluster destruction.
Adjustments:
Split the configuration files into two categories, main and other. The main
  settings are what is needed to run; the others are used for tuning. (No
  configuration items are added or deleted; this only tells users which
  settings take priority.)
Ignore the previous install.
Use the native ZK method instead of Kazoo.
Add patches to the original start and stop scripts.
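
A minimal sketch of the main/other split described above, written in Python
rather than Ansible so that it runs on its own. The key names (db.url,
zk.quorum, worker.exec.threads) and the function split_config are hypothetical
and are not taken from ds-yibasuo or DolphinScheduler. The sketch only shows
how settings could be sorted into the two categories and how a missing main
setting could be reported to the user.

```python
from typing import Dict, List, Set, Tuple


def split_config(
    props: Dict[str, str], main_keys: Set[str]
) -> Tuple[Dict[str, str], Dict[str, str], List[str]]:
    """Split settings into 'main' (needed to run) and 'other' (tuning).

    Returns (main, other, missing), where 'missing' lists the main keys
    that are absent or empty, sorted so the report is stable.
    """
    main = {k: v for k, v in props.items() if k in main_keys}
    other = {k: v for k, v in props.items() if k not in main_keys}
    missing = sorted(k for k in main_keys if not props.get(k))
    return main, other, missing


if __name__ == "__main__":
    # Hypothetical settings: one main key is set, one is missing.
    settings = {"db.url": "jdbc:postgresql://db/ds", "worker.exec.threads": "100"}
    main, other, missing = split_config(settings, {"db.url", "zk.quorum"})
    for key in missing:
        print(f"main setting not configured: {key}")
```

Nothing is added to or removed from the settings; the split only changes which
ones the user is asked to look at first.
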
Advantages:
The previous installation method only installs; the user does not get a
  correct error prompt during installation or other operations. This program
  automatically avoids various minor problems during installation and tells
  the user what the error is.
It prepares for future rolling upgrades.
When users need to troubleshoot or upgrade, they can describe their cluster
  status simply and comprehensively.
These are my current thoughts. Implemented so far: [Automatic installation
70%] [Remote role configuration modification 100%] [Start and stop of
individual roles 100%]. The program is available at
https://github.com/feloxx/ds-yibasuo
Everyone is welcome to criticize and fix it.

chendapao

[email protected] 

On 12/11/2019 20:26, leon bao <[email protected]> wrote:
As an open source project, I think it's important to stay open and
extensible.
Our current alert service has few shortcomings apart from scanning the
database with a single thread. At the same time, running the alert service as
a separate module provides better scalability.
So I don't think it is necessary to merge alert into other modules when there
is little benefit.

guo jiwei <[email protected]> wrote on Wed, Dec 11, 2019 at 7:33 PM:

To xiaochun.
That is not a good approach.
An alert must be triggered by whoever schedules the task; in DS, that is the
MasterServer, because only the scheduler can know the task status in time. If
a task times out, triggering the timeout alert promptly is very important for
users.
This is why we have to move alert into the server module.
The alert implementation will be refactored in the future so that it no longer
scans the DB.
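
To make the latency argument concrete, here is a rough Python sketch of how
long a timeout can go unnoticed when a separate service scans the DB at a
fixed interval. The function names and the 30-second interval are assumptions
for illustration only and are not values from DolphinScheduler. When the
scheduler itself raises the alert, the delay from scanning is zero.

```python
import math


def polling_notice_time(
    event_time: float, poll_interval: float, first_poll: float = 0.0
) -> float:
    """Return the time of the first scan at or after event_time."""
    if poll_interval <= 0:
        raise ValueError("poll_interval must be positive")
    if event_time <= first_poll:
        return first_poll
    # Number of whole intervals needed to reach or pass the event.
    polls_needed = math.ceil((event_time - first_poll) / poll_interval)
    return first_poll + polls_needed * poll_interval


def polling_alert_delay(
    event_time: float, poll_interval: float, first_poll: float = 0.0
) -> float:
    """Delay between the event and the scan that first sees it."""
    return polling_notice_time(event_time, poll_interval, first_poll) - event_time


if __name__ == "__main__":
    # A task times out at t=61s; the scanner runs at t=0, 30, 60, 90, ...
    print(polling_alert_delay(61, 30))
```

In this model the delay is always shorter than one scan interval, so
shortening the interval reduces the delay at the cost of more DB scans.
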

On Wed, Dec 11, 2019 at 7:26 PM Xiaochun Liu <[email protected]>
wrote:

To guo jiwei:

Why not put it together with the API server? The alert server's function is
very small, and its load will not be very high. If logs are stored together in
the future, we can combine the alert server, log server, and API server;
together these could be called the management server.

Best Regards
---------------
DolphinScheduler(Incubator) Committer
Xiaochun Liu 刘小春
[email protected]
---------------



On Dec 11, 2019, at 7:15 PM, guo jiwei <[email protected]> wrote:

To ligang.
That's right.
But the alert server is only a small function, and we define it as an
individual module and as a server. Do you think alert is expensive or takes
many resources? If not, why make it a separate module?
Also, the alert server triggers task events by scanning the DB. Do you think
that is a good approach?
Moving it into the server module is only our first step toward simplifying
user deployment. Extensions to alert can be rolled out by redeploying the
server, and that is not a frequent operation.
As the architecture changes, the alert implementation will change.


On Wed, Dec 11, 2019 at 6:23 PM 李 岗 <[email protected]> wrote:

Looking at it from another angle: Master and Worker are key services, so I
think they should not be redeployed during normal execution.
If tasks are still running, redeploying the master and worker may cause
scheduled tasks to be missed.

________________________________
DolphinScheduler(Incubator) PPMC
Gang Li 李岗

[email protected]<mailto:[email protected]>

From: guo jiwei<mailto:[email protected]>
Date: 2019-12-11 18:11
To: dev<mailto:[email protected]>
Subject: Re: Aproposal for DolphinScheduler Simplified Deployment
To ligang.
Redeploying is simple, but what about the latency of alerts?
It's easy to redeploy the master server to update alert.


On Wed, Dec 11, 2019 at 6:03 PM 李 岗 <[email protected]> wrote:

I think the alert module can be retained. Currently it only supports email and
WeChat, but more alarm modes can be added in the future.

At present, alert is an independent service. The alert service only consumes
alarm information from the database; other services produce that alarm
information.
If a new alarm mode is added, only the alert service needs to be redeployed.



________________________________
DolphinScheduler(Incubator) PPMC
Gang Li 李岗

[email protected]<mailto:[email protected]>

From: qiao zhanwei<mailto:[email protected]>
Date: 2019-12-10 14:24
To: dev<mailto:[email protected]>
Subject: Aproposal for DolphinScheduler Simplified Deployment

Hello all,

DolphinScheduler now has a large number of configuration files,

for example:

dolphinscheduler-alert :
alert.properties

dolphinscheduler-api :
application-api.properties
application-combined.properties


dolphinscheduler-common :
hadoop.properties
common.properties
quartz.properties
zookeeper.properties

dolphinscheduler-dao :
application-dao.properties

dolphinscheduler-server :
application-master.properties
application-worker.properties
master.properties
worker.properties

.dolphinscheduler_env.sh

Can we simplify deployment?

Main points:

1 simplify and merge the configuration files
2 remove the port from the master server
3 support offline installation and remove the kazoo dependencies in install
and monitor
4 simplify the install.sh script
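
As a rough illustration of point 1, the Python sketch below merges several
.properties files into one and reports keys that are defined with different
values in more than one file. The file names and keys in the example are made
up, the parser handles only simple key=value lines (no ':' separators or line
continuations), and none of this is taken from the DolphinScheduler code base.

```python
from typing import Dict, List, Tuple


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blanks and # or ! comments."""
    props: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def merge_properties(
    files: Dict[str, str]
) -> Tuple[Dict[str, str], Dict[str, List[str]]]:
    """Merge files in order; the first value seen for a key is kept.

    Returns (merged, conflicts), where conflicts maps a key to the files
    that define it with differing values.
    """
    merged: Dict[str, str] = {}
    origin: Dict[str, str] = {}
    conflicts: Dict[str, List[str]] = {}
    for name, text in files.items():
        for key, value in parse_properties(text).items():
            if key in merged and merged[key] != value:
                conflicts.setdefault(key, [origin[key]]).append(name)
                continue
            merged.setdefault(key, value)
            origin.setdefault(key, name)
    return merged, conflicts


if __name__ == "__main__":
    merged, conflicts = merge_properties(
        {"a.properties": "x=1\ny=2", "b.properties": "# tuning\ny=3\nz=4"}
    )
    print(merged, conflicts)
```

Reporting conflicts instead of silently overwriting keeps the merge safe to
run before anyone has decided which file should win.
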


—————————————
DolphinScheduler(Incubator)  PPMC
Zhanwei Qiao 乔占卫

[email protected]







--
DolphinScheduler(Incubator)  PPMC
BaoLiang 鲍亮
[email protected]

