I followed the troubleshooting instructions at:

After logging into the ZooKeeper instances, I found that the .bash_profile file
looks standard and contains no reference to $ZOOKEEPER_HOME:

[root@ip-172-27-32-153 ec2-user]# cat .bash_profile
# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi

# User specific environment and startup programs


export PATH

So it appears the deployment did not configure the ZooKeeper instances at all.
Does anyone know how to fix this?
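
One way to check whether the setup script ever ran is to send a few read-only
commands to each ZooKeeper node through SSM Run Command, the same channel that
appears in the deploy log (AWS-RunShellScript). The following is only a sketch:
run_on_instance and DIAGNOSTIC_COMMANDS are hypothetical names, the ssm argument
is assumed to be a boto3.client("ssm"), and it assumes the prepare script is
started from EC2 user data, so that its output would land in
/var/log/cloud-init-output.log.

```python
import time

# SSM statuses that mean the command has not finished yet.
PENDING_STATES = {"Pending", "InProgress", "Delayed", "Cancelling"}

# Hypothetical read-only checks; the paths are taken from the log in this thread.
# The cloud-init log location assumes the script is launched from user data.
DIAGNOSTIC_COMMANDS = [
    "ls -ld /home/ec2-user/hadoop || echo 'hadoop dir missing'",
    "ls -l /home/ec2-user/prepare-ec2-env-for-zk.sh",
    "tail -n 40 /var/log/cloud-init-output.log",
]


def run_on_instance(ssm, instance_id, commands, timeout_s=120, poll_s=2):
    """Run shell commands on one instance through SSM and return the result dict."""
    sent = ssm.send_command(
        InstanceIds=[instance_id],
        DocumentName="AWS-RunShellScript",
        Parameters={"commands": list(commands)},
    )
    command_id = sent["Command"]["CommandId"]
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        time.sleep(poll_s)  # give SSM time to register the invocation
        result = ssm.get_command_invocation(
            CommandId=command_id, InstanceId=instance_id
        )
        if result["Status"] not in PENDING_STATES:
            return result
    raise TimeoutError(f"command {command_id} did not finish on {instance_id}")


# Example usage (needs AWS credentials and boto3 installed):
#   import boto3
#   ssm = boto3.client("ssm")
#   result = run_on_instance(ssm, "i-0cbc37f83c9cda006", DIAGNOSTIC_COMMANDS)
#   print(result["StandardOutputContent"])
#   print(result["StandardErrorContent"])
```

If the cloud-init log shows no trace of prepare-ec2-env-for-zk.sh, that would
support the guess that the script was uploaded but never executed.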

Any help appreciated.

On Wed, 7 Feb 2024 at 04:33, John W <mag3...@gmail.com> wrote:

> Hi, I'm having problems deploying the kylin4_on_cloud project located at:
> https://github.com/apache/kylin/tree/kylin4_on_cloud
> I've also been following the instructions here
> https://www.youtube.com/watch?v=5kKXEMjO1Sc&ab_channel=Kyligence
> I used Windows to git clone the repo and set up the venv with the latest
> packages via:
> pip install PyYAML
> pip install boto3
> pip install botocore
> pip install pyparsing
> pip install requests
> pip install retrying
> pip install Jinja2
> pip install pytest-shutil
> I also changed the RDSEngineVersion to 8.0.35 in kylin_configs.yaml, as
> RDSEngineVersion 5.7.25 (default repo version) was giving me the error
> "Exception: Current stack: ec2-rds-stack is create failed, please check".
> Here's the log with the error I am now getting:
> ==========================================================================
> (venv) C:\projects\kylin4_on_cloud>python deploy.py --type deploy --mode job
> 2024-02-07 02:13:54 - botocore.credentials - INFO - 5484 - Found
> credentials in shared credentials file: ~/.aws/credentials
> 2024-02-07 02:13:57 - engine - INFO - 5484 - Env already inited, skip init
> again.
> 2024-02-07 02:13:58 - clouds.aws - WARNING - 5484 - Current env for
> deploying a cluster is not ready.
> 2024-02-07 02:14:20 - instances.aws_instance - INFO - 5484 - Now creating
> stack: ec2-or-emr-vpc-stack.
> 2024-02-07 02:16:42 - instances.aws_instance - INFO - 5484 - Now creating
> stack: ec2-rds-stack.
> 2024-02-07 02:21:06 - instances.aws_instance - INFO - 5484 - Now creating
> stack: ec2-static-service-stack.
> 2024-02-07 02:21:06 - engine - INFO - 5484 - First launch default Kylin
> Cluster.
> 2024-02-07 02:22:08 - clouds.aws - WARNING - 5484 - Current cluster is not
> ready.
> 2024-02-07 02:22:30 - instances.aws_instance - INFO - 5484 - Now creating
> stack: ec2-zookeeper-stack.
> 2024-02-07 02:23:43 - instances.aws_instance - INFO - 5484 - Current
> execute commands in `Zookeeper stack` which named ec2-zookeeper-stack.
> 2024-02-07 02:23:43 - instances.aws_instance - INFO - 5484 - Current
> instance id: i-0cbc37f83c9cda006 is executing commands: grep -Fq
> "" /home/ec2-user/hadoop/zookeeper/conf/zoo.cfg; echo
> $?.
> 2024-02-07 02:23:49 - instances.aws_instance - INFO - 5484 - Current
> instance id: i-0915d44c700e644dc is executing commands: grep -Fq
> "" /home/ec2-user/hadoop/zookeeper/conf/zoo.cfg; echo
> $?.
> 2024-02-07 02:23:54 - instances.aws_instance - INFO - 5484 - Current
> instance id: i-0fdbacc22ecae360a is executing commands: grep -Fq
> "" /home/ec2-user/hadoop/zookeeper/conf/zoo.cfg; echo
> $?.
> 2024-02-07 02:24:00 - instances.aws_instance - INFO - 5484 - Current
> instance id: i-0cbc37f83c9cda006 is executing commands: echo
> 'server.1=
> server.2=
> server.3=' >>
> /home/ec2-user/hadoop/zookeeper/conf/zoo.cfg.
> 2024-02-07 02:24:05 - instances.aws_instance - WARNING - 5484 -
> {'CommandId': '704b776f-e574-47ea-bf13-30d3be2e9df2', 'InstanceId':
> 'i-0cbc37f83c9cda006', 'Comment': '', 'DocumentName': 'AWS-RunShellScript',
> 'DocumentVersion': '$DEFAULT', 'PluginName': 'aws:runShellScript',
> 'ResponseCode': 1,
> 'ExecutionStartDateTime': '2024-02-06T16:24:00.394Z',
> 'ExecutionElapsedTime': 'PT0.008S', 'ExecutionEndDateTime':
> '2024-02-06T16:24:00.394Z', 'Status': 'Failed', 'StatusDetails': 'Failed',
> 'StandardOutputContent': '', 'StandardOutputUrl': '',
> 'StandardErrorContent':
> '/var/lib/amazon/ssm/i-0cbc37f83c9cda006/document/orchestration/704b776f-e574-47ea-bf13-30d3be2e9df2/awsrunShellScript/0.awsrunShellScript/_script.sh:
> line 3: /home/ec2-user/hadoop/zookeeper/conf/zoo.cfg: No such file or
> directory\nfailed to run commands: exit status 1', 'StandardErrorUrl': '',
> 'CloudWatchOutputConfig': {'CloudWatchLogGroupName': '',
> 'CloudWatchOutputEnabled': False}, 'ResponseMetadata': {'RequestId':
> '133ea7d8-d661-4ea0-960d-349b294dd8a9', 'HTTPStatusCode': 200,
> 'HTTPHeaders': {'server': 'Server', 'date': 'Tue, 06 Feb 2024 16:24:05
> GMT', 'content-type': 'application/x-amz-json-1.1', 'content-length':
> '848', 'connection': 'keep-alive', 'x-amzn-requestid':
> '133ea7d8-d661-4ea0-960d-349b294dd8a9'}, 'RetryAttempts': 0}}
> Traceback (most recent call last):
>   File "C:\myfiles\_clients\me\kylin\kylin4_on_cloud\deploy.py", line 141,
> in <module>
>     deploy_on_aws(args.type, args.kylin_mode, args.scale_type,
> args.node_type, args.cluster)
>   File "C:\myfiles\_clients\me\kylin\kylin4_on_cloud\deploy.py", line 63,
> in deploy_on_aws
>     aws_engine.launch_default_cluster()
>   File "C:\myfiles\_clients\me\kylin\kylin4_on_cloud\engine.py", line 38,
> in launch_default_cluster
>     self.engine_utils.launch_default_cluster()
>   File
> "C:\myfiles\_clients\me\kylin\kylin4_on_cloud\utils\engine_utils.py", line
> 101, in launch_default_cluster
>     cloud_addr = self.get_kylin_address()
>   File
> "C:\myfiles\_clients\me\kylin\kylin4_on_cloud\utils\engine_utils.py", line
> 217, in get_kylin_address
>     kylin_address = self.aws.get_kylin_address()
>   File "C:\myfiles\_clients\me\kylin\kylin4_on_cloud\clouds\aws.py", line
> 149, in get_kylin_address
>     kylin_resources = self.get_kylin_resources()
>   File "C:\myfiles\_clients\me\kylin\kylin4_on_cloud\clouds\aws.py", line
> 157, in get_kylin_resources
>     self.init_cluster()
>   File "C:\myfiles\_clients\me\kylin\kylin4_on_cloud\clouds\aws.py", line
> 137, in init_cluster
>     self.cloud_instance.after_create_zk_cluster()
>   File
> "C:\myfiles\_clients\me\kylin\kylin4_on_cloud\instances\aws_instance.py",
> line 551, in after_create_zk_cluster
>     self.after_create_zk_of_target_cluster()
>   File
> "C:\myfiles\_clients\me\kylin\kylin4_on_cloud\instances\aws_instance.py",
> line 569, in after_create_zk_of_target_cluster
>     self.refresh_zks_cfg(zk_ips=zk_ips, zk_ids=zk_ids)
>   File
> "C:\myfiles\_clients\me\kylin\kylin4_on_cloud\instances\aws_instance.py",
> line 582, in refresh_zks_cfg
>     self.exec_script_instance_and_return(name_or_id=zk_id,
> script=refresh_command)
>   File
> "C:\myfiles\_clients\me\kylin\kylin4_on_cloud\instances\aws_instance.py",
> line 1978, in exec_script_instance_and_return
>     assert output and output['Status'] == 'Success', \
> AssertionError: execute script failed, failed details message:
> {'CommandId': '704b776f-e574-47ea-bf13-30d3be2e9df2', 'InstanceId':
> 'i-0cbc37f83c9cda006', 'Comment': '', 'DocumentName': 'AWS-RunShellScript',
> 'DocumentVersion': '$DEFAULT', 'PluginName': 'aws:runShellScript',
> 'ResponseCode': 1, 'ExecutionStartDateTime': '2024-02-06T16:24:00.394Z',
> 'ExecutionElapsedTime': 'PT0.008S', 'ExecutionEndDateTime':
> '2024-02-06T16:24:00.394Z', 'Status': 'Failed', 'StatusDetails': 'Failed',
> 'StandardOutputContent': '', 'StandardOutputUrl': '',
> 'StandardErrorContent':
> '/var/lib/amazon/ssm/i-0cbc37f83c9cda006/document/orchestration/704b776f-e574-47ea-bf13-30d3be2e9df2/awsrunShellScript/0.awsrunShellScript/_script.sh:
> line 3: /home/ec2-user/hadoop/zookeeper/conf/zoo.cfg: No such file or
> directory\nfailed to run commands: exit status 1', 'StandardErrorUrl': '',
> 'CloudWatchOutputConfig':
> {'CloudWatchLogGroupName': '', 'CloudWatchOutputEnabled': False},
> 'ResponseMetadata': {'RequestId': '133ea7d8-d661-4ea0-960d-349b294dd8a9',
> 'HTTPStatusCode': 200, 'HTTPHeaders': {'server': 'Server', 'date': 'Tue, 06
> Feb 2024 16:24:05 GMT', 'content-type': 'application/x-amz-json-1.1',
> 'content-length': '848', 'connection': 'keep-alive', 'x-amzn-requestid':
> '133ea7d8-d661-4ea0-960d-349b294dd8a9'}, 'RetryAttempts': 0}}
> ==========================================================================
> I checked the zk EC2 instances and there is no /home/ec2-user/hadoop
> directory.
> I also noticed that the file "prepare-ec2-env-for-zk.sh" has been uploaded
> to the home directory on all zk ec2 nodes:
> ---------------------------------------------------------------------
> drwx------ 3 ec2-user ec2-user   107 Feb  5 17:58 .
> drwxr-xr-x 4 root     root        38 Feb  5 17:57 ..
> -rw-r--r-- 1 ec2-user ec2-user    18 Jul 27  2018 .bash_logout
> -rw-r--r-- 1 ec2-user ec2-user   193 Jul 27  2018 .bash_profile
> -rw-r--r-- 1 ec2-user ec2-user   231 Jul 27  2018 .bashrc
> drwx------ 2 ec2-user ec2-user    29 Feb  5 17:57 .ssh
> -rw-r--r-- 1 root     root     10119 Feb  4 23:40 prepare-ec2-env-for-zk.sh
> ---------------------------------------------------------------------
> I guess the deploy script hasn't been able to execute the
> "prepare-ec2-env-for-zk.sh" script on the three zk ec2 nodes? I'm not sure
> what else could be wrong.
> Can someone please tell me how to get the deployment working?
> Thanks.
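
The SSM result dictionary in the quoted log is long, and the part that matters
is the last real line of stderr. A small helper can pull out the instance,
status, exit code, and error. This is a minimal sketch: summarize_invocation is
a hypothetical name, and it assumes the dictionary has the shape returned by
boto3's get_command_invocation, as in the log above.

```python
def summarize_invocation(result):
    """Return a one-line summary of an SSM get_command_invocation result."""
    stderr = (result.get("StandardErrorContent") or "").strip()
    # Skip empty lines and SSM's generic trailer, keep the last real error line.
    lines = [
        line
        for line in stderr.splitlines()
        if line and not line.startswith("failed to run commands")
    ]
    last_error = lines[-1] if lines else "(no stderr)"
    return "{}: status={} exit={} error={}".format(
        result.get("InstanceId"),
        result.get("Status"),
        result.get("ResponseCode"),
        last_error,
    )


# Sample shaped like the failure in the log; the script path is shortened here.
sample = {
    "InstanceId": "i-0cbc37f83c9cda006",
    "Status": "Failed",
    "ResponseCode": 1,
    "StandardErrorContent": (
        "_script.sh: line 3: /home/ec2-user/hadoop/zookeeper/conf/zoo.cfg: "
        "No such file or directory\nfailed to run commands: exit status 1"
    ),
}
print(summarize_invocation(sample))
```

Read this way, the failure reduces to zoo.cfg not existing on the node, which
matches the missing /home/ec2-user/hadoop directory.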

