I followed the troubleshooting instructions at:
https://github.com/apache/kylin/blob/kylin4_on_cloud/readme/trouble_shooting.md#kylin-can-not-access-and-exception-session-0x0-for-server-null-unexpected-error-closing-socket-connection-and-attempting-reconnect-is-in-kylinlog

When I log into the zookeeper instances, the .bash_profile file looks like an
untouched default and contains no reference to $ZOOKEEPER_HOME:

[root@ip-172-27-32-153 ec2-user]# cat .bash_profile
------------------------------
# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi

# User specific environment and startup programs

PATH=$PATH:$HOME/.local/bin:$HOME/bin

export PATH
-------------------------------
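
For comparison, here is the kind of thing I expected the prepare script to
have appended to the profile. The variable names and paths below are my own
assumption, based only on the /home/ec2-user/hadoop/zookeeper path that
appears in the deploy log further down:

------------------------------
# Hypothetical entries I expected prepare-ec2-env-for-zk.sh to add
# (paths assumed from the zoo.cfg location in the deploy log below)
export ZOOKEEPER_HOME=/home/ec2-user/hadoop/zookeeper
export PATH=$PATH:$ZOOKEEPER_HOME/bin
------------------------------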

So it looks like the deployment never configured the zookeeper instances at
all. Does anyone know how to fix this?
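
The only workaround I can think of is to run the uploaded prepare script by
hand on each zookeeper node, roughly as sketched below. I have not verified
whether the script expects arguments or has other prerequisites, so treat
this as a guess:

------------------------------
# Untested guess: run the uploaded prepare script manually on each zk node.
# Script path is from the ls listing in my first mail; arguments unknown.
sudo bash /home/ec2-user/prepare-ec2-env-for-zk.sh
------------------------------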

Any help appreciated.

On Wed, 7 Feb 2024 at 04:33, John W <mag3...@gmail.com> wrote:

> Hi, I'm having problems deploying the kylin4_on_cloud project located at:
> https://github.com/apache/kylin/tree/kylin4_on_cloud
>
> I've also been following the instructions here:
> https://www.youtube.com/watch?v=5kKXEMjO1Sc&ab_channel=Kyligence
>
> I used Windows to git clone the repo and set up the venv with the latest
> versions of the required packages:
> pip install PyYAML
> pip install boto3
> pip install botocore
> pip install pyparsing
> pip install requests
> pip install retrying
> pip install Jinja2
> pip install pytest-shutil
> :
> I also changed RDSEngineVersion to 8.0.35 in kylin_configs.yaml, as the
> default repo version (5.7.25) was giving me the error "Exception: Current
> stack: ec2-rds-stack is create failed, please check".
>
> Here's the log with the error I am now getting:
>
> ==========================================================================
> (venv) C:\projects\kylin4_on_cloud>python deploy.py --type deploy --mode
> job
> 2024-02-07 02:13:54 - botocore.credentials - INFO - 5484 - Found
> credentials in shared credentials file: ~/.aws/credentials
> 2024-02-07 02:13:57 - engine - INFO - 5484 - Env already inited, skip init
> again.
> 2024-02-07 02:13:58 - clouds.aws - WARNING - 5484 - Current env for
> deploying a cluster is not ready.
> 2024-02-07 02:14:20 - instances.aws_instance - INFO - 5484 - Now creating
> stack: ec2-or-emr-vpc-stack.
> 2024-02-07 02:16:42 - instances.aws_instance - INFO - 5484 - Now creating
> stack: ec2-rds-stack.
> 2024-02-07 02:21:06 - instances.aws_instance - INFO - 5484 - Now creating
> stack: ec2-static-service-stack.
> 2024-02-07 02:21:06 - engine - INFO - 5484 - First launch default Kylin
> Cluster.
> 2024-02-07 02:22:08 - clouds.aws - WARNING - 5484 - Current cluster is not
> ready.
> 2024-02-07 02:22:30 - instances.aws_instance - INFO - 5484 - Now creating
> stack: ec2-zookeeper-stack.
> 2024-02-07 02:23:43 - instances.aws_instance - INFO - 5484 - Current
> execute commands in `Zookeeper stack` which named ec2-zookeeper-stack.
> 2024-02-07 02:23:43 - instances.aws_instance - INFO - 5484 - Current
> instance id: i-0cbc37f83c9cda006 is executing commands: grep -Fq
> "10.1.0.133:2888:3888" /home/ec2-user/hadoop/zookeeper/conf/zoo.cfg; echo
> $?.
> 2024-02-07 02:23:49 - instances.aws_instance - INFO - 5484 - Current
> instance id: i-0915d44c700e644dc is executing commands: grep -Fq
> "10.1.0.129:2888:3888" /home/ec2-user/hadoop/zookeeper/conf/zoo.cfg; echo
> $?.
> 2024-02-07 02:23:54 - instances.aws_instance - INFO - 5484 - Current
> instance id: i-0fdbacc22ecae360a is executing commands: grep -Fq
> "10.1.0.58:2888:3888" /home/ec2-user/hadoop/zookeeper/conf/zoo.cfg; echo
> $?.
> 2024-02-07 02:24:00 - instances.aws_instance - INFO - 5484 - Current
> instance id: i-0cbc37f83c9cda006 is executing commands: echo
> 'server.1=10.1.0.133:2888:3888
> server.2=10.1.0.129:2888:3888
> server.3=10.1.0.58:2888:3888' >>
> /home/ec2-user/hadoop/zookeeper/conf/zoo.cfg.
> 2024-02-07 02:24:05 - instances.aws_instance - WARNING - 5484 -
> {'CommandId': '704b776f-e574-47ea-bf13-30d3be2e9df2', 'InstanceId':
> 'i-0cbc37f83c9cda006', 'Comment': '', 'DocumentName': 'AWS-RunShellScript',
> 'DocumentVersion': '$DEFAULT', 'PluginName': 'aws:runShellScript',
> 'ResponseCode': 1,
> 'ExecutionStartDateTime': '2024-02-06T16:24:00.394Z',
> 'ExecutionElapsedTime': 'PT0.008S', 'ExecutionEndDateTime':
> '2024-02-06T16:24:00.394Z', 'Status': 'Failed', 'StatusDetails': 'Failed',
> 'StandardOutputContent': '', 'StandardOutputUrl': '',
> 'StandardErrorContent':
> '/var/lib/amazon/ssm/i-0cbc37f83c9cda006/document/orchestration/704b776f-e574-47ea-bf13-30d3be2e9df2/awsrunShellScript/0.awsrunShellScript/_script.sh:
> line 3: /home/ec2-user/hadoop/zookeeper/conf/zoo.cfg: No such file or
> directory\nfailed to run commands: exit status 1', 'StandardErrorUrl': '',
> 'CloudWatchOutputConfig': {'CloudWatchLogGroupName': '',
> 'CloudWatchOutputEnabled': False}, 'ResponseMetadata': {'RequestId':
> '133ea7d8-d661-4ea0-960d-349b294dd8a9', 'HTTPStatusCode': 200,
> 'HTTPHeaders': {'server': 'Server', 'date': 'Tue, 06 Feb 2024 16:24:05
> GMT', 'content-type': 'application/x-amz-json-1.1', 'content-length':
> '848', 'connection': 'keep-alive', 'x-amzn-requestid':
> '133ea7d8-d661-4ea0-960d-349b294dd8a9'}, 'RetryAttempts': 0}}
> Traceback (most recent call last):
>   File "C:\myfiles\_clients\me\kylin\kylin4_on_cloud\deploy.py", line 141,
> in <module>
>     deploy_on_aws(args.type, args.kylin_mode, args.scale_type,
> args.node_type, args.cluster)
>   File "C:\myfiles\_clients\me\kylin\kylin4_on_cloud\deploy.py", line 63,
> in deploy_on_aws
>     aws_engine.launch_default_cluster()
>   File "C:\myfiles\_clients\me\kylin\kylin4_on_cloud\engine.py", line 38,
> in launch_default_cluster
>     self.engine_utils.launch_default_cluster()
>   File
> "C:\myfiles\_clients\me\kylin\kylin4_on_cloud\utils\engine_utils.py", line
> 101, in launch_default_cluster
>     cloud_addr = self.get_kylin_address()
>   File
> "C:\myfiles\_clients\me\kylin\kylin4_on_cloud\utils\engine_utils.py", line
> 217, in get_kylin_address
>     kylin_address = self.aws.get_kylin_address()
>   File "C:\myfiles\_clients\me\kylin\kylin4_on_cloud\clouds\aws.py", line
> 149, in get_kylin_address
>     kylin_resources = self.get_kylin_resources()
>   File "C:\myfiles\_clients\me\kylin\kylin4_on_cloud\clouds\aws.py", line
> 157, in get_kylin_resources
>     self.init_cluster()
>   File "C:\myfiles\_clients\me\kylin\kylin4_on_cloud\clouds\aws.py", line
> 137, in init_cluster
>     self.cloud_instance.after_create_zk_cluster()
>   File
> "C:\myfiles\_clients\me\kylin\kylin4_on_cloud\instances\aws_instance.py",
> line 551, in after_create_zk_cluster
>     self.after_create_zk_of_target_cluster()
>   File
> "C:\myfiles\_clients\me\kylin\kylin4_on_cloud\instances\aws_instance.py",
> line 569, in after_create_zk_of_target_cluster
>     self.refresh_zks_cfg(zk_ips=zk_ips, zk_ids=zk_ids)
>   File
> "C:\myfiles\_clients\me\kylin\kylin4_on_cloud\instances\aws_instance.py",
> line 582, in refresh_zks_cfg
>     self.exec_script_instance_and_return(name_or_id=zk_id,
> script=refresh_command)
>   File
> "C:\myfiles\_clients\me\kylin\kylin4_on_cloud\instances\aws_instance.py",
> line 1978, in exec_script_instance_and_return
>     assert output and output['Status'] == 'Success', \
> AssertionError: execute script failed, failed details message:
> {'CommandId': '704b776f-e574-47ea-bf13-30d3be2e9df2', 'InstanceId':
> 'i-0cbc37f83c9cda006', 'Comment': '', 'DocumentName': 'AWS-RunShellScript',
> 'DocumentVersion': '$DEFAULT', 'PluginName': 'aws:runShellScript',
> 'ResponseCode': 1, 'ExecutionStartDateTime': '2024-02-06T16:24:00.394Z',
> 'ExecutionElapsedTime': 'PT0.008S', 'ExecutionEndDateTime':
> '2024-02-06T16:24:00.394Z', 'Status': 'Failed', 'StatusDetails': 'Failed',
> 'StandardOutputContent': '', 'StandardOutputUrl': '',
> 'StandardErrorContent':
> '/var/lib/amazon/ssm/i-0cbc37f83c9cda006/document/orchestration/704b776f-e574-47ea-bf13-30d3be2e9df2/awsrunShellScript/0.awsrunShellScript/_script.sh:
> line 3: /home/ec2-user/hadoop/zookeeper/conf/zoo.cfg: No such file or
> directory\nfailed to run commands: exit status 1', 'StandardErrorUrl': '',
> 'CloudWatchOutputConfig':
> {'CloudWatchLogGroupName': '', 'CloudWatchOutputEnabled': False},
> 'ResponseMetadata': {'RequestId': '133ea7d8-d661-4ea0-960d-349b294dd8a9',
> 'HTTPStatusCode': 200, 'HTTPHeaders': {'server': 'Server', 'date': 'Tue, 06
> Feb 2024 16:24:05 GMT', 'content-type': 'application/x-amz-json-1.1',
> 'content-length': '848', 'connection': 'keep-alive', 'x-amzn-requestid':
> '133ea7d8-d661-4ea0-960d-349b294dd8a9'}, 'RetryAttempts': 0}}
> ==========================================================================
>
> I checked on the zk ec2 instances and there is no /home/ec2-user/hadoop
> directory.
>
> I also noticed that the file "prepare-ec2-env-for-zk.sh" has been uploaded
> to the home directory on all zk ec2 nodes:
> ---------------------------------------------------------------------
> drwx------ 3 ec2-user ec2-user   107 Feb  5 17:58 .
> drwxr-xr-x 4 root     root        38 Feb  5 17:57 ..
> -rw-r--r-- 1 ec2-user ec2-user    18 Jul 27  2018 .bash_logout
> -rw-r--r-- 1 ec2-user ec2-user   193 Jul 27  2018 .bash_profile
> -rw-r--r-- 1 ec2-user ec2-user   231 Jul 27  2018 .bashrc
> drwx------ 2 ec2-user ec2-user    29 Feb  5 17:57 .ssh
> -rw-r--r-- 1 root     root     10119 Feb  4 23:40 prepare-ec2-env-for-zk.sh
> ---------------------------------------------------------------------
>
> My guess is that the deploy script was never able to execute the
> "prepare-ec2-env-for-zk.sh" script on the three zk ec2 nodes, but I'm not
> sure what else could be wrong.
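>
> One thing I plan to try next, in case it helps narrow this down: check
> whether anything was ever dispatched to the zk nodes before the failing
> zoo.cfg command, via SSM or via EC2 user data (I'm assuming one of those
> two is how prepare-ec2-env-for-zk.sh is supposed to be run):
>
> ------------------------------
> # Untested diagnostics; instance id taken from the deploy log above.
> # 1) SSM command history for one zk node, run from my workstation:
> aws ssm list-command-invocations --instance-id i-0cbc37f83c9cda006 --details
> # 2) cloud-init / user-data output, run on the zk node itself:
> sudo cat /var/log/cloud-init-output.log
> ------------------------------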
>
> Can someone please tell me how to get the deployment working?
>
> Thanks.
>
