Thank you, Xiaoxiang, for your advice. As the title of my email suggests, I suspect that the OLAP functionality has not been set up correctly on my computer.
The evidence is this: when I uncheck the Pushdown option so that only the precomputed cube is used, the query fails with the following error:

LIMIT 500": No realization found for OLAPContext, MODEL_UNMATCHED_JOIN, rel#2240:KapTableScan.OLAP.[](table=[VNEVENT_HIVE_DWH_400MILLION_ROWS, FACTUSEREVENT],ctx=0@null,fields=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20])

Please kindly advise how to properly build the OLAP.

On Wed, Nov 1, 2023 at 10:40 AM Xiaoxiang Yu <[email protected]> wrote:

> Hi,
>
> Yesterday, I checked whether the query pushdown function works well in the
> Kylin5 docker, and all of my queries returned proper responses.
> After checking your logs from Shaofeng, I found these error messages
> repeated many times:
> 1. 'java.io.IOException: All datanodes DatanodeInfoWithStorage[127.0.0.1:9866,DS-5093899b-06c7-4386-95d5-6fc271d92b52,DISK] are bad. Aborting...'
> 2. 'curator.ConnectionState : Connection timed out for connection string (localhost:2181) and timeout (15000) / elapsed (41794) org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss'
>
> I guess the root cause is that the container did not have enough
> resources. I see that you queried a table called
> 'XXX_hive_dwh_400million_rows'; it looks like you ran a complex query on a
> table which contains 400 million rows?
>
> Since I am the uploader of the Kylin5 docker image, I want to give some
> explanation. The Kylin5 docker is not a place for performance benchmarks; it is
> only for demonstration. It is allocated very few resources (8G
> memory) if you are using the default command from the Docker Hub page. Before I
> uploaded my image, I only tested it with the ssb dataset, in which the
> biggest table contains only about 60k rows. If you are using a larger
> dataset and more complex queries, you have to scale the resources accordingly. Try
> querying tables which contain no more than 100k rows by default.
>
> Here are some tips which may help you check whether the daemon services
> are healthy and whether resources (particularly disk space) are configured
> properly.
>
> 1. Check the HDFS web UI (
> http://localhost:9870/dfshealth.html#tab-datanode ) to confirm that the
> HDFS service is in 'In service' status.
> 2. Check the DataNode log at
> `/opt/hadoop-3.2.1/logs/hadoop-root-datanode-Kylin5-Machine.log` for
> error messages, for example: cat
> /opt/hadoop-3.2.1/logs/hadoop-root-datanode-Kylin5-Machine.log | grep ERROR
> | wc -l
> 3. Check that your Docker engine is configured with enough disk
> space. If you are using Docker Desktop like me, please go to "Settings" -
> "Resources" - "Advanced" and make sure you have allocated 40GB+ of disk space to
> the docker container.
> 4. Check the available disk space of your container with `df -h`;
> make sure the 'Use%' of 'overlay' is less than 60%.
> 5. Check the load average, CPU usage, and JVM GC. Make sure these
> metrics are not very high when you send a query.
> ------------------------
> With warm regards,
> Xiaoxiang Yu
>
> On Tue, Oct 31, 2023 at 5:13 PM Nam Đỗ Duy <[email protected]> wrote:
>
>> Hi ShaoFeng
>>
>> Thank you very much for your valuable feedback.
>>
>> I can see the application there (if I am reading it right), as in the
>> attached photo. Kindly advise so that I can run this query on OLAP.
>>
>> PS. I sent you the log file in private.
>>
>> [image: image.png]
>>
>> On Tue, Oct 31, 2023 at 3:11 PM ShaoFeng Shi <[email protected]>
>> wrote:
>>
>>> Can you provide the messages in logs/kylin.log when executing the SQL?
>>> You can also check the Spark UI from the YARN resource manager (there
>>> should be one running application called Sparder, which is Kylin's backend
>>> Spark application). If the application is not there, it may indicate that
>>> YARN doesn't have the resources to start it.
>>>
>>> Best regards,
>>>
>>> Shaofeng Shi 史少锋
>>> Apache Kylin PMC,
>>> Apache Incubator PMC,
>>> Email: [email protected]
>>>
>>> Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
>>> Join Kylin user mail group: [email protected]
>>> Join Kylin dev mail group: [email protected]
>>>
>>> On Tue, Oct 31, 2023 at 10:35, Nam Đỗ Duy <[email protected]> wrote:
>>>
>>>> Dear Sir/Madam,
>>>>
>>>> I have a fact table with 500 million rows. I built a model and index
>>>> according to the website help.
>>>>
>>>> I chose full incremental because this is the first time I am loading the data.
>>>>
>>>> I created both index types, aggregate group index and table index, as in
>>>> the attached photo.
>>>>
>>>> But the query always fails after the timeout of 300 seconds (I run it in
>>>> docker). I don't want to increase the value of 300 seconds because I wish
>>>> the OLAP query could run within 1 minute (is that possible?).
>>>>
>>>> It seems that the OLAP indexing is not working to speed up the
>>>> query with the precomputed cube.
>>>>
>>>> Can you advise how to check whether the index really works?
>>>>
>>>> It is quite an urgent task for me, so a prompt response would be highly appreciated.
>>>>
>>>> Thank you very much
>>>>
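
For reference, two of the checks in Xiaoxiang's tips (counting ERROR lines in the DataNode log, and keeping the 'Use%' of 'overlay' below 60%) could be scripted. The following is only a minimal sketch in Python: the function names are hypothetical, the sample `df -h` text and log lines are made up for illustration, and nothing here is part of Kylin or Hadoop.

```python
from typing import Optional


def count_error_lines(log_text: str) -> int:
    """Count lines containing 'ERROR', similar to `grep ERROR | wc -l`."""
    return sum(1 for line in log_text.splitlines() if "ERROR" in line)


def overlay_use_percent(df_output: str) -> Optional[int]:
    """Return the Use% of the 'overlay' filesystem from `df -h` text, or None."""
    for line in df_output.splitlines():
        fields = line.split()
        if fields and fields[0] == "overlay":
            # The Use% column is the field that ends with a percent sign.
            for field in fields[1:]:
                if field.endswith("%"):
                    return int(field.rstrip("%"))
    return None


def disk_ok(df_output: str, limit: int = 60) -> bool:
    """True if the overlay filesystem is found and its Use% is below `limit`."""
    used = overlay_use_percent(df_output)
    return used is not None and used < limit


if __name__ == "__main__":
    # Made-up sample inputs; in the container, the text would come from
    # `df -h` and from the DataNode log file named in the tips.
    sample_df = (
        "Filesystem      Size  Used Avail Use% Mounted on\n"
        "overlay          59G   21G   36G  37% /\n"
        "tmpfs            64M     0   64M   0% /dev\n"
    )
    sample_log = "INFO started\nERROR disk failure\nWARN slow\nERROR timeout\n"
    print("overlay Use%:", overlay_use_percent(sample_df))
    print("disk ok:", disk_ok(sample_df))
    print("ERROR lines:", count_error_lines(sample_log))
```

The functions take text rather than running commands themselves, so they can be tried on saved output first; the 60% limit is the threshold from the tips and can be changed through the `limit` argument.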
