Dear ,

I hope all is well.

We are looking to use Apache Kylin instead of SSAS for our business analysis dashboard product, and we are facing a problem building a cube. The cube contains two Hive tables, one fact table and one dimension table. The figures below are taken from the SHOW TBLPROPERTIES query in the Hive command line:

    Fact table:      total number of rows 47271784, total size 5326550430
    Dimension table: total number of rows 5261766,  total size 1174440814

The build process fails at step 3:

    #3 Step Name: Extract Fact Table Distinct Columns
    Data Size: 16.19 KB
    Duration: 11.78 mins
    Waiting: 13 seconds

The log reports a Java heap space error, as follows:

    org.apache.kylin.engine.mr.exception.MapReduceException: Counters: 55
        File System Counters
            FILE: Number of bytes read=323698
            FILE: Number of bytes written=29783830
            FILE: Number of read operations=0
            FILE: Number of large read operations=0
            FILE: Number of write operations=0
            HDFS: Number of bytes read=252673677
            HDFS: Number of bytes written=16576
            HDFS: Number of read operations=195
            HDFS: Number of large read operations=0
            HDFS: Number of write operations=3
        Job Counters
            Failed reduce tasks=4
            Launched map tasks=47
            Launched reduce tasks=5
            Data-local map tasks=47
            Total time spent by all maps in occupied slots (ms)=4363352
            Total time spent by all reduces in occupied slots (ms)=2032100
            Total time spent by all map tasks (ms)=1090838
            Total time spent by all reduce tasks (ms)=508025
            Total vcore-milliseconds taken by all map tasks=1090838
            Total vcore-milliseconds taken by all reduce tasks=508025
            Total megabyte-milliseconds taken by all map tasks=1117018112
            Total megabyte-milliseconds taken by all reduce tasks=520217600
        Map-Reduce Framework
            Map input records=47271784
            Map output records=5261813
            Map output bytes=57539075
            Map output materialized bytes=15536194
            Input split bytes=138932
            Combine input records=5261813
            Combine output records=5261813
            Reduce input groups=1
            Reduce shuffle bytes=340412
            Reduce input records=47
            Reduce output records=0
            Spilled Records=5261860
            Shuffled Maps =47
            Failed Shuffles=0
            Merged Map outputs=47
            GC time elapsed (ms)=68095
            CPU time spent (ms)=1246430
            Physical memory (bytes) snapshot=44485660672
            Virtual memory (bytes) snapshot=137661587456
            Total committed heap usage (bytes)=41749577728
            Peak Map Physical memory (bytes)=960831488
            Peak Map Virtual memory (bytes)=2891886592
            Peak Reduce Physical memory (bytes)=305377280
            Peak Reduce Virtual memory (bytes)=2667810816
        Shuffle Errors
            BAD_ID=0
            CONNECTION=0
            IO_ERROR=0
            WRONG_LENGTH=0
            WRONG_MAP=0
            WRONG_REDUCE=0
        File Input Format Counters
            Bytes Read=0
        File Output Format Counters
            Bytes Written=0
        org.apache.kylin.engine.mr.steps.FactDistinctColumnsMapper$RawDataCounter
            BYTES=1563833108
    Job Diagnostics:Task failed task_1610370996803_0012_r_000000
    Job failed as tasks failed. failedMaps:0 failedReduces:1 killedMaps:0 killedReduces: 0
    Failure task Diagnostics:
    Error: Java heap space
        at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:234)
        at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
        at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
        at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
        at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

I tried increasing the memory allocated to Kylin to 17 GB in the setenv.sh file, as recommended, with the following setting:

    export KYLIN_JVM_SETTINGS="-Xms17g -Xmx17g -Xss1024K -XX:MaxPermSize=1g -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:$KYLIN_HOME/logs/kylin.gc.%p -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=64M"

but the build still fails with the same error. I am using Kylin v3.1.1 on HDP 3.0. The server has 32 GB of RAM and a 4-core i7 CPU.

Please let me know if you need any more information from my side. We would appreciate guidance on where the problem lies, how to solve it, and which settings you recommend. A quick response would be highly appreciated, as we need to know how reliable Kylin is and what level of support it provides.

Best regards,
Ahmad Hammad
Chief Technology Officer
Website: http://beyegroup.com/
Mobile: 962 79640 1490
Email: [email protected]
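
As a reading aid for the job counters in the log above, here is a minimal Python sketch of one piece of arithmetic they allow. It assumes that the "megabyte-milliseconds" counters are the memory allocated to each task container multiplied by the task's run time, so that dividing them by the matching "Total time spent by all ... tasks (ms)" counters gives the average memory allocated per task. The function name allocated_mb_per_task is hypothetical and only used for illustration. The numbers are copied from the log. Whether the resulting container size is what causes the heap space error is not something the log confirms.

```python
# Counter values copied verbatim from the failed job's log.
MAP_MB_MS = 1117018112      # Total megabyte-milliseconds taken by all map tasks
MAP_TASK_MS = 1090838       # Total time spent by all map tasks (ms)
REDUCE_MB_MS = 520217600    # Total megabyte-milliseconds taken by all reduce tasks
REDUCE_TASK_MS = 508025     # Total time spent by all reduce tasks (ms)


def allocated_mb_per_task(mb_ms: int, task_ms: int) -> float:
    """Average memory (MB) allocated per task.

    Assumption: megabyte-milliseconds = allocated container MB * run time in ms,
    summed over all tasks of that kind.
    """
    return mb_ms / task_ms


map_mb = allocated_mb_per_task(MAP_MB_MS, MAP_TASK_MS)
reduce_mb = allocated_mb_per_task(REDUCE_MB_MS, REDUCE_TASK_MS)

print(f"map tasks:    {map_mb:.0f} MB allocated per task")
print(f"reduce tasks: {reduce_mb:.0f} MB allocated per task")
```

Under that assumption both values come out at 1024 MB, which is far below the 17 GB set through KYLIN_JVM_SETTINGS. This may indicate that the failing reduce task runs with its own, separately configured memory limit, but that is an inference from the counters and not something I have verified.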
