Dear ,

I hope all is well.

We are looking to use Apache Kylin instead of SSAS for our business analysis
dashboard product. We are facing a problem building the cube, which contains
two Hive tables: one fact table and one dimension table.

The fact table has 47271784 rows and a total size of 5326550430 bytes, as
reported by the SHOW TBLPROPERTIES query in the Hive command line.

The dimension table has 5261766 rows and a total size of 1174440814 bytes,
as reported by the SHOW TBLPROPERTIES query in the Hive command line.
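
For readability, the sizes above can be converted to GiB and to an average row size. This is only a rough sketch, assuming the totalSize value reported by Hive is in bytes; the variable names are illustrative.

```python
# Row counts and sizes copied from the SHOW TBLPROPERTIES output quoted above.
# Assumption: the reported total size is in bytes.
tables = {
    "fact": {"rows": 47271784, "size_bytes": 5326550430},
    "dimension": {"rows": 5261766, "size_bytes": 1174440814},
}

GIB = 1024 ** 3

summary = {}
for name, stats in tables.items():
    size_gib = stats["size_bytes"] / GIB
    bytes_per_row = stats["size_bytes"] / stats["rows"]
    summary[name] = (size_gib, bytes_per_row)
    print(f"{name}: {size_gib:.2f} GiB, about {bytes_per_row:.0f} bytes per row")
```

Under that assumption, the fact table is just under 5 GiB and the dimension table is a little over 1 GiB.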




The build process failed at step 3:
#3 Step Name: Extract Fact Table Distinct Columns
Data Size: 16.19 KB
Duration: 11.78 mins Waiting: 13 seconds


The logs show a Java heap space error, as follows:

org.apache.kylin.engine.mr.exception.MapReduceException: Counters: 55
File System Counters
FILE: Number of bytes read=323698
FILE: Number of bytes written=29783830
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=252673677
HDFS: Number of bytes written=16576
HDFS: Number of read operations=195
HDFS: Number of large read operations=0
HDFS: Number of write operations=3
Job Counters
Failed reduce tasks=4
Launched map tasks=47
Launched reduce tasks=5
Data-local map tasks=47
Total time spent by all maps in occupied slots (ms)=4363352
Total time spent by all reduces in occupied slots (ms)=2032100
Total time spent by all map tasks (ms)=1090838
Total time spent by all reduce tasks (ms)=508025
Total vcore-milliseconds taken by all map tasks=1090838
Total vcore-milliseconds taken by all reduce tasks=508025
Total megabyte-milliseconds taken by all map tasks=1117018112
Total megabyte-milliseconds taken by all reduce tasks=520217600
Map-Reduce Framework
Map input records=47271784
Map output records=5261813
Map output bytes=57539075
Map output materialized bytes=15536194
Input split bytes=138932
Combine input records=5261813
Combine output records=5261813
Reduce input groups=1
Reduce shuffle bytes=340412
Reduce input records=47
Reduce output records=0
Spilled Records=5261860
Shuffled Maps =47
Failed Shuffles=0
Merged Map outputs=47
GC time elapsed (ms)=68095
CPU time spent (ms)=1246430
Physical memory (bytes) snapshot=44485660672
Virtual memory (bytes) snapshot=137661587456
Total committed heap usage (bytes)=41749577728
Peak Map Physical memory (bytes)=960831488
Peak Map Virtual memory (bytes)=2891886592
Peak Reduce Physical memory (bytes)=305377280
Peak Reduce Virtual memory (bytes)=2667810816
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=0
org.apache.kylin.engine.mr.steps.FactDistinctColumnsMapper$RawDataCounter
BYTES=1563833108
Job Diagnostics:Task failed task_1610370996803_0012_r_000000
Job failed as tasks failed. failedMaps:0 failedReduces:1 killedMaps:0 killedReduces: 0

Failure task Diagnostics:
Error: Java heap space

at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:234)
at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
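
As a rough cross-check, the counters in the log above can be used to estimate how much memory each task container was allocated. This is only a sketch, assuming the "megabyte-milliseconds" counter equals the allocated container memory in MB multiplied by the task run time in ms; the function and variable names are illustrative.

```python
# Counter values copied from the job log above.
MAP_MB_MS = 1117018112       # Total megabyte-milliseconds taken by all map tasks
MAP_TASK_MS = 1090838        # Total time spent by all map tasks (ms)
REDUCE_MB_MS = 520217600     # Total megabyte-milliseconds taken by all reduce tasks
REDUCE_TASK_MS = 508025      # Total time spent by all reduce tasks (ms)
PEAK_REDUCE_PHYSICAL_BYTES = 305377280  # Peak Reduce Physical memory (bytes)


def container_mb(mb_ms: int, task_ms: int) -> float:
    """Estimate allocated MB per container (assumes MB-ms = MB * run time)."""
    return mb_ms / task_ms


map_container_mb = container_mb(MAP_MB_MS, MAP_TASK_MS)
reduce_container_mb = container_mb(REDUCE_MB_MS, REDUCE_TASK_MS)
peak_reduce_mib = PEAK_REDUCE_PHYSICAL_BYTES / (1024 * 1024)

print(f"map container:    {map_container_mb:.0f} MB")
print(f"reduce container: {reduce_container_mb:.0f} MB")
print(f"peak reduce physical memory: {peak_reduce_mib:.0f} MiB")
```

If that assumption holds, both the map and the reduce containers were allocated about 1024 MB each, and the failed reduce task ran inside a container of that size.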


As recommended, I tried increasing the memory allocated to Kylin to 17 GB in
the setenv.sh file, as follows:

export KYLIN_JVM_SETTINGS="-Xms17g -Xmx17g -Xss1024K -XX:MaxPermSize=1g 
-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps 
-Xloggc:$KYLIN_HOME/logs/kylin.gc.%p -XX:+UseGCLogFileRotation 
-XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=64M"

However, the build still fails with the same error.

I am using Kylin v3.1.1 on HDP 3.0. The server has 32 GB of RAM and a 4-core
i7 CPU.

Please let me know if you need any more information from my side. We would
appreciate guidance on where the problem lies, how to solve it, and which
settings you recommend.

A quick response would be highly appreciated. We need to know how reliable
Kylin is and what level of support it provides.

Best regards,

Ahmad Hammad
Chief Technology Officer
Website: http://beyegroup.com/
Mobile: 962 79640 1490
email:[email protected]
