Hi, my environment: hadoop3.1.2 kylin:2.6.1
????????????#3 step ?? Extract Fact Table Distinct Columns) kylin.log ??????????????~?? ??????????????:

2019-03-08 11:15:08,660 INFO [Scheduler 190936888 Job edba142b-b9ec-2379-89f6-2a34511d0425-177] spark.SparkExecutable:262 : cmd: export HADOOP_CONF_DIR=/home/apache/hbase/conf && /home/apache/kylin/spark/bin/spark-submit --class org.apache.kylin.common.util.SparkEntry --conf spark.executor.instances=40 --conf spark.yarn.archive=hdfs://master:9820/kylin/spark/spark-libs.jar --conf spark.network.timeout=600 --conf spark.yarn.queue=default --conf spark.history.fs.logDirectory=hdfs:///kylin/spark-history --conf spark.io.compression.codec=org.apache.spark.io.SnappyCompressionCodec --conf spark.dynamicAllocation.enabled=true --conf spark.master=yarn --conf spark.storage.memoryFraction=0.5 --conf spark.dynamicAllocation.executorIdleTimeout=300 --conf spark.hadoop.yarn.timeline-service.enabled=false --conf spark.executor.memory=4G --conf spark.eventLog.enabled=true --conf spark.eventLog.dir=hdfs:///kylin/spark-history --conf spark.dynamicAllocation.minExecutors=1 --conf spark.executor.cores=2 --conf spark.yarn.executor.memoryOverhead=1024 --conf spark.hadoop.dfs.replication=2 --conf spark.dynamicAllocation.maxExecutors=1000 --conf spark.executor.memoryOverhead=1024 --conf spark.driver.memory=2G --conf spark.driver.memoryOverhead=256 --conf spark.submit.deployMode=cluster --conf spark.shuffle.service.enabled=true --jars /home/apache/kylin/lib/kylin-job-2.6.1.jar /home/apache/kylin/lib/kylin-job-2.6.1.jar -className org.apache.kylin.engine.spark.SparkFactDistinct -counterOutput hdfs://kylincluster/kylin/kylin_metadata/kylin-edba142b-b9ec-2379-89f6-2a34511d0425/spark_cube_low/counter -statisticssamplingpercent 100 -cubename spark_cube_low -hiveTable default.kylin_intermediate_spark_cube_low_63876e8b_97c8_074d_0356_3f779a6ca1b6 -output hdfs://kylincluster/kylin/kylin_metadata/kylin-edba142b-b9ec-2379-89f6-2a34511d0425/spark_cube_low/fact_distinct_columns -input hdfs://kylincluster/kylin/kylin_metadata/kylin-edba142b-b9ec-2379-89f6-2a34511d0425/kylin_intermediate_spark_cube_low_63876e8b_97c8_074d_0356_3f779a6ca1b6 -segmentId 63876e8b-97c8-074d-0356-3f779a6ca1b6 -metaUrl kylin_metadata@hdfs,path=hdfs://kylincluster/kylin/kylin_metadata/kylin-edba142b-b9ec-2379-89f6-2a34511d0425/spark_cube_low/metadata
2019-03-08 11:15:09,423 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:09 WARN SparkConf:66 - The configuration key 'spark.yarn.executor.memoryOverhead' has been deprecated as of Spark 2.3 and may be removed in the future. Please use the new key 'spark.executor.memoryOverhead' instead.
2019-03-08 11:15:10,478 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:10 INFO Client:54 - Requesting a new application from cluster with 2 NodeManagers
2019-03-08 11:15:10,541 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:10 INFO Client:54 - Verifying our application has not requested more than the maximum memory capability of the cluster (8000 MB per container)
2019-03-08 11:15:10,542 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:10 INFO Client:54 - Will allocate AM container, with 2304 MB memory including 256 MB overhead
2019-03-08 11:15:10,542 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:10 INFO Client:54 - Setting up container launch context for our AM
2019-03-08 11:15:10,544 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:10 INFO Client:54 - Setting up the launch environment for our AM container
2019-03-08 11:15:10,552 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:10 INFO Client:54 - Preparing resources for our AM container
2019-03-08 11:15:11,553 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:11 INFO Client:54 - Uploading resource hdfs://master:9820/kylin/spark/spark-libs.jar -> hdfs://kylincluster/user/root/.sparkStaging/application_1552042715496_0010/spark-libs.jar
2019-03-08 11:15:13,978 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:13 INFO Client:54 - Uploading resource file:/home/apache/kylin/lib/kylin-job-2.6.1.jar -> hdfs://kylincluster/user/root/.sparkStaging/application_1552042715496_0010/kylin-job-2.6.1.jar
2019-03-08 11:15:14,485 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:14 WARN Client:66 - Same name resource file:///home/apache/kylin/lib/kylin-job-2.6.1.jar added multiple times to distributed cache
2019-03-08 11:15:14,594 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:14 INFO Client:54 - Uploading resource file:/tmp/spark-1e734412-567c-4c59-a6c6-9c58c83355db/__spark_conf__6590808330343993974.zip -> hdfs://kylincluster/user/root/.sparkStaging/application_1552042715496_0010/__spark_conf__.zip
2019-03-08 11:15:14,796 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:14 INFO SecurityManager:54 - Changing view acls to: root
2019-03-08 11:15:14,797 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:14 INFO SecurityManager:54 - Changing modify acls to: root
2019-03-08 11:15:14,798 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:14 INFO SecurityManager:54 - Changing view acls groups to:
2019-03-08 11:15:14,798 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:14 INFO SecurityManager:54 - Changing modify acls groups to:
2019-03-08 11:15:14,800 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:14 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
2019-03-08 11:15:14,811 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:14 INFO Client:54 - Submitting application application_1552042715496_0010 to ResourceManager
2019-03-08 11:15:15,038 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:15 INFO YarnClientImpl:273 - Submitted application application_1552042715496_0010
2019-03-08 11:15:16,042 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:16 INFO Client:54 - Application report for application_1552042715496_0010 (state: ACCEPTED)
2019-03-08 11:15:16,046 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:16 INFO Client:54 -
2019-03-08 11:15:16,046 INFO [pool-16-thread-1] spark.SparkExecutable:38 : client token: N/A
2019-03-08 11:15:16,046 INFO [pool-16-thread-1] spark.SparkExecutable:38 : diagnostics: AM container is launched, waiting for AM container to Register with RM
2019-03-08 11:15:16,046 INFO [pool-16-thread-1] spark.SparkExecutable:38 : ApplicationMaster host: N/A
2019-03-08 11:15:16,046 INFO [pool-16-thread-1] spark.SparkExecutable:38 : ApplicationMaster RPC port: -1
2019-03-08 11:15:16,046 INFO [pool-16-thread-1] spark.SparkExecutable:38 : queue: default
2019-03-08 11:15:16,047 INFO [pool-16-thread-1] spark.SparkExecutable:38 : start time: 1552043714819
2019-03-08 11:15:16,047 INFO [pool-16-thread-1] spark.SparkExecutable:38 : final status: UNDEFINED
2019-03-08 11:15:16,047 INFO [pool-16-thread-1] spark.SparkExecutable:38 : tracking URL: http://master:8088/proxy/application_1552042715496_0010/
2019-03-08 11:15:16,050 INFO [pool-16-thread-1] spark.SparkExecutable:38 : user: root
2019-03-08 11:15:17,048 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:17 INFO Client:54 - Application report for application_1552042715496_0010 (state: ACCEPTED)
2019-03-08 11:15:18,050 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:18 INFO Client:54 - Application report for application_1552042715496_0010 (state: ACCEPTED)
2019-03-08 11:15:19,052 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:19 INFO Client:54 - Application report for application_1552042715496_0010 (state: ACCEPTED)
2019-03-08 11:15:20,054 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:20 INFO Client:54 - Application report for application_1552042715496_0010 (state: ACCEPTED)
2019-03-08 11:15:21,056 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:21 INFO Client:54 - Application report for application_1552042715496_0010 (state: ACCEPTED)
2019-03-08 11:15:22,058 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:22 INFO Client:54 - Application report for application_1552042715496_0010 (state: RUNNING)
2019-03-08 11:15:22,059 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:22 INFO Client:54 -
2019-03-08 11:15:22,059 INFO [pool-16-thread-1] spark.SparkExecutable:38 : client token: N/A
2019-03-08 11:15:22,059 INFO [pool-16-thread-1] spark.SparkExecutable:38 : diagnostics: N/A
2019-03-08 11:15:22,059 INFO [pool-16-thread-1] spark.SparkExecutable:38 : ApplicationMaster host: 192.168.0.110
2019-03-08 11:15:22,059 INFO [pool-16-thread-1] spark.SparkExecutable:38 : ApplicationMaster RPC port: 0
2019-03-08 11:15:22,059 INFO [pool-16-thread-1] spark.SparkExecutable:38 : queue: default
2019-03-08 11:15:22,059 INFO [pool-16-thread-1] spark.SparkExecutable:38 : start time: 1552043714819
2019-03-08 11:15:22,059 INFO [pool-16-thread-1] spark.SparkExecutable:38 : final status: UNDEFINED
2019-03-08 11:15:22,059 INFO [pool-16-thread-1] spark.SparkExecutable:38 : tracking URL: http://master:8088/proxy/application_1552042715496_0010/
2019-03-08 11:15:22,063 INFO [pool-16-thread-1] spark.SparkExecutable:38 : user: root
2019-03-08 11:15:23,061 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:23 INFO Client:54 - Application report for application_1552042715496_0010 (state: RUNNING)
2019-03-08 11:15:24,062 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:24 INFO Client:54 - Application report for application_1552042715496_0010 (state: RUNNING)
2019-03-08 11:15:24,647 INFO [FetcherRunner 1227064704-51] threadpool.DefaultFetcherRunner:94 : Job Fetcher: 1 should running, 1 actual running, 1 stopped, 0 ready, 22 already succeed, 29 error, 0 discarded, 0 others
2019-03-08 11:15:25,065 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:25 INFO Client:54 - Application report for application_1552042715496_0010 (state: RUNNING)
2019-03-08 11:15:26,067 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:26 INFO Client:54 - Application report for application_1552042715496_0010 (state: RUNNING)
2019-03-08 11:15:27,069 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:27 INFO Client:54 - Application report for application_1552042715496_0010 (state: RUNNING)
2019-03-08 11:15:28,071 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:28 INFO Client:54 - Application report for application_1552042715496_0010 (state: RUNNING)
2019-03-08 11:15:29,073 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:29 INFO Client:54 - Application report for application_1552042715496_0010 (state: RUNNING)
2019-03-08 11:15:30,075 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:30 INFO Client:54 - Application report for application_1552042715496_0010 (state: RUNNING)
2019-03-08 11:15:31,077 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:31 INFO Client:54 - Application report for application_1552042715496_0010 (state: RUNNING)
2019-03-08 11:15:32,079 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:32 INFO Client:54 - Application report for application_1552042715496_0010 (state: RUNNING)
2019-03-08 11:15:33,081 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:33 INFO Client:54 - Application report for application_1552042715496_0010 (state: RUNNING)
2019-03-08 11:15:33,487 DEBUG [http-nio-7070-exec-4] common.KylinConfig:328 : KYLIN_CONF property was not set, will seek KYLIN_HOME env variable
2019-03-08 11:15:33,487 INFO [http-nio-7070-exec-4] common.KylinConfig:334 : Use KYLIN_HOME=/home/apache/kylin
2019-03-08 11:15:34,083 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:34 INFO Client:54 - Application report for application_1552042715496_0010 (state: RUNNING)
2019-03-08 11:15:34,242 DEBUG [http-nio-7070-exec-3] badquery.BadQueryHistoryManager:65 : Loaded 0 Bad Query(s)
2019-03-08 11:15:35,085 INFO [pool-16-thread-1] spark.SparkExecutable:38 : 2019-03-08 11:15:35 INFO Client:54 - Application report for application_15520427

spark ?????????? INFO Client:54 - Application report for application_1552042715496_0010 (state: RUNNING) ????????????????????????????10% spark??job????,??????
??cube??map reduce????????????????????
hadoop-root-resourcemanager-master.log ????????????????yarn??container?? ??????1-4??????????
??????resourcemanager??????

2019-03-08 11:32:34,829 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Allocation proposal accepted
2019-03-08 11:32:37,590 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_e89_1552042715496_0010_01_000409 Container Transitioned from ACQUIRED to RELEASED
2019-03-08 11:32:37,590 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root IP=192.168.0.110 OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1552042715496_0010 CONTAINERID=container_e89_1552042715496_0010_01_000409 RESOURCE=<memory:6144, vCores:1> QUEUENAME=default
2019-03-08 11:32:37,591 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_e89_1552042715496_0010_01_000410 Container Transitioned from ALLOCATED to ACQUIRED
2019-03-08 11:32:37,833 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.AbstractContainerAllocator: assignedContainer application attempt=appattempt_1552042715496_0010_000001 container=null queue=default clusterResource=<memory:32000, vCores:24> type=NODE_LOCAL requestedPartition=
2019-03-08 11:32:37,833 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_e89_1552042715496_0010_01_000411 Container Transitioned from NEW to ALLOCATED
2019-03-08 11:32:37,833 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerNode: Assigned container container_e89_1552042715496_0010_01_000411 of capacity <memory:6144, vCores:1> on host master:33725, which has 2 containers, <memory:12288, vCores:2> used and <memory:3712, vCores:10> available after allocation
2019-03-08 11:32:37,833 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root OPERATION=AM Allocated Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1552042715496_0010 CONTAINERID=container_e89_1552042715496_0010_01_000411 RESOURCE=<memory:6144, vCores:1> QUEUENAME=default
2019-03-08 11:32:37,833 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: assignedContainer queue=root usedCapacity=0.512 absoluteUsedCapacity=0.512 used=<memory:16384, vCores:3> cluster=<memory:32000, vCores:24>
2019-03-08 11:32:37,833 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Allocation proposal accepted
2019-03-08 11:32:40,595 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_e89_1552042715496_0010_01_000410 Container Transitioned from ACQUIRED to RELEASED
2019-03-08 11:32:40,595 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root IP=192.168.0.110 OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1552042715496_0010 CONTAINERID=container_e89_1552042715496_0010_01_000410 RESOURCE=<memory:6144, vCores:1> QUEUENAME=default
2019-03-08 11:32:40,596 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_e89_1552042715496_0010_01_000411 Container Transitioned from ALLOCATED to ACQUIRED
2019-03-08 11:32:40,836 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.AbstractContainerAllocator: assignedContainer application attempt=appattempt_1552042715496_0010_000001 container=null queue=default clusterResource=<memory:32000, vCores:24> type=NODE_LOCAL requestedPartition=
2019-03-08 11:32:40,836 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_e89_1552042715496_0010_01_000412 Container Transitioned from NEW to ALLOCATED
2019-03-08 11:32:40,836 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerNode: Assigned container container_e89_1552042715496_0010_01_000412 of capacity <memory:6144, vCores:1> on host master:33725, which has 2 containers, <memory:12288, vCores:2> used and <memory:3712, vCores:10> available after allocation
2019-03-08 11:32:40,836 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root OPERATION=AM Allocated Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1552042715496_0010 CONTAINERID=container_e89_1552042715496_0010_01_000412 RESOURCE=<memory:6144, vCores:1> QUEUENAME=default
2019-03-08 11:32:40,836 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: assignedContainer queue=root usedCapacity=0.512 absoluteUsedCapacity=0.512 used=<memory:16384, vCores:3> cluster=<memory:32000, vCores:24>
2019-03-08 11:32:40,836 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Allocation proposal accepted

??????diagnosis ??????????????????????????

Best regards
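
For anyone reading the two logs side by side, the container sizes can be cross-checked with a little arithmetic. The sketch below is only an illustration under stated assumptions: the 2048 MB rounding step is a guess (it does not appear in the logs; it is the value that would make the numbers match if requests are rounded up to multiples of `yarn.scheduler.minimum-allocation-mb`), and `round_up` is a hypothetical helper. All other figures come from the spark-submit command and the ResourceManager log above.

```python
import math


def round_up(request_mb: int, step_mb: int) -> int:
    """Round a memory request up to the next multiple of step_mb."""
    return math.ceil(request_mb / step_mb) * step_mb


# ASSUMPTION: scheduler rounding step; this value is not shown in the logs.
STEP_MB = 2048

# From the spark-submit command: spark.executor.memory=4G plus 1024 MB overhead.
executor_mb = round_up(4 * 1024 + 1024, STEP_MB)

# From the Client log: AM container with 2304 MB including 256 MB overhead.
am_mb = round_up(2304, STEP_MB)

# From the ResourceManager log: host "master" reports 12288 MB used
# and 3712 MB available after allocation.
node_total_mb = 12288 + 3712
free_after_two_mb = node_total_mb - 2 * executor_mb

# Memory the queue would show with two executors plus the AM.
queue_used_mb = 2 * executor_mb + am_mb

print("executor container MB:", executor_mb)
print("AM container MB:", am_mb)
print("free on node after two executors MB:", free_after_two_mb)
print("third executor fits on this node:", free_after_two_mb >= executor_mb)
print("queue used MB:", queue_used_mb)
```

Under that assumption the results agree with `<memory:6144, vCores:1>` per container and `used=<memory:16384, vCores:3>` in the ResourceManager log. This only checks the sizing; it does not by itself explain why the containers keep going from ACQUIRED to RELEASED.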
