[ 
https://issues.apache.org/jira/browse/KYLIN-4385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17100720#comment-17100720
 ] 

ASF GitHub Bot commented on KYLIN-4385:
---------------------------------------

hit-lacus opened a new pull request #1197:
URL: https://github.com/apache/kylin/pull/1197


   ## Proposed changes
   
   Describe the big picture of your changes here to communicate to the 
maintainers why we should accept this pull request. If it fixes a bug or 
resolves a feature request, be sure to link to that issue.
   
   ## Types of changes
   
   What types of changes does your code introduce to Kylin?
   _Put an `x` in the boxes that apply_
   
   - [ ] Bugfix (non-breaking change which fixes an issue)
   - [ ] New feature (non-breaking change which adds functionality)
   - [ ] Breaking change (fix or feature that would cause existing 
functionality to not work as expected)
   - [ ] Documentation Update (if none of the other choices apply)
   
   ## Checklist
   
   _Put an `x` in the boxes that apply. You can also fill these out after 
creating the PR. If you're unsure about any of them, don't hesitate to ask. 
We're here to help! This is simply a reminder of what we are going to look for 
before merging your code._
   
   - [ ] I have created an issue on [Kylin's 
jira](https://issues.apache.org/jira/browse/KYLIN), and have described the 
bug/feature there in detail
   - [ ] Commit messages in my PR start with the related jira ID, like 
"KYLIN-0000 Make Kylin project open-source"
   - [ ] Compiling and unit tests pass locally with my changes
   - [ ] I have added tests that prove my fix is effective or that my feature 
works
   - [ ] If this change needs a document change, I will prepare another PR 
against the `document` branch
   - [ ] Any dependent changes have been merged
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
user@kylin or dev@kylin by explaining why you chose the solution you did and 
what alternatives you considered, etc...
   




> KYLIN system cube failing to update table when run on EMR with S3 as storage 
> and EMRFS
> --------------------------------------------------------------------------------------
>
>                 Key: KYLIN-4385
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4385
>             Project: Kylin
>          Issue Type: Bug
>            Reporter: raghu ram reddy
>            Assignee: Xiaoxiang Yu
>            Priority: Major
>             Fix For: v3.1.0, v3.0.2, v2.6.6
>
>
>  
> 2020-02-24T15:35:46,548 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.kylin.metrics.lib.impl.hive.HiveReservoirReporter - Try to write 113 records
> 2020-02-24T15:35:46,566 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.hadoop.hive.conf.HiveConf - Found configuration file file:/etc/hive/conf.dist/hive-site.xml
> 2020-02-24T15:35:47,097 INFO [metrics-blocking-reservoir-scheduler-0] hive.metastore - Trying to connect to metastore with URI thrift://ip-1-1-1-1.ec2.internal:9083
> 2020-02-24T15:35:47,216 INFO [metrics-blocking-reservoir-scheduler-0] hive.metastore - Opened a connection to metastore, current connections: 1
> 2020-02-24T15:35:47,216 INFO [metrics-blocking-reservoir-scheduler-0] hive.metastore - Connected to metastore.
> 2020-02-24T15:35:47,433 INFO [metrics-blocking-reservoir-scheduler-0] hive.metastore - Closed a connection to metastore, current connections: 0
> 2020-02-24T15:35:47,824 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.kylin.metrics.lib.impl.hive.HiveProducer - Try to use new partition content path: hdfs://ip-1-1-2-1.ec2.internal:8020/tmp/system_cube/hive_metrics_query_cube_qa/kday_date=2020-02-24/ip-1-1-1-1-1582558547056-part-0000 for metric: METRICS_QUERY_CUBE_QA
> 2020-02-24T15:35:47,959 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.kylin.metrics.lib.impl.hive.HiveProducer - Success to write 37 metrics (METRICS_QUERY_CUBE_QA) to file hdfs://ip-1-1-2-1.ec2.internal:8020/tmp/system_cube/hive_metrics_query_cube_qa/kday_date=2020-02-24/ip-1-1-1-1-1582558547056-part-0000
> 2020-02-24T15:35:48,275 INFO [metrics-blocking-reservoir-scheduler-0] hive.metastore - Trying to connect to metastore with URI thrift://ip-1-1-2-1.ec2.internal:9083
> 2020-02-24T15:35:48,288 INFO [metrics-blocking-reservoir-scheduler-0] hive.metastore - Opened a connection to metastore, current connections: 1
> 2020-02-24T15:35:48,289 INFO [metrics-blocking-reservoir-scheduler-0] hive.metastore - Connected to metastore.
> 2020-02-24T15:35:48,711 INFO [metrics-blocking-reservoir-scheduler-0] hive.metastore - Closed a connection to metastore, current connections: 0
> 2020-02-24T15:35:50,223 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.hadoop.hive.ql.session.SessionState - Created HDFS directory: /tmp/hive/kylin/3f98a154-e471-40fc-9829-4c4283266d46
> 2020-02-24T15:35:50,224 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.hadoop.hive.ql.session.SessionState - Created local directory: /usr/local/kylin/tomcat/temp/kylin/3f98a154-e471-40fc-9829-4c4283266d46
> 2020-02-24T15:35:50,232 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.hadoop.hive.ql.session.SessionState - Created HDFS directory: /tmp/hive/kylin/3f98a154-e471-40fc-9829-4c4283266d46/_tmp_space.db
> 2020-02-24T15:35:50,291 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.hadoop.hive.ql.exec.tez.TezSessionState - User of session id 3f98a154-e471-40fc-9829-4c4283266d46 is kylin
> 2020-02-24T15:35:50,389 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.hadoop.hive.ql.exec.tez.DagUtils - Jar dir is null / directory doesn't exist. Choosing HIVE_INSTALL_DIR - /user/kylin/.hiveJars
> 2020-02-24T15:35:50,933 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.hadoop.hive.ql.exec.tez.DagUtils - Resource modification time: 1581024148854 for hdfs://ip-1-1-2-1.ec2.internal:8020/user/kylin/.hiveJars/hive-exec-2.3.6-amzn-0-9f4c4d2a9ab8330bfec9b3ce23e40355288cc5c08a20165b20aca86b2b6c2c95.jar
> 2020-02-24T15:35:51,066 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessController - Created SQLStdHiveAccessController for session context : HiveAuthzSessionContext [sessionString=3f98a154-e471-40fc-9829-4c4283266d46, clientType=HIVECLI]
> 2020-02-24T15:35:51,073 WARN [metrics-blocking-reservoir-scheduler-0] org.apache.hadoop.hive.ql.session.SessionState - METASTORE_FILTER_HOOK will be ignored, since hive.security.authorization.manager is set to instance of HiveAuthorizerFactory.
> 2020-02-24T15:35:51,646 INFO [metrics-blocking-reservoir-scheduler-0] hive.metastore - Trying to connect to metastore with URI thrift://ip-1-1-2-1.ec2.internal:9083
> 2020-02-24T15:35:51,662 INFO [metrics-blocking-reservoir-scheduler-0] hive.metastore - Opened a connection to metastore, current connections: 1
> 2020-02-24T15:35:51,662 INFO [metrics-blocking-reservoir-scheduler-0] hive.metastore - Connected to metastore.
> 2020-02-24T15:35:51,992 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.tez.client.TezClient - Tez Client Version: [ component=tez-api, version=0.9.2, revision=9566b9ed1d86bc2697f1622e4e9825da6c011583, SCM-URL=scm:git:https://gitbox.apache.org/repos/asf/tez.git, buildTime=2019-10-28T16:32:03Z ]
> 2020-02-24T15:35:51,992 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.hadoop.hive.ql.exec.tez.TezSessionState - Opening new Tez Session (id: 3f98a154-e471-40fc-9829-4c4283266d46, scratch dir: hdfs://ip-1-1-2-1.ec2.internal:8020/tmp/hive/kylin/_tez_session_dir/3f98a154-e471-40fc-9829-4c4283266d46)
> 2020-02-24T15:35:52,578 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at ip-1-1-2-1.ec2.internal/10.127.2.141:8032
> 2020-02-24T15:35:52,767 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.tez.client.TezClient - Session mode. Starting session.
> 2020-02-24T15:35:52,839 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.tez.client.TezClientUtils - Using tez.lib.uris value from configuration: hdfs:///apps/tez/tez.tar.gz
> 2020-02-24T15:35:52,839 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.tez.client.TezClientUtils - Using tez.lib.uris.classpath value from configuration: null
> 2020-02-24T15:35:52,871 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.hadoop.hdfs.DFSClient - Created HDFS_DELEGATION_TOKEN token 856 for kylin on 10.127.2.141:8020
> 2020-02-24T15:35:53,280 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.tez.common.security.TokenCache - Got dt for hdfs://ip-1-1-2-1.ec2.internal:8020; Kind: HDFS_DELEGATION_TOKEN, Service: 10.127.2.141:8020, Ident: (HDFS_DELEGATION_TOKEN token 856 for kylin)
> 2020-02-24T15:35:53,280 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.tez.common.security.TokenCache - Got dt for hdfs://ip-1-1-2-1.ec2.internal:8020; Kind: kms-dt, Service: 10.127.2.141:9700, Ident: (owner=kylin, renewer=yarn, realUser=, issueDate=1582558553105, maxDate=1583163353105, sequenceNumber=853, masterKeyId=53)
> 2020-02-24T15:35:53,310 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.tez.client.TezClient - Tez system stage directory hdfs://ip-1-1-2-1.ec2.internal:8020/tmp/hive/kylin/_tez_session_dir/3f98a154-e471-40fc-9829-4c4283266d46/.tez/application_1578089000827_0674 doesn't exist and is created
> 2020-02-24T15:35:54,257 INFO [BadQueryDetector] org.apache.kylin.rest.service.BadQueryDetector - Detect bad query.
> 2020-02-24T15:35:54,620 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://ip-1-1-2-1.ec2.internal:8188/ws/v1/timeline/
> 2020-02-24T15:35:55,040 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1578089000827_0674
> 2020-02-24T15:35:55,041 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.tez.client.TezClient - The url to track the Tez Session: http://ip-1-1-2-1.ec2.internal:20888/proxy/application_1578089000827_0674/
> 2020-02-24T15:35:57,000 INFO [FetcherRunner 1354629870-25] org.apache.kylin.job.impl.threadpool.DefaultFetcherRunner - Job Fetcher: 0 should running, 0 actual running, 0 stopped, 0 ready, 20 already succeed, 1 error, 0 discarded, 0 others
> 2020-02-24T15:35:59,829 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.hadoop.hive.ql.Driver - Compiling command(queryId=kylin_20200224153550_249ef9e2-5723-403f-8ef9-e1a43de9b661): ALTER TABLE KYLIN.HIVE_METRICS_QUERY_QA ADD IF NOT EXISTS PARTITION (kday_date='2020-02-24')
> 2020-02-24T15:36:01,467 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.hadoop.hive.ql.Driver - Semantic Analysis Completed
> 2020-02-24T15:36:01,471 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.hadoop.hive.ql.Driver - Returning Hive schema: Schema(fieldSchemas:null, properties:null)
> 2020-02-24T15:36:01,485 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.hadoop.hive.ql.Driver - Completed compiling command(queryId=kylin_20200224153550_249ef9e2-5723-403f-8ef9-e1a43de9b661); Time taken: 1.708 seconds
> 2020-02-24T15:36:01,485 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.hadoop.hive.ql.Driver - Concurrency mode is disabled, not creating a lock manager
> 2020-02-24T15:36:01,485 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.hadoop.hive.ql.Driver - Executing command(queryId=kylin_20200224153550_249ef9e2-5723-403f-8ef9-e1a43de9b661): ALTER TABLE KYLIN.HIVE_METRICS_QUERY_QA ADD IF NOT EXISTS PARTITION (kday_date='2020-02-24')
> 2020-02-24T15:36:01,506 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.hadoop.hive.ql.Driver - Starting task [Stage-0:DDL] in serial mode
> 2020-02-24T15:36:02,952 INFO [metrics-blocking-reservoir-scheduler-0] hive.ql.metadata.Hive - Dumping metastore api call timing information for : execution phase
> 2020-02-24T15:36:02,952 INFO [metrics-blocking-reservoir-scheduler-0] hive.ql.metadata.Hive - Total time spent in this metastore function was greater than 1000ms : add_partitions_(List, boolean, boolean, )=1191
> 2020-02-24T15:36:02,952 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.hadoop.hive.ql.Driver - Completed executing command(queryId=kylin_20200224153550_249ef9e2-5723-403f-8ef9-e1a43de9b661); Time taken: 1.467 seconds
> OK
> 2020-02-24T15:36:02,953 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.hadoop.hive.ql.Driver - OK
> 2020-02-24T15:36:02,954 INFO [metrics-blocking-reservoir-scheduler-0] org.apache.kylin.metrics.lib.impl.hive.HiveProducer - Try to use new partition content path: s3://my_bucket/warehouse/kylin.db/hive_metrics_query_qa/kday_date=2020-02-24/ip-1-1-1-1-1582558548273-part-0000 for metric: METRICS_QUERY_QA
> 2020-02-24T15:36:03,322 INFO [metrics-blocking-reservoir-scheduler-0] com.amazon.ws.emr.hadoop.fs.cse.CSEMultipartUploadOutputStream - close closed:false s3://my_bucket/warehouse/kylin.db/hive_metrics_query_qa/kday_date=2020-02-24/ip-1-1-1-1-1582558548273-part-0000
> 2020-02-24T15:36:03,847 INFO [metrics-blocking-reservoir-scheduler-0] com.amazon.ws.emr.hadoop.fs.s3.upload.dispatch.DefaultMultipartUploadDispatcher - Completed multipart upload of 1 parts 0 bytes
> 2020-02-24T15:36:04,203 INFO [metrics-blocking-reservoir-scheduler-0] com.amazon.ws.emr.hadoop.fs.cse.CSEMultipartUploadOutputStream - Finished uploading my_bucket/warehouse/kylin.db/hive_metrics_query_qa/kday_date=2020-02-24/ip-1-1-1-1-1582558548273-part-0000. Elapsed seconds: 0.
> 2020-02-24T15:36:04,284 ERROR [metrics-blocking-reservoir-scheduler-0] org.apache.kylin.metrics.lib.impl.hive.HiveReservoirReporter - null
> java.lang.UnsupportedOperationException
>     at com.amazon.ws.emr.hadoop.fs.s3n2.S3NativeFileSystem2.append(S3NativeFileSystem2.java:150) ~[emrfs-hadoop-assembly-2.37.0.jar:?]
>     at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1181) ~[hadoop-common-2.8.5-amzn-5.jar:?]
>     at com.amazon.ws.emr.hadoop.fs.EmrFileSystem.append(EmrFileSystem.java:295) ~[emrfs-hadoop-assembly-2.37.0.jar:?]
>     at org.apache.kylin.metrics.lib.impl.hive.HiveProducer.write(HiveProducer.java:204) ~[kylin-metrics-reporter-hive-3.0.0.jar:3.0.0]
>     at org.apache.kylin.metrics.lib.impl.hive.HiveProducer.send(HiveProducer.java:134) ~[kylin-metrics-reporter-hive-3.0.0.jar:3.0.0]
>     at org.apache.kylin.metrics.lib.impl.hive.HiveReservoirReporter$HiveReservoirListener.onRecordUpdate(HiveReservoirReporter.java:144) [kylin-metrics-reporter-hive-3.0.0.jar:3.0.0]
>     at org.apache.kylin.metrics.lib.impl.BlockingReservoir.notifyListenerOfUpdatedRecord(BlockingReservoir.java:117) [kylin-core-metrics-3.0.0.jar:3.0.0]
>     at org.apache.kylin.metrics.lib.impl.BlockingReservoir.onRecordUpdate(BlockingReservoir.java:105) [kylin-core-metrics-3.0.0.jar:3.0.0]
>     at org.apache.kylin.metrics.lib.impl.BlockingReservoir.access$300(BlockingReservoir.java:37) [kylin-core-metrics-3.0.0.jar:3.0.0]
>     at org.apache.kylin.metrics.lib.impl.BlockingReservoir$ReporterRunnable.run(BlockingReservoir.java:171) [kylin-core-metrics-3.0.0.jar:3.0.0]
>     at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242]
> 2020-02-24T15:36:04,290 WARN [metrics-blocking-reservoir-scheduler-0] org.apache.kylin.metrics.lib.impl.BlockingReservoir - It fails to notify listener org.apache.kylin.metrics.lib.impl.hive.HiveReservoirReporter$HiveReservoirListener@1d460286 of updated record size 1132



