sixtus edited a comment on issue #8840: index_hadoop tasks fail on wrong file format when run inside indexer
URL: https://github.com/apache/incubator-druid/issues/8840#issuecomment-554970896

I just noticed the `kill` task is also throwing an exception:

```
Nov 18 11:59:13 dl385g10-nm14-01-nr106 druid-indexer[60784]: Error: java.lang.IllegalArgumentException: Pathname /druid/indexer/foo/2019-11-17T17:00:00.000Z_2019-11-17T18:00:00.000Z/2019-11-18T07:08:21.067Z/0/index.zip.1 from hdfs://us2/druid/indexer/foo/2019-11-17T17:00:00.000Z_2019-11-17T18:00:00.000Z/2019-11-18T07:08:21.067Z/0/index.zip.1 is not a valid DFS filename.
Nov 18 11:59:13 dl385g10-nm14-01-nr106 druid-indexer[60784]: at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:217)
Nov 18 11:59:13 dl385g10-nm14-01-nr106 druid-indexer[60784]: at org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:476)
Nov 18 11:59:13 dl385g10-nm14-01-nr106 druid-indexer[60784]: at org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:473)
Nov 18 11:59:13 dl385g10-nm14-01-nr106 druid-indexer[60784]: at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
Nov 18 11:59:13 dl385g10-nm14-01-nr106 druid-indexer[60784]: at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:473)
Nov 18 11:59:13 dl385g10-nm14-01-nr106 druid-indexer[60784]: at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:414)
Nov 18 11:59:13 dl385g10-nm14-01-nr106 druid-indexer[60784]: at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:929)
Nov 18 11:59:13 dl385g10-nm14-01-nr106 druid-indexer[60784]: at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:890)
Nov 18 11:59:13 dl385g10-nm14-01-nr106 druid-indexer[60784]: at org.apache.druid.indexer.JobHelper$2.push(JobHelper.java:452)
Nov 18 11:59:13 dl385g10-nm14-01-nr106 druid-indexer[60784]: at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
Nov 18 11:59:13 dl385g10-nm14-01-nr106 druid-indexer[60784]: at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
Nov 18 11:59:13 dl385g10-nm14-01-nr106 druid-indexer[60784]: at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
Nov 18 11:59:13 dl385g10-nm14-01-nr106 druid-indexer[60784]: at java.lang.reflect.Method.invoke(Method.java:498)
Nov 18 11:59:13 dl385g10-nm14-01-nr106 druid-indexer[60784]: at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:409)
Nov 18 11:59:13 dl385g10-nm14-01-nr106 druid-indexer[60784]: at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)
Nov 18 11:59:13 dl385g10-nm14-01-nr106 druid-indexer[60784]: at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155)
Nov 18 11:59:13 dl385g10-nm14-01-nr106 druid-indexer[60784]: at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
Nov 18 11:59:13 dl385g10-nm14-01-nr106 druid-indexer[60784]: at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:346)
Nov 18 11:59:13 dl385g10-nm14-01-nr106 druid-indexer[60784]: at com.sun.proxy.$Proxy78.push(Unknown Source)
Nov 18 11:59:13 dl385g10-nm14-01-nr106 druid-indexer[60784]: at org.apache.druid.indexer.JobHelper.serializeOutIndex(JobHelper.java:469)
Nov 18 11:59:13 dl385g10-nm14-01-nr106 druid-indexer[60784]: at org.apache.druid.indexer.IndexGeneratorJob$IndexGeneratorReducer.reduce(IndexGeneratorJob.java:828)
Nov 18 11:59:13 dl385g10-nm14-01-nr106 druid-indexer[60784]: at org.apache.druid.indexer.IndexGeneratorJob$IndexGeneratorReducer.reduce(IndexGeneratorJob.java:579)
Nov 18 11:59:13 dl385g10-nm14-01-nr106 druid-indexer[60784]: at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
Nov 18 11:59:13 dl385g10-nm14-01-nr106 druid-indexer[60784]: at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
Nov 18 11:59:13 dl385g10-nm14-01-nr106 druid-indexer[60784]: at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
Nov 18 11:59:13 dl385g10-nm14-01-nr106 druid-indexer[60784]: at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
Nov 18 11:59:13 dl385g10-nm14-01-nr106 druid-indexer[60784]: at java.security.AccessController.doPrivileged(Native Method)
Nov 18 11:59:13 dl385g10-nm14-01-nr106 druid-indexer[60784]: at javax.security.auth.Subject.doAs(Subject.java:422)
Nov 18 11:59:13 dl385g10-nm14-01-nr106 druid-indexer[60784]: at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
Nov 18 11:59:13 dl385g10-nm14-01-nr106 druid-indexer[60784]: at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169)
```

I verified there is no broken name in the metadata storage, i.e. the `kill` task must have generated the path itself (rather than using the path from the metadata store) and then run into the same trap. From my limited understanding, it looks like it is not instantiating `HdfsDataSegmentPusher` but rather `LocalDataSegmentPusher`; still puzzled, though.
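For context, here is my reading of why the path fails (a sketch, not the actual Hadoop or Druid source): HDFS rejects path components containing `:`, and the path above embeds the ISO-8601 interval timestamps (`2019-11-17T17:00:00.000Z_...`), which contain colons. If I understand correctly, `HdfsDataSegmentPusher` normally sidesteps this by building colon-free segment directories, while `LocalDataSegmentPusher` keeps the raw interval string, which would match the symptom. The class and method below are hypothetical, written only to illustrate the validation rule:

```java
// Hypothetical illustration (not Hadoop source): mimics the component check
// that makes DistributedFileSystem.getPathName throw "is not a valid DFS
// filename" for paths whose components contain ':'.
public class DfsNameCheck {

    /** Returns the first path component HDFS would reject, or null if none. */
    static String firstInvalidComponent(String path) {
        for (String component : path.split("/")) {
            // HDFS path-name validation disallows ':' inside a component.
            if (component.contains(":")) {
                return component;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // The interval directory from the log: its colons make the whole path invalid.
        String bad = "/druid/indexer/foo/2019-11-17T17:00:00.000Z_2019-11-17T18:00:00.000Z"
                + "/2019-11-18T07:08:21.067Z/0/index.zip.1";
        System.out.println("offending component: " + firstInvalidComponent(bad));
    }
}
```

If that reading is right, any code path that rebuilds the segment location with the raw interval string (instead of the sanitized HDFS form) would hit the same exception, which would explain why the `kill` task fails even though the metadata store holds no broken names.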