[jira] [Comment Edited] (HADOOP-15607) AliyunOSS: fix duplicated partNumber issue in AliyunOSSBlockOutputStream
[ https://issues.apache.org/jira/browse/HADOOP-15607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563531#comment-16563531 ] wujinhu edited comment on HADOOP-15607 at 7/31/18 11:58 AM: [~Sammi] I have built in my local environment successfully. Can we get some logs from remote servers? [https://builds.apache.org/job/PreCommit-HADOOP-Build/14966/console] I have found some error messages from the link like this: setuptools_scm.version.SetuptoolsOutdatedWarning: your setuptools is too old (<12) Cleaning up... Command python setup.py egg_info failed with error code 1 in /tmp/pip_build_root/pylint Storing debug log for failure in /root/.pip/pip.log The command '/bin/sh -c pip install pylint' returned a non-zero code: 1 was (Author: wujinhu): [~Sammi] I have built in my local environment successfully. Can we get some logs from remote servers? > AliyunOSS: fix duplicated partNumber issue in AliyunOSSBlockOutputStream > - > > Key: HADOOP-15607 > URL: https://issues.apache.org/jira/browse/HADOOP-15607 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.1.0, 2.10.0, 2.9.1, 3.2.0, 3.0.3 >Reporter: wujinhu >Assignee: wujinhu >Priority: Major > Fix For: 3.2.0, 3.0.4, 3.1.2 > > Attachments: HADOOP-15607-branch-2.001.patch, HADOOP-15607.001.patch, > HADOOP-15607.002.patch, HADOOP-15607.003.patch, HADOOP-15607.004.patch > > > When I generated data with hive-tpcds tool, I got exception below: > 2018-07-16 14:50:43,680 INFO mapreduce.Job: Task Id : > attempt_1531723399698_0001_m_52_0, Status : FAILED > Error: com.aliyun.oss.OSSException: The list of parts was not in ascending > order. Parts list must specified in order by part number. > [ErrorCode]: InvalidPartOrder > [RequestId]: 5B4C40425FCC208D79D1EAF5 > [HostId]: 100.103.0.137 > [ResponseError]: > > > InvalidPartOrder > The list of parts was not in ascending order. Parts list must > specified in order by part number. > 5B4C40425FCC208D79D1EAF5 > xx.xx.xx.xx > current PartNumber 3, you given part number 3is not in > ascending order > > at > com.aliyun.oss.common.utils.ExceptionFactory.createOSSException(ExceptionFactory.java:99) > at > com.aliyun.oss.internal.OSSErrorResponseHandler.handle(OSSErrorResponseHandler.java:69) > at > com.aliyun.oss.common.comm.ServiceClient.handleResponse(ServiceClient.java:248) > at > com.aliyun.oss.common.comm.ServiceClient.sendRequestImpl(ServiceClient.java:130) > at > com.aliyun.oss.common.comm.ServiceClient.sendRequest(ServiceClient.java:68) > at com.aliyun.oss.internal.OSSOperation.send(OSSOperation.java:94) > at com.aliyun.oss.internal.OSSOperation.doOperation(OSSOperation.java:149) > at com.aliyun.oss.internal.OSSOperation.doOperation(OSSOperation.java:113) > at > com.aliyun.oss.internal.OSSMultipartOperation.completeMultipartUpload(OSSMultipartOperation.java:185) > at com.aliyun.oss.OSSClient.completeMultipartUpload(OSSClient.java:790) > at > org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystemStore.completeMultipartUpload(AliyunOSSFileSystemStore.java:643) > at > org.apache.hadoop.fs.aliyun.oss.AliyunOSSBlockOutputStream.close(AliyunOSSBlockOutputStream.java:120) > at > org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72) > at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101) > at > org.apache.hadoop.mapreduce.lib.output.TextOutputFormat$LineRecordWriter.close(TextOutputFormat.java:106) > at > org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.close(MultipleOutputs.java:574) > at org.notmysock.tpcds.GenTable$DSDGen.cleanup(GenTable.java:169) > at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:149) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1686) > > I reviewed code below, > {code:java} > blockId {code} > has thread synchronization problem > {code:java} > // code placeholder > private void uploadCurrentPart() throws IOException { > blockFiles.add(blockFile); > blockStream.flush(); > blockStream.close(); > if (blockId == 0) { > uploadId = store.getUploadId(key); > } > ListenableFuture partETagFuture = > executorService.submit(() -> { > PartETag partETag = store.uploadPart(blockFile, key, uploadId, > blockId + 1); > return partETag; > }); > pa
[jira] [Comment Edited] (HADOOP-15607) AliyunOSS: fix duplicated partNumber issue in AliyunOSSBlockOutputStream
[ https://issues.apache.org/jira/browse/HADOOP-15607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561567#comment-16561567 ] wujinhu edited comment on HADOOP-15607 at 7/30/18 7:34 AM: --- Thanks [~Sammi] I have attached a patch for branch-2.:) It seems there are some problems with the build system. was (Author: wujinhu): Thanks [~Sammi] I have attached a patch for branch-2.:) > AliyunOSS: fix duplicated partNumber issue in AliyunOSSBlockOutputStream > - > > Key: HADOOP-15607 > URL: https://issues.apache.org/jira/browse/HADOOP-15607 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.1.0, 2.10.0, 2.9.1, 3.2.0, 3.0.3 >Reporter: wujinhu >Assignee: wujinhu >Priority: Major > Fix For: 3.2.0, 3.0.4, 3.1.2 > > Attachments: HADOOP-15607-branch-2.001.patch, HADOOP-15607.001.patch, > HADOOP-15607.002.patch, HADOOP-15607.003.patch, HADOOP-15607.004.patch > > > When I generated data with hive-tpcds tool, I got exception below: > 2018-07-16 14:50:43,680 INFO mapreduce.Job: Task Id : > attempt_1531723399698_0001_m_52_0, Status : FAILED > Error: com.aliyun.oss.OSSException: The list of parts was not in ascending > order. Parts list must specified in order by part number. > [ErrorCode]: InvalidPartOrder > [RequestId]: 5B4C40425FCC208D79D1EAF5 > [HostId]: 100.103.0.137 > [ResponseError]: > > > InvalidPartOrder > The list of parts was not in ascending order. Parts list must > specified in order by part number. > 5B4C40425FCC208D79D1EAF5 > xx.xx.xx.xx > current PartNumber 3, you given part number 3is not in > ascending order > > at > com.aliyun.oss.common.utils.ExceptionFactory.createOSSException(ExceptionFactory.java:99) > at > com.aliyun.oss.internal.OSSErrorResponseHandler.handle(OSSErrorResponseHandler.java:69) > at > com.aliyun.oss.common.comm.ServiceClient.handleResponse(ServiceClient.java:248) > at > com.aliyun.oss.common.comm.ServiceClient.sendRequestImpl(ServiceClient.java:130) > at > com.aliyun.oss.common.comm.ServiceClient.sendRequest(ServiceClient.java:68) > at com.aliyun.oss.internal.OSSOperation.send(OSSOperation.java:94) > at com.aliyun.oss.internal.OSSOperation.doOperation(OSSOperation.java:149) > at com.aliyun.oss.internal.OSSOperation.doOperation(OSSOperation.java:113) > at > com.aliyun.oss.internal.OSSMultipartOperation.completeMultipartUpload(OSSMultipartOperation.java:185) > at com.aliyun.oss.OSSClient.completeMultipartUpload(OSSClient.java:790) > at > org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystemStore.completeMultipartUpload(AliyunOSSFileSystemStore.java:643) > at > org.apache.hadoop.fs.aliyun.oss.AliyunOSSBlockOutputStream.close(AliyunOSSBlockOutputStream.java:120) > at > org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72) > at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101) > at > org.apache.hadoop.mapreduce.lib.output.TextOutputFormat$LineRecordWriter.close(TextOutputFormat.java:106) > at > org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.close(MultipleOutputs.java:574) > at org.notmysock.tpcds.GenTable$DSDGen.cleanup(GenTable.java:169) > at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:149) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1686) > > I reviewed code below, > {code:java} > blockId {code} > has thread synchronization problem > {code:java} > // code placeholder > private void uploadCurrentPart() throws IOException { > blockFiles.add(blockFile); > blockStream.flush(); > blockStream.close(); > if (blockId == 0) { > uploadId = store.getUploadId(key); > } > ListenableFuture partETagFuture = > executorService.submit(() -> { > PartETag partETag = store.uploadPart(blockFile, key, uploadId, > blockId + 1); > return partETag; > }); > partETagsFutures.add(partETagFuture); > blockFile = newBlockFile(); > blockId++; > blockStream = new BufferedOutputStream(new FileOutputStream(blockFile)); > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HADOOP-15607) AliyunOSS: fix duplicated partNumber issue in AliyunOSSBlockOutputStream
[ https://issues.apache.org/jira/browse/HADOOP-15607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16559315#comment-16559315 ] SammiChen edited comment on HADOOP-15607 at 7/27/18 6:32 AM: - Hi [~wujinhu], the 003 patch looks overall good. A few minor issues, # testMultiPartUploadConcurrent, cross line indent is 4 space, not 8 space # conf.setInt("fs.oss.upload.active.blocks", 20); Use the Constants field # conf.setInt(IO_CHUNK_BUFFER_SIZE, conf.getInt(Constants.MULTIPART_UPLOAD_PART_SIZE_KEY, 0)); line exceeds 80 characters The newly added UT really helps verify the issues is fixed. was (Author: sammi): Hi [~wujinhu], the 003 patch looks overall good. A few minor issues, # testMultiPartUploadConcurrent, cross line indent is 4 space, not 8 space # conf.setInt("fs.oss.upload.active.blocks", 20); Use the Constants field # conf.setInt(IO_CHUNK_BUFFER_SIZE, conf.getInt(Constants.MULTIPART_UPLOAD_PART_SIZE_KEY, 0)); line exceeds 80 characters # removePartFiles is also called in uploadCurrentPart. It can help to remove the temp files as soon as possible. But if a temp file is deleted, it's corresponding partETagFuture is not removed from the partETagsFutures. You will find many misleading "Failed to delete temporary file {}" messages later in the log file. The newly added UT really helps verify the issues is fixed. > AliyunOSS: fix duplicated partNumber issue in AliyunOSSBlockOutputStream > - > > Key: HADOOP-15607 > URL: https://issues.apache.org/jira/browse/HADOOP-15607 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.10.0, 2.9.1, 3.2.0, 3.0.3 >Reporter: wujinhu >Assignee: wujinhu >Priority: Major > Attachments: HADOOP-15607.001.patch, HADOOP-15607.002.patch, > HADOOP-15607.003.patch > > > When I generated data with hive-tpcds tool, I got exception below: > 2018-07-16 14:50:43,680 INFO mapreduce.Job: Task Id : > attempt_1531723399698_0001_m_52_0, Status : FAILED > Error: com.aliyun.oss.OSSException: The list of parts was not in ascending > order. Parts list must specified in order by part number. > [ErrorCode]: InvalidPartOrder > [RequestId]: 5B4C40425FCC208D79D1EAF5 > [HostId]: 100.103.0.137 > [ResponseError]: > > > InvalidPartOrder > The list of parts was not in ascending order. Parts list must > specified in order by part number. > 5B4C40425FCC208D79D1EAF5 > xx.xx.xx.xx > current PartNumber 3, you given part number 3is not in > ascending order > > at > com.aliyun.oss.common.utils.ExceptionFactory.createOSSException(ExceptionFactory.java:99) > at > com.aliyun.oss.internal.OSSErrorResponseHandler.handle(OSSErrorResponseHandler.java:69) > at > com.aliyun.oss.common.comm.ServiceClient.handleResponse(ServiceClient.java:248) > at > com.aliyun.oss.common.comm.ServiceClient.sendRequestImpl(ServiceClient.java:130) > at > com.aliyun.oss.common.comm.ServiceClient.sendRequest(ServiceClient.java:68) > at com.aliyun.oss.internal.OSSOperation.send(OSSOperation.java:94) > at com.aliyun.oss.internal.OSSOperation.doOperation(OSSOperation.java:149) > at com.aliyun.oss.internal.OSSOperation.doOperation(OSSOperation.java:113) > at > com.aliyun.oss.internal.OSSMultipartOperation.completeMultipartUpload(OSSMultipartOperation.java:185) > at com.aliyun.oss.OSSClient.completeMultipartUpload(OSSClient.java:790) > at > org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystemStore.completeMultipartUpload(AliyunOSSFileSystemStore.java:643) > at > org.apache.hadoop.fs.aliyun.oss.AliyunOSSBlockOutputStream.close(AliyunOSSBlockOutputStream.java:120) > at > org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72) > at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101) > at > org.apache.hadoop.mapreduce.lib.output.TextOutputFormat$LineRecordWriter.close(TextOutputFormat.java:106) > at > org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.close(MultipleOutputs.java:574) > at org.notmysock.tpcds.GenTable$DSDGen.cleanup(GenTable.java:169) > at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:149) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1686) > > I reviewed code below, > {code:java} > blockId {code} > has thread synchronization problem > {code:java} > // code placeholder > private void uploadCurrentPart() throws IOException { > blockFiles.add(blockF
[jira] [Comment Edited] (HADOOP-15607) AliyunOSS: fix duplicated partNumber issue in AliyunOSSBlockOutputStream
[ https://issues.apache.org/jira/browse/HADOOP-15607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16552455#comment-16552455 ] wujinhu edited comment on HADOOP-15607 at 7/23/18 8:05 AM: --- Thanks [~Sammi] [~uncleGen] for your comments. As the code shows, uploadCurrentPart() executes blockId++ when submit tasks to the thread pool. If task executes later than the main thread, then the blockId has been changed. Changing blockFiles type is to fix another problem(we cannot delete files in write(), because they may be used by upload threads), so we can track upload files by blockId. I have tried to add unit test to reproduce this issue in my mac environment, but failed. It seems difficult to reproduce this in local environment except when I debug old code sometimes. was (Author: wujinhu): Thanks [~Sammi] [~uncleGen] for your comments. As the code shows, uploadCurrentPart() executes blockId++ when submit tasks to the thread pool. If task executes later than the main thread, then the blockId has been changed. Changing blockFiles type is to fix another problem(we cannot delete files in write(), because they may be used by upload threads), so we can track upload files by blockId. I have tried to add unit test to reproduce this issue in my mac environment, but failed. It seems difficult to reproduce this in local environment only when I debug old code sometimes. > AliyunOSS: fix duplicated partNumber issue in AliyunOSSBlockOutputStream > - > > Key: HADOOP-15607 > URL: https://issues.apache.org/jira/browse/HADOOP-15607 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.10.0, 2.9.1, 3.2.0, 3.1.1, 3.0.3 >Reporter: wujinhu >Assignee: wujinhu >Priority: Major > Attachments: HADOOP-15607.001.patch, HADOOP-15607.002.patch > > > When I generated data with hive-tpcds tool, I got exception below: > 2018-07-16 14:50:43,680 INFO mapreduce.Job: Task Id : > attempt_1531723399698_0001_m_52_0, Status : FAILED > Error: com.aliyun.oss.OSSException: The list of parts was not in ascending > order. Parts list must specified in order by part number. > [ErrorCode]: InvalidPartOrder > [RequestId]: 5B4C40425FCC208D79D1EAF5 > [HostId]: 100.103.0.137 > [ResponseError]: > > > InvalidPartOrder > The list of parts was not in ascending order. Parts list must > specified in order by part number. > 5B4C40425FCC208D79D1EAF5 > xx.xx.xx.xx > current PartNumber 3, you given part number 3is not in > ascending order > > at > com.aliyun.oss.common.utils.ExceptionFactory.createOSSException(ExceptionFactory.java:99) > at > com.aliyun.oss.internal.OSSErrorResponseHandler.handle(OSSErrorResponseHandler.java:69) > at > com.aliyun.oss.common.comm.ServiceClient.handleResponse(ServiceClient.java:248) > at > com.aliyun.oss.common.comm.ServiceClient.sendRequestImpl(ServiceClient.java:130) > at > com.aliyun.oss.common.comm.ServiceClient.sendRequest(ServiceClient.java:68) > at com.aliyun.oss.internal.OSSOperation.send(OSSOperation.java:94) > at com.aliyun.oss.internal.OSSOperation.doOperation(OSSOperation.java:149) > at com.aliyun.oss.internal.OSSOperation.doOperation(OSSOperation.java:113) > at > com.aliyun.oss.internal.OSSMultipartOperation.completeMultipartUpload(OSSMultipartOperation.java:185) > at com.aliyun.oss.OSSClient.completeMultipartUpload(OSSClient.java:790) > at > org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystemStore.completeMultipartUpload(AliyunOSSFileSystemStore.java:643) > at > org.apache.hadoop.fs.aliyun.oss.AliyunOSSBlockOutputStream.close(AliyunOSSBlockOutputStream.java:120) > at > org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72) > at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101) > at > org.apache.hadoop.mapreduce.lib.output.TextOutputFormat$LineRecordWriter.close(TextOutputFormat.java:106) > at > org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.close(MultipleOutputs.java:574) > at org.notmysock.tpcds.GenTable$DSDGen.cleanup(GenTable.java:169) > at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:149) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1686) > > I reviewed code below, > {code:java} > blockId {code} > has thread synchronization problem > {code:java} > // code placeholder > private void uploadCurrentPart() throws IOExcepti
[jira] [Comment Edited] (HADOOP-15607) AliyunOSS: fix duplicated partNumber issue in AliyunOSSBlockOutputStream
[ https://issues.apache.org/jira/browse/HADOOP-15607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16545952#comment-16545952 ] Genmao Yu edited comment on HADOOP-15607 at 7/17/18 2:38 AM: - [~wujinhu] Overall LGTM,but maybe we should use one intermediate variable for blockId to pass to {{upload task}}, like: {code:java} int bid = ++blockId; PartETag partETag = store.uploadPart(currentFile, key, uploadId, bid); ... {code} was (Author: unclegen): [~wujinhu] Overall LGTM,but maybe we should use one intermediate variable for blockId to pass to {{upload task}}, like: {code:java} int bid = ++block; PartETag partETag = store.uploadPart(currentFile, key, uploadId, bid); ... {code} > AliyunOSS: fix duplicated partNumber issue in AliyunOSSBlockOutputStream > - > > Key: HADOOP-15607 > URL: https://issues.apache.org/jira/browse/HADOOP-15607 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.10.0, 2.9.1, 3.2.0, 3.1.1, 3.0.3 >Reporter: wujinhu >Assignee: wujinhu >Priority: Major > Attachments: HADOOP-15607.001.patch > > > When I generated data with hive-tpcds tool, I got exception below: > 2018-07-16 14:50:43,680 INFO mapreduce.Job: Task Id : > attempt_1531723399698_0001_m_52_0, Status : FAILED > Error: com.aliyun.oss.OSSException: The list of parts was not in ascending > order. Parts list must specified in order by part number. > [ErrorCode]: InvalidPartOrder > [RequestId]: 5B4C40425FCC208D79D1EAF5 > [HostId]: 100.103.0.137 > [ResponseError]: > > > InvalidPartOrder > The list of parts was not in ascending order. Parts list must > specified in order by part number. > 5B4C40425FCC208D79D1EAF5 > 100.103.0.137 > current PartNumber 3, you given part number 3is not in > ascending order > > at > com.aliyun.oss.common.utils.ExceptionFactory.createOSSException(ExceptionFactory.java:99) > at > com.aliyun.oss.internal.OSSErrorResponseHandler.handle(OSSErrorResponseHandler.java:69) > at > com.aliyun.oss.common.comm.ServiceClient.handleResponse(ServiceClient.java:248) > at > com.aliyun.oss.common.comm.ServiceClient.sendRequestImpl(ServiceClient.java:130) > at > com.aliyun.oss.common.comm.ServiceClient.sendRequest(ServiceClient.java:68) > at com.aliyun.oss.internal.OSSOperation.send(OSSOperation.java:94) > at com.aliyun.oss.internal.OSSOperation.doOperation(OSSOperation.java:149) > at com.aliyun.oss.internal.OSSOperation.doOperation(OSSOperation.java:113) > at > com.aliyun.oss.internal.OSSMultipartOperation.completeMultipartUpload(OSSMultipartOperation.java:185) > at com.aliyun.oss.OSSClient.completeMultipartUpload(OSSClient.java:790) > at > org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystemStore.completeMultipartUpload(AliyunOSSFileSystemStore.java:643) > at > org.apache.hadoop.fs.aliyun.oss.AliyunOSSBlockOutputStream.close(AliyunOSSBlockOutputStream.java:120) > at > org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72) > at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101) > at > org.apache.hadoop.mapreduce.lib.output.TextOutputFormat$LineRecordWriter.close(TextOutputFormat.java:106) > at > org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.close(MultipleOutputs.java:574) > at org.notmysock.tpcds.GenTable$DSDGen.cleanup(GenTable.java:169) > at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:149) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1686) > > I reviewed code below, > {code:java} > blockId {code} > has thread synchronization problem > {code:java} > // code placeholder > private void uploadCurrentPart() throws IOException { > blockFiles.add(blockFile); > blockStream.flush(); > blockStream.close(); > if (blockId == 0) { > uploadId = store.getUploadId(key); > } > ListenableFuture partETagFuture = > executorService.submit(() -> { > PartETag partETag = store.uploadPart(blockFile, key, uploadId, > blockId + 1); > return partETag; > }); > partETagsFutures.add(partETagFuture); > blockFile = newBlockFile(); > blockId++; > blockStream = new BufferedOutputStream(new FileOutputStream(blockFile)); > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-i
[jira] [Comment Edited] (HADOOP-15607) AliyunOSS: fix duplicated partNumber issue in AliyunOSSBlockOutputStream
[ https://issues.apache.org/jira/browse/HADOOP-15607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16545952#comment-16545952 ] Genmao Yu edited comment on HADOOP-15607 at 7/17/18 2:38 AM: - [~wujinhu] Overall LGTM,but maybe we should use one intermediate variable for blockId to pass to {{upload task}}, like: {code:java} int bid = ++block; PartETag partETag = store.uploadPart(currentFile, key, uploadId, bid); ... {code} was (Author: unclegen): [~wujinhu] Overall LGTM,but maybe we should use one intermediate variable for blockId to pass to {{upload task}}, like: {code:java} int bid = block++ PartETag partETag = store.uploadPart(currentFile, key, uploadId, bid); ... {code} > AliyunOSS: fix duplicated partNumber issue in AliyunOSSBlockOutputStream > - > > Key: HADOOP-15607 > URL: https://issues.apache.org/jira/browse/HADOOP-15607 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.10.0, 2.9.1, 3.2.0, 3.1.1, 3.0.3 >Reporter: wujinhu >Assignee: wujinhu >Priority: Major > Attachments: HADOOP-15607.001.patch > > > When I generated data with hive-tpcds tool, I got exception below: > 2018-07-16 14:50:43,680 INFO mapreduce.Job: Task Id : > attempt_1531723399698_0001_m_52_0, Status : FAILED > Error: com.aliyun.oss.OSSException: The list of parts was not in ascending > order. Parts list must specified in order by part number. > [ErrorCode]: InvalidPartOrder > [RequestId]: 5B4C40425FCC208D79D1EAF5 > [HostId]: 100.103.0.137 > [ResponseError]: > > > InvalidPartOrder > The list of parts was not in ascending order. Parts list must > specified in order by part number. > 5B4C40425FCC208D79D1EAF5 > 100.103.0.137 > current PartNumber 3, you given part number 3is not in > ascending order > > at > com.aliyun.oss.common.utils.ExceptionFactory.createOSSException(ExceptionFactory.java:99) > at > com.aliyun.oss.internal.OSSErrorResponseHandler.handle(OSSErrorResponseHandler.java:69) > at > com.aliyun.oss.common.comm.ServiceClient.handleResponse(ServiceClient.java:248) > at > com.aliyun.oss.common.comm.ServiceClient.sendRequestImpl(ServiceClient.java:130) > at > com.aliyun.oss.common.comm.ServiceClient.sendRequest(ServiceClient.java:68) > at com.aliyun.oss.internal.OSSOperation.send(OSSOperation.java:94) > at com.aliyun.oss.internal.OSSOperation.doOperation(OSSOperation.java:149) > at com.aliyun.oss.internal.OSSOperation.doOperation(OSSOperation.java:113) > at > com.aliyun.oss.internal.OSSMultipartOperation.completeMultipartUpload(OSSMultipartOperation.java:185) > at com.aliyun.oss.OSSClient.completeMultipartUpload(OSSClient.java:790) > at > org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystemStore.completeMultipartUpload(AliyunOSSFileSystemStore.java:643) > at > org.apache.hadoop.fs.aliyun.oss.AliyunOSSBlockOutputStream.close(AliyunOSSBlockOutputStream.java:120) > at > org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72) > at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101) > at > org.apache.hadoop.mapreduce.lib.output.TextOutputFormat$LineRecordWriter.close(TextOutputFormat.java:106) > at > org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.close(MultipleOutputs.java:574) > at org.notmysock.tpcds.GenTable$DSDGen.cleanup(GenTable.java:169) > at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:149) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1686) > > I reviewed code below, > {code:java} > blockId {code} > has thread synchronization problem > {code:java} > // code placeholder > private void uploadCurrentPart() throws IOException { > blockFiles.add(blockFile); > blockStream.flush(); > blockStream.close(); > if (blockId == 0) { > uploadId = store.getUploadId(key); > } > ListenableFuture partETagFuture = > executorService.submit(() -> { > PartETag partETag = store.uploadPart(blockFile, key, uploadId, > blockId + 1); > return partETag; > }); > partETagsFutures.add(partETagFuture); > blockFile = newBlockFile(); > blockId++; > blockStream = new BufferedOutputStream(new FileOutputStream(blockFile)); > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issu
[jira] [Comment Edited] (HADOOP-15607) AliyunOSS: fix duplicated partNumber issue in AliyunOSSBlockOutputStream
[ https://issues.apache.org/jira/browse/HADOOP-15607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16545952#comment-16545952 ] Genmao Yu edited comment on HADOOP-15607 at 7/17/18 2:37 AM: - [~wujinhu] Overall LGTM,but maybe we should use one intermediate variable for blockId to pass to {{upload task}}, like: {code:java} int bid = block++ PartETag partETag = store.uploadPart(currentFile, key, uploadId, bid); ... {code} was (Author: unclegen): [~wujinhu] Overall LGTM,but maybe we should use one intermediate variable for blockId to pass to upload task, like: {code:java} int bid = block++ PartETag partETag = store.uploadPart(currentFile, key, uploadId, bid); ... {code} > AliyunOSS: fix duplicated partNumber issue in AliyunOSSBlockOutputStream > - > > Key: HADOOP-15607 > URL: https://issues.apache.org/jira/browse/HADOOP-15607 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.10.0, 2.9.1, 3.2.0, 3.1.1, 3.0.3 >Reporter: wujinhu >Assignee: wujinhu >Priority: Major > Attachments: HADOOP-15607.001.patch > > > When I generated data with hive-tpcds tool, I got exception below: > 2018-07-16 14:50:43,680 INFO mapreduce.Job: Task Id : > attempt_1531723399698_0001_m_52_0, Status : FAILED > Error: com.aliyun.oss.OSSException: The list of parts was not in ascending > order. Parts list must specified in order by part number. > [ErrorCode]: InvalidPartOrder > [RequestId]: 5B4C40425FCC208D79D1EAF5 > [HostId]: 100.103.0.137 > [ResponseError]: > > > InvalidPartOrder > The list of parts was not in ascending order. Parts list must > specified in order by part number. > 5B4C40425FCC208D79D1EAF5 > 100.103.0.137 > current PartNumber 3, you given part number 3is not in > ascending order > > at > com.aliyun.oss.common.utils.ExceptionFactory.createOSSException(ExceptionFactory.java:99) > at > com.aliyun.oss.internal.OSSErrorResponseHandler.handle(OSSErrorResponseHandler.java:69) > at > com.aliyun.oss.common.comm.ServiceClient.handleResponse(ServiceClient.java:248) > at > com.aliyun.oss.common.comm.ServiceClient.sendRequestImpl(ServiceClient.java:130) > at > com.aliyun.oss.common.comm.ServiceClient.sendRequest(ServiceClient.java:68) > at com.aliyun.oss.internal.OSSOperation.send(OSSOperation.java:94) > at com.aliyun.oss.internal.OSSOperation.doOperation(OSSOperation.java:149) > at com.aliyun.oss.internal.OSSOperation.doOperation(OSSOperation.java:113) > at > com.aliyun.oss.internal.OSSMultipartOperation.completeMultipartUpload(OSSMultipartOperation.java:185) > at com.aliyun.oss.OSSClient.completeMultipartUpload(OSSClient.java:790) > at > org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystemStore.completeMultipartUpload(AliyunOSSFileSystemStore.java:643) > at > org.apache.hadoop.fs.aliyun.oss.AliyunOSSBlockOutputStream.close(AliyunOSSBlockOutputStream.java:120) > at > org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72) > at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101) > at > org.apache.hadoop.mapreduce.lib.output.TextOutputFormat$LineRecordWriter.close(TextOutputFormat.java:106) > at > org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.close(MultipleOutputs.java:574) > at org.notmysock.tpcds.GenTable$DSDGen.cleanup(GenTable.java:169) > at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:149) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1686) > > I reviewed code below, > {code:java} > blockId {code} > has thread synchronization problem > {code:java} > // code placeholder > private void uploadCurrentPart() throws IOException { > blockFiles.add(blockFile); > blockStream.flush(); > blockStream.close(); > if (blockId == 0) { > uploadId = store.getUploadId(key); > } > ListenableFuture partETagFuture = > executorService.submit(() -> { > PartETag partETag = store.uploadPart(blockFile, key, uploadId, > blockId + 1); > return partETag; > }); > partETagsFutures.add(partETagFuture); > blockFile = newBlockFile(); > blockId++; > blockStream = new BufferedOutputStream(new FileOutputStream(blockFile)); > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-un