[jira] [Updated] (HADOOP-16927) Update hadoop-thirdparty dependency version to 1.0.0
[ https://issues.apache.org/jira/browse/HADOOP-16927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Akira Ajisaka updated HADOOP-16927:
-----------------------------------
    Labels: release-blocker  (was: )

> Update hadoop-thirdparty dependency version to 1.0.0
>
>                 Key: HADOOP-16927
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16927
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Vinayakumar B
>            Assignee: Vinayakumar B
>            Priority: Major
>              Labels: release-blocker

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] hadoop-yetus commented on issue #1890: HADOOP-16854 Fix to prevent OutOfMemoryException and Make the threadpool and bytebuffer pool common across all AbfsOutputStream instances
hadoop-yetus commented on issue #1890: HADOOP-16854 Fix to prevent OutOfMemoryException and Make the threadpool and bytebuffer pool common across all AbfsOutputStream instances
URL: https://github.com/apache/hadoop/pull/1890#issuecomment-600997678

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|:--------|:--------|
| +0 :ok: | reexec | 0m 0s | Docker mode activated. |
| -1 :x: | patch | 0m 5s | https://github.com/apache/hadoop/pull/1890 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. |

| Subsystem | Report/Notes |
|----------:|:-------------|
| GITHUB PR | https://github.com/apache/hadoop/pull/1890 |
| Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1890/9/console |
| versions | git=2.17.1 |
| Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |

This message was automatically generated.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
With regards, Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] brfrn169 commented on issue #1889: HDFS-15215 The Timestamp for longest write/read lock held log is wrong
brfrn169 commented on issue #1889: HDFS-15215 The Timestamp for longest write/read lock held log is wrong
URL: https://github.com/apache/hadoop/pull/1889#issuecomment-600962616

Ping @xkrogen Could you please take a look at it when you get a chance?
[jira] [Updated] (HADOOP-16927) Update hadoop-thirdparty dependency version to 1.0.0
[ https://issues.apache.org/jira/browse/HADOOP-16927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ayush Saxena updated HADOOP-16927:
----------------------------------
    Target Version/s: 3.3.0
[jira] [Commented] (HADOOP-16927) Update hadoop-thirdparty dependency version to 1.0.0
[ https://issues.apache.org/jira/browse/HADOOP-16927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17062244#comment-17062244 ]
Ayush Saxena commented on HADOOP-16927:
---------------------------------------
Should be 3.3.0, I have updated it here.
[jira] [Updated] (HADOOP-16836) Bug in widely-used helper function caused valid configuration value to fail on multiple tests, causing build failure
[ https://issues.apache.org/jira/browse/HADOOP-16836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ctest updated HADOOP-16836:
---------------------------
    Description:
{code:java}
org.apache.hadoop.io.file.tfile.TestTFileStreams#testOneEntryMixedLengths1
org.apache.hadoop.io.file.tfile.TestTFileStreams#testOneEntryUnknownLength
org.apache.hadoop.io.file.tfile.TestTFileLzoCodecsStreams#testOneEntryMixedLengths1
org.apache.hadoop.io.file.tfile.TestTFileLzoCodecsStreams#testOneEntryUnknownLength{code}
The 4 actively-used tests above call the helper function `TestTFileStreams#writeRecords()` to write key-value (kv) pairs, then call `TestTFileByteArrays#readRecords()` to assert that the key and the value part (v) of these kv pairs match what was written. All values are hardcoded strings with a length of 6.
`readRecords()` uses `org.apache.hadoop.io.file.tfile.TFile.Reader.Scanner.Entry#getValueLength()` to get the full length of v for these kv pairs. But `getValueLength()` can only return v's full length when it is less than the value of the configuration parameter `tfile.io.chunk.size`; otherwise `readRecords()` throws an exception. So, *when `tfile.io.chunk.size` is set to a value less than 6, these 4 tests fail because of the exception from `readRecords()`, even though 6 is a valid value for `tfile.io.chunk.size`.*
The definition of `tfile.io.chunk.size` is "Value chunk size in bytes. Default to 1MB. Values of the length less than the chunk size is guaranteed to have known value length in read time (See also TFile.Reader.Scanner.Entry.isValueLengthKnown())".
*Fixes*
`readRecords()` should call `org.apache.hadoop.io.file.tfile.TFile.Reader.Scanner.Entry#getValue(byte[])` instead, which returns the correct full length of the value part regardless of whether the value's length is larger than `tfile.io.chunk.size`.
> Bug in widely-used helper function caused valid configuration value to fail
> on multiple tests, causing build failure
>
>                 Key: HADOOP-16836
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16836
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: common
>    Affects Versions: 3.3.0, 3.2.1
>            Reporter: Ctest
>            Priority: Blocker
>              Labels: configuration, easyfix, patch, test
>         Attachments: HADOOP-16836-000.patch, HADOOP-16836-000.patch
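For readers outside the TFile codebase, the failure mode can be sketched with a self-contained toy (the class and method names below are illustrative, not Hadoop's): a reader that only knows a value's length up front when the value fits within one chunk, versus one that reads the value fully and reports its true size.

```java
// Hypothetical sketch of the HADOOP-16836 failure mode (not Hadoop code).
// getValueLength() fails when the value spans multiple chunks, while
// getValue(byte[]) always reads the value and returns its true length.
public class ChunkedEntrySketch {
    private final byte[] value;
    private final int chunkSize; // stands in for tfile.io.chunk.size

    public ChunkedEntrySketch(byte[] value, int chunkSize) {
        this.value = value;
        this.chunkSize = chunkSize;
    }

    /** The length is only known up front when the value fits in one chunk. */
    public boolean isValueLengthKnown() {
        return value.length < chunkSize;
    }

    /** Mirrors Entry#getValueLength(): fails when the length is not known. */
    public int getValueLength() {
        if (!isValueLengthKnown()) {
            throw new IllegalStateException(
                "value length unknown: value spans multiple chunks");
        }
        return value.length;
    }

    /** Mirrors Entry#getValue(byte[]): reads fully, returns the true length. */
    public int getValue(byte[] buf) {
        System.arraycopy(value, 0, buf, 0, value.length);
        return value.length;
    }

    public static void main(String[] args) {
        // A 6-byte value with chunk size configured to 4 -- a valid setting.
        ChunkedEntrySketch entry = new ChunkedEntrySketch("value0".getBytes(), 4);
        try {
            entry.getValueLength();
            throw new AssertionError("expected an exception");
        } catch (IllegalStateException expected) {
            // The 4 failing tests hit this path via readRecords().
        }
        byte[] buf = new byte[64];
        System.out.println("full value length = " + entry.getValue(buf)); // 6
    }
}
```

Swapping the length probe for a full read, as the proposed fix does, makes the assertion independent of the configured chunk size.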
[jira] [Commented] (HADOOP-16836) Bug in widely-used helper function caused valid configuration value to fail on multiple tests, causing build failure
[ https://issues.apache.org/jira/browse/HADOOP-16836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17062165#comment-17062165 ]
Ctest commented on HADOOP-16836:
--------------------------------
I updated the description to make it more readable.
[jira] [Updated] (HADOOP-16836) Bug in widely-used helper function caused valid configuration value to fail on multiple tests, causing build failure
[ https://issues.apache.org/jira/browse/HADOOP-16836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ctest updated HADOOP-16836:
---------------------------
    Description:
{code:java}
org.apache.hadoop.io.file.tfile.TestTFileStreams#testOneEntryMixedLengths1
org.apache.hadoop.io.file.tfile.TestTFileStreams#testOneEntryUnknownLength
org.apache.hadoop.io.file.tfile.TestTFileLzoCodecsStreams#testOneEntryMixedLengths1
org.apache.hadoop.io.file.tfile.TestTFileLzoCodecsStreams#testOneEntryUnknownLength{code}
The 4 actively-used tests above call the helper function `TestTFileStreams#writeRecords()` to write key-value (kv) pairs, then call `TestTFileByteArrays#readRecords()` to assert that the key and the value part (v) of these kv pairs match what was written. All values are hardcoded strings with a length of 6.
`readRecords()` uses `org.apache.hadoop.io.file.tfile.TFile.Reader.Scanner.Entry#getValueLength()` to get the full length of v for these kv pairs. But `getValueLength()` can only return the full length of v when it is less than the value of the configuration parameter `tfile.io.chunk.size`; otherwise `readRecords()` throws an exception. So, when `tfile.io.chunk.size` is set to a value less than 6, these 4 tests fail because of the exception from `readRecords()`, even though 6 is a valid value for `tfile.io.chunk.size`.
The definition of `tfile.io.chunk.size` is "Value chunk size in bytes. Default to 1MB. Values of the length less than the chunk size is guaranteed to have known value length in read time (See also TFile.Reader.Scanner.Entry.isValueLengthKnown())".
*Fixes*
`readRecords()` should call `org.apache.hadoop.io.file.tfile.TFile.Reader.Scanner.Entry#getValue(byte[])` instead, which returns the correct full length of the value part regardless of whether the value's length is larger than `tfile.io.chunk.size`.
was:
Test helper function `org.apache.hadoop.io.file.tfile.TestTFileByteArrays#readRecords(org.apache.hadoop.fs.FileSystem, org.apache.hadoop.fs.Path, int, org.apache.hadoop.conf.Configuration)` (abbreviated as `readRecords()` below) is called in the 4 actively-used tests below:
{code:java}
org.apache.hadoop.io.file.tfile.TestTFileStreams#testOneEntryMixedLengths1
org.apache.hadoop.io.file.tfile.TestTFileStreams#testOneEntryUnknownLength
org.apache.hadoop.io.file.tfile.TestTFileLzoCodecsStreams#testOneEntryMixedLengths1
org.apache.hadoop.io.file.tfile.TestTFileLzoCodecsStreams#testOneEntryUnknownLength{code}
These tests first call `org.apache.hadoop.io.file.tfile.TestTFileStreams#writeRecords(int count, boolean knownKeyLength, boolean knownValueLength, boolean close)` to write key-value pair records into a `TFile` object, then call the helper function `readRecords()` to assert that the key part and the value part of the stored records match what was written previously. The value parts of the key-value pairs in these tests are hardcoded strings with a length of 6.
Assertions in `readRecords()` depend directly on the value of the configuration parameter `tfile.io.chunk.size`. The formal definition of `tfile.io.chunk.size` is "Value chunk size in bytes. Default to 1MB. Values of the length less than the chunk size is guaranteed to have known value length in read time (See also TFile.Reader.Scanner.Entry.isValueLengthKnown())". When `tfile.io.chunk.size` is configured to a value less than the length of the value part of the key-value pairs in these 4 tests, the tests fail, even though the configured value of `tfile.io.chunk.size` is semantically correct.
*Consequence*
At least 4 actively-used tests fail on correctly configured parameters. Any test that uses `readRecords()` can fail if the length of the hardcoded value part it tests is larger than the configured value of `tfile.io.chunk.size`. This causes the Hadoop-Common build to fail unless these tests are skipped.
*Root Cause*
`readRecords()` used `org.apache.hadoop.io.file.tfile.TFile.Reader.Scanner.Entry#getValueLength()` (abbreviated as `getValueLength()` below) to get the full length of the value part of each key-value pair. But `getValueLength()` can only return the full length of the value part when that length is less than `tfile.io.chunk.size`; otherwise `getValueLength()` throws an exception, causing `readRecords()` to fail, and thus failing the aforementioned 4 tests. This is because `getValueLength()` does not know the full length of the value part when the value part is larger than `tfile.io.chunk.size`.
*Fixes*
`readRecords()` should instead call `org.apache.hadoop.io.file.tfile.TFile.Reader.Scanner.Entry#getValue(byte[])` (abbreviated as `getValue()` below), which returns the correct full length of the value part regardless of whether the value's length is larger than `tfile.io.chunk.size`.

> Bug in widely-used helper function caused
[jira] [Commented] (HADOOP-16927) Update hadoop-thirdparty dependency version to 1.0.0
[ https://issues.apache.org/jira/browse/HADOOP-16927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17062045#comment-17062045 ]
Wei-Chiu Chuang commented on HADOOP-16927:
------------------------------------------
Target version?
[jira] [Assigned] (HADOOP-16927) Update hadoop-thirdparty dependency version to 1.0.0
[ https://issues.apache.org/jira/browse/HADOOP-16927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vinayakumar B reassigned HADOOP-16927:
--------------------------------------
    Assignee: Vinayakumar B
[jira] [Created] (HADOOP-16927) Update hadoop-thirdparty dependency version to 1.0.0
Vinayakumar B created HADOOP-16927:
-----------------------------------
             Summary: Update hadoop-thirdparty dependency version to 1.0.0
                 Key: HADOOP-16927
                 URL: https://issues.apache.org/jira/browse/HADOOP-16927
             Project: Hadoop Common
          Issue Type: Improvement
            Reporter: Vinayakumar B
[GitHub] [hadoop] hadoop-yetus commented on issue #1890: HADOOP-16854 Fix to prevent OutOfMemoryException and Make the threadpool and bytebuffer pool common across all AbfsOutputStream instances
hadoop-yetus commented on issue #1890: HADOOP-16854 Fix to prevent OutOfMemoryException and Make the threadpool and bytebuffer pool common across all AbfsOutputStream instances
URL: https://github.com/apache/hadoop/pull/1890#issuecomment-600821237

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|:--------|:--------|
| +0 :ok: | reexec | 0m 0s | Docker mode activated. |
| -1 :x: | patch | 0m 5s | https://github.com/apache/hadoop/pull/1890 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. |

| Subsystem | Report/Notes |
|----------:|:-------------|
| GITHUB PR | https://github.com/apache/hadoop/pull/1890 |
| Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1890/8/console |
| versions | git=2.17.1 |
| Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |

This message was automatically generated.
[GitHub] [hadoop] steveloughran commented on issue #1861: HADOOP-13230. Optionally retain directory markers
steveloughran commented on issue #1861: HADOOP-13230. Optionally retain directory markers
URL: https://github.com/apache/hadoop/pull/1861#issuecomment-600775261

HADOOP-13230. directory markers

* the LIST call asks for two objects when needEmptyDirectoryFlag = true, so it can distinguish "dir marker exists" from "dir marker exists + children"
* moved much of the prefix/object analysis into S3ListResult, where I intend to add some unit tests for the result parsing
* changed the enum for all innerGetFileStatus calls from ALL to FILES_AND_DIRECTORIES, as we no longer need to do *any* HEAD request on a marker; the list finds it after all. There may be more risk of delayed consistency in listings.

Tests: one mocking test fails (as usual); also failures in ITestS3AFileOperationCost, ITestS3GuardOutOfBandOperations, ITestRestrictedReadAccess

```
[ERROR] Failures:
[ERROR] ITestS3AFileOperationCost.testCostOfGetFileStatusOnEmptyDir:161->Assert.assertEquals:645->Assert.failNotEquals:834->Assert.fail:88 Count of object_list_requests starting=1 current=2 diff=1: object_list_requests expected:<0> but was:<1>
[ERROR] ITestS3AFileOperationCost.testCostOfGetFileStatusOnEmptyDir:159->Assert.assertEquals:645->Assert.failNotEquals:834->Assert.fail:88 Count of object_metadata_requests starting=4 current=5 diff=1: object_metadata_requests expected:<2> but was:<1>
[ERROR] ITestS3AFileOperationCost.testCostOfGetFileStatusOnMissingFile:180->Assert.assertEquals:645->Assert.failNotEquals:834->Assert.fail:88 Count of object_metadata_requests starting=0 current=1 diff=1: object_metadata_requests expected:<2> but was:<1>
[ERROR] ITestS3AFileOperationCost.testCostOfGetFileStatusOnMissingFile:180->Assert.assertEquals:645->Assert.failNotEquals:834->Assert.fail:88 Count of object_metadata_requests starting=2 current=3 diff=1: object_metadata_requests expected:<2> but was:<1>
[ERROR] ITestS3AFileOperationCost.testCostOfGetFileStatusOnMissingSubPath:192->Assert.assertEquals:645->Assert.failNotEquals:834->Assert.fail:88 Count of object_metadata_requests starting=0 current=1 diff=1: object_metadata_requests expected:<2> but was:<1>
[ERROR] ITestS3AFileOperationCost.testCostOfGetFileStatusOnMissingSubPath:192->Assert.assertEquals:645->Assert.failNotEquals:834->Assert.fail:88 Count of object_metadata_requests starting=2 current=3 diff=1: object_metadata_requests expected:<2> but was:<1>
[ERROR] ITestS3AFileOperationCost.testCostOfGetFileStatusOnNonEmptyDir:215->Assert.assertEquals:645->Assert.failNotEquals:834->Assert.fail:88 Count of object_metadata_requests starting=5 current=6 diff=1: object_metadata_requests expected:<2> but was:<1>
[ERROR] ITestS3AFileOperationCost.testCreateCost:511->verifyOperationCount:140->Assert.assertEquals:645->Assert.failNotEquals:834->Assert.fail:88 Count of object_metadata_requests starting=2 current=3 diff=1: object_metadata_requests expected:<2> but was:<1>
[ERROR] ITestS3AFileOperationCost.testDirProbes:474 [LIST output is not considered empty] Expecting: to match 'is empty' predicate.
[ERROR] ITestRestrictedReadAccess.testNoReadAccess:304->checkDeleteOperations:637->accessDenied:680 Expected a java.nio.file.AccessDeniedException to be thrown, but got the result: : true
[ERROR] ITestRestrictedReadAccess.testNoReadAccess:298->checkBasicFileOperations:413->accessDeniedIf:697 Expected a java.nio.file.AccessDeniedException to be thrown, but got the result: : [Lorg.apache.hadoop.fs.FileStatus;@1a902257
[ERROR] Errors:
[ERROR] ITestS3GuardOutOfBandOperations.testListingDelete:988->expectExceptionWhenReadingOpenFileAPI:1055 » Execution
```

The access ones are failing because LIST is working whereas a HEAD would fail if the caller doesn't have read access to the object. OOB ops may be from me setting up a new bucket. Cost ones: well, our costs have come down. literally
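The two-object LIST trick mentioned above can be sketched in isolation (the names below are illustrative, not the actual S3A implementation): listing at most two keys under the directory prefix is enough to distinguish "marker only" from "marker plus children", with no HEAD request on the marker object.

```java
// Hypothetical sketch (not S3A code) of why a LIST capped at two keys
// can classify a directory: the list either contains nothing, only the
// marker key ("dir/"), or the marker/children -- which is all we need.
import java.util.Arrays;
import java.util.List;

public class DirMarkerProbe {
    public enum DirStatus { MISSING, EMPTY_DIR, NON_EMPTY_DIR }

    /**
     * keysUnderPrefix simulates the first page of an S3 LIST issued with
     * max-keys=2 for the given prefix; a directory marker object has a
     * key equal to the prefix itself.
     */
    public static DirStatus probe(String prefix, List<String> keysUnderPrefix) {
        if (keysUnderPrefix.isEmpty()) {
            return DirStatus.MISSING;          // nothing under the prefix
        }
        boolean markerOnly = keysUnderPrefix.size() == 1
            && keysUnderPrefix.get(0).equals(prefix);
        return markerOnly ? DirStatus.EMPTY_DIR : DirStatus.NON_EMPTY_DIR;
    }

    public static void main(String[] args) {
        // marker only -> empty directory
        System.out.println(probe("dir/", Arrays.asList("dir/")));
        // marker plus a child -> non-empty directory
        System.out.println(probe("dir/", Arrays.asList("dir/", "dir/file1")));
        // nothing under the prefix -> no such directory
        System.out.println(probe("dir/", Arrays.<String>asList()));
    }
}
```

This also illustrates the access-test failures quoted above: a LIST on the prefix succeeds even when the caller cannot HEAD (read) the marker object itself.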
[GitHub] [hadoop] jojochuang commented on issue #1879: HDFS-15208. Supress bogus AbstractWadlGeneratorGrammarGenerator in KMS stderr in hdfs
jojochuang commented on issue #1879: HDFS-15208. Supress bogus AbstractWadlGeneratorGrammarGenerator in KMS stderr in hdfs
URL: https://github.com/apache/hadoop/pull/1879#issuecomment-600746943

Thanks. I'll cherrypick this to lower branches.
[jira] [Commented] (HADOOP-16920) ABFS: Make list page size configurable
[ https://issues.apache.org/jira/browse/HADOOP-16920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17061772#comment-17061772 ]
Hudson commented on HADOOP-16920:
---------------------------------
FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #18064 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/18064/])
HADOOP-16920 ABFS: Make list page size configurable. (github: rev 6ce5f8734f1864a2d628b23479cf3f6621b2fcb4)
* (edit) hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/constants/FileSystemConfigurations.java
* (edit) hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/AzureBlobFileSystemStore.java
* (edit) hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/AbfsConfiguration.java
* (edit) hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/constants/ConfigurationKeys.java
* (edit) hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAbfsClient.java

> ABFS: Make list page size configurable
>
>                 Key: HADOOP-16920
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16920
>             Project: Hadoop Common
>          Issue Type: Sub-task
>            Reporter: Bilahari T H
>            Assignee: Bilahari T H
>            Priority: Minor
>
> Make list page size configurable
[GitHub] [hadoop] steveloughran commented on issue #1893: HADOOP-16920 ABFS: Make list page size configurable
steveloughran commented on issue #1893: HADOOP-16920 ABFS: Make list page size configurable URL: https://github.com/apache/hadoop/pull/1893#issuecomment-600646846 Thanks +1
[GitHub] [hadoop] steveloughran merged pull request #1893: HADOOP-16920 ABFS: Make list page size configurable
steveloughran merged pull request #1893: HADOOP-16920 ABFS: Make list page size configurable URL: https://github.com/apache/hadoop/pull/1893
[GitHub] [hadoop] steveloughran commented on a change in pull request #1890: HADOOP-16854 Fix to prevent OutOfMemoryException and Make the threadpool and bytebuffer pool common across all AbfsOutputSt
steveloughran commented on a change in pull request #1890: HADOOP-16854 Fix to prevent OutOfMemoryException and Make the threadpool and bytebuffer pool common across all AbfsOutputStream instances
URL: https://github.com/apache/hadoop/pull/1890#discussion_r394372894

## File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsOutputStream.java
## @@ -63,20 +70,53 @@

   private final int bufferSize;
   private byte[] buffer;
   private int bufferIndex;
-  private final int maxConcurrentRequestCount;
+
+  private static int maxConcurrentRequestcount;
+  private static int maxBufferCount;

   private ConcurrentLinkedDeque writeOperations;
-  private final ThreadPoolExecutor threadExecutor;
-  private final ExecutorCompletionService completionService;
+  private static final Object INIT_LOCK = new Object();
+  private static ThreadPoolExecutor threadExecutor;
+  private static ExecutorCompletionService completionService;
+
+  private static final int ONE_MB = 1024 * 1024;
+  private static final int HUNDRED_MB = 100 * ONE_MB;
+  private static final int MIN_MEMORY_THRESHOLD = HUNDRED_MB;

   /**
    * Queue storing buffers with the size of the Azure block ready for
    * reuse. The pool allows reusing the blocks instead of allocating new
    * blocks. After the data is sent to the service, the buffer is returned
    * back to the queue
    */
-  private final ElasticByteBufferPool byteBufferPool
-      = new ElasticByteBufferPool();
+  private static final ElasticByteBufferPool BYTE_BUFFER_POOL
+      = new ElasticByteBufferPool();
+  private static AtomicInteger buffersToBeReturned = new AtomicInteger(0);
+
+  static {
+    if (threadExecutor == null) {
+      synchronized (INIT_LOCK) {
+        if (threadExecutor == null) {
+          int availableProcessors = Runtime.getRuntime().availableProcessors();
+          maxConcurrentRequestcount = 4 * availableProcessors;
+          maxBufferCount = maxConcurrentRequestcount + availableProcessors + 1;

Review comment: I don't like the hard coded assumptions about # of CPUs and amount of space which can be used for buffering.
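Steve's objection could be addressed by funneling the sizing through an explicit override instead of baking in 4 * cores. A minimal, self-contained sketch; the property name below is hypothetical (a real patch would presumably read a key via AbfsConfiguration):

```java
// Sketch: derive pool sizes from the processor count but honor an explicit
// override, instead of hard-coding 4 * cores as in the hunk under review.
public class PoolSizing {
  static int maxConcurrentRequests() {
    int cores = Runtime.getRuntime().availableProcessors();
    // hypothetical override knob, NOT a real Hadoop/ABFS configuration key
    int override = Integer.getInteger("abfs.write.max.concurrent.requests", 0);
    return override > 0 ? override : 4 * cores;
  }

  static int maxBufferCount(int maxConcurrentRequests) {
    // one spare buffer per core plus one, mirroring the reviewed code
    return maxConcurrentRequests + Runtime.getRuntime().availableProcessors() + 1;
  }

  public static void main(String[] args) {
    int requests = maxConcurrentRequests();
    System.out.println(requests + " requests, " + maxBufferCount(requests) + " buffers");
  }
}
```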
[GitHub] [hadoop] steveloughran commented on a change in pull request #1890: HADOOP-16854 Fix to prevent OutOfMemoryException and Make the threadpool and bytebuffer pool common across all AbfsOutputSt
steveloughran commented on a change in pull request #1890: HADOOP-16854 Fix to prevent OutOfMemoryException and Make the threadpool and bytebuffer pool common across all AbfsOutputStream instances
URL: https://github.com/apache/hadoop/pull/1890#discussion_r394372525

## File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsOutputStream.java
## @@ -63,20 +70,53 @@ (same hunk as r394372894, continuing into the static initializer)

+          ThreadFactory deamonThreadFactory = new ThreadFactory() {
+            @Override
+            public Thread newThread(Runnable runnable) {
+              Thread deamonThread = Executors.defaultThreadFactory()
+                  .newThread(runnable);
+              deamonThread.setDaemon(true);
+              return deamonThread;
+            }
+          };
+          threadExecutor = new ThreadPoolExecutor(maxConcurrentRequestcount,

Review comment: use HadoopExecutors for executors if possible
[GitHub] [hadoop] steveloughran commented on a change in pull request #1890: HADOOP-16854 Fix to prevent OutOfMemoryException and Make the threadpool and bytebuffer pool common across all AbfsOutputSt
steveloughran commented on a change in pull request #1890: HADOOP-16854 Fix to prevent OutOfMemoryException and Make the threadpool and bytebuffer pool common across all AbfsOutputStream instances
URL: https://github.com/apache/hadoop/pull/1890#discussion_r394373334

## File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsOutputStream.java
## @@ -63,20 +70,53 @@ (same hunk as r394372894)

-  private final ElasticByteBufferPool byteBufferPool
-      = new ElasticByteBufferPool();
+  private static final ElasticByteBufferPool BYTE_BUFFER_POOL

Review comment: ElasticByteBufferPool is trouble because it never frees cached buffers
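Steve's concern is that ElasticByteBufferPool retains returned buffers indefinitely. One alternative shape is a pool with a hard cap on how many idle buffers it keeps, so anything released beyond the cap becomes garbage-collectable. This is an illustrative sketch, not Hadoop's ElasticByteBufferPool API:

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Bounded pool sketch: at most maxCached idle buffers are ever retained.
public class BoundedBufferPool {
  private final BlockingQueue<ByteBuffer> idle;
  private final int bufferSize;

  public BoundedBufferPool(int maxCached, int bufferSize) {
    this.idle = new ArrayBlockingQueue<>(maxCached);
    this.bufferSize = bufferSize;
  }

  public ByteBuffer get() {
    ByteBuffer b = idle.poll();              // reuse an idle buffer if any
    return b != null ? b : ByteBuffer.allocate(bufferSize);
  }

  public void release(ByteBuffer b) {
    b.clear();
    idle.offer(b);                           // queue full? drop it; GC reclaims
  }

  public static void main(String[] args) {
    BoundedBufferPool pool = new BoundedBufferPool(2, 8 * 1024 * 1024);
    ByteBuffer a = pool.get();
    pool.release(a);
    System.out.println(pool.get() == a);     // reused, not reallocated
  }
}
```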
[GitHub] [hadoop] bilaharith commented on a change in pull request #1890: HADOOP-16854 Fix to prevent OutOfMemoryException and Make the threadpool and bytebuffer pool common across all AbfsOutputStrea
bilaharith commented on a change in pull request #1890: HADOOP-16854 Fix to prevent OutOfMemoryException and Make the threadpool and bytebuffer pool common across all AbfsOutputStream instances
URL: https://github.com/apache/hadoop/pull/1890#discussion_r394372966

## File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsOutputStream.java
## @@ -63,20 +70,53 @@ (same hunk as r394372894)

+          int availableProcessors = Runtime.getRuntime().availableProcessors();

Review comment: 4 * core is the existing calculation. Making the same configurable in the next iteration.
[GitHub] [hadoop] steveloughran commented on a change in pull request #1890: HADOOP-16854 Fix to prevent OutOfMemoryException and Make the threadpool and bytebuffer pool common across all AbfsOutputSt
steveloughran commented on a change in pull request #1890: HADOOP-16854 Fix to prevent OutOfMemoryException and Make the threadpool and bytebuffer pool common across all AbfsOutputStream instances
URL: https://github.com/apache/hadoop/pull/1890#discussion_r394371435

## File path: hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemFlush.java
## @@ -207,6 +210,70 @@

     assertEquals((long) TEST_BUFFER_SIZE * FLUSH_TIMES, fileStatus.getLen());
   }

+  @Test
+  public void testWriteWithMultipleOutputStreamAtTheSameTime()
+      throws IOException, InterruptedException, ExecutionException {
+    AzureBlobFileSystem fs = getFileSystem();
+    String testFilePath = methodName.getMethodName();
+    Path[] testPaths = new Path[CONCURRENT_STREAM_OBJS_TEST_OBJ_COUNT];
+    createNStreamsAndWriteDifferentSizesConcurrently(fs, testFilePath,
+        CONCURRENT_STREAM_OBJS_TEST_OBJ_COUNT, testPaths);
+    assertSuccessfulWritesOnAllStreams(fs,
+        CONCURRENT_STREAM_OBJS_TEST_OBJ_COUNT, testPaths);
+  }
+
+  private void assertSuccessfulWritesOnAllStreams(final FileSystem fs,
+      final int numConcurrentObjects, final Path[] testPaths)
+      throws IOException {
+    for (int i = 0; i < numConcurrentObjects; i++) {
+      FileStatus fileStatus = fs.getFileStatus(testPaths[i]);
+      int numWritesMadeOnStream = i + 1;
+      long expectedLength = TEST_BUFFER_SIZE * numWritesMadeOnStream;
+      assertThat(fileStatus.getLen(), is(equalTo(expectedLength)));
+    }
+  }
+
+  private void createNStreamsAndWriteDifferentSizesConcurrently(
+      final FileSystem fs, final String testFilePath,
+      final int numConcurrentObjects, final Path[] testPaths)
+      throws ExecutionException, InterruptedException {
+    final byte[] b = new byte[TEST_BUFFER_SIZE];
+    new Random().nextBytes(b);
+    final ExecutorService es = Executors.newFixedThreadPool(40);
+    final List<Future<Void>> futureTasks = new ArrayList<>();
+    for (int i = 0; i < numConcurrentObjects; i++) {
+      Path testPath = new Path(testFilePath + i);
+      testPaths[i] = testPath;
+      int numWritesToBeDone = i + 1;
+      futureTasks.add(es.submit(() -> {
+        try (FSDataOutputStream stream = fs.create(testPath)) {
+          makeNWritesToStream(stream, numWritesToBeDone, b, es);
+        }
+        return null;
+      }));
+    }
+    for (Future<Void> futureTask : futureTasks) {
+      futureTask.get();
+    }
+    es.shutdownNow();
+  }
+
+  private void makeNWritesToStream(final FSDataOutputStream stream,
+      final int numWrites, final byte[] b, final ExecutorService es)
+      throws IOException, ExecutionException, InterruptedException {
+    final List<Future<Void>> futureTasks = new ArrayList<>();
+    for (int i = 0; i < numWrites; i++) {
+      futureTasks.add(es.submit(() -> {
+        stream.write(b);
+        return null;
+      }));
+    }
+    for (Future<Void> futureTask : futureTasks) {

Review comment: I'm sure there is a way to block for multiple futures. See also FutureIOSupport.awaitFuture
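On blocking for multiple futures: besides Hadoop's FutureIOSupport.awaitFuture, the JDK's ExecutorService.invokeAll already blocks until a whole batch of tasks has completed, which would replace the manual get() loop in the quoted test. A small self-contained sketch of the idea:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class AwaitAll {
  // Computes 1^2 + 2^2 + ... + n^2 by blocking on a batch of tasks at once.
  static int squareSum(int n) throws Exception {
    ExecutorService es = Executors.newFixedThreadPool(4);
    List<Callable<Integer>> tasks = new ArrayList<>();
    for (int i = 1; i <= n; i++) {
      final int k = i;
      tasks.add(() -> k * k);
    }
    int sum = 0;
    // invokeAll blocks until every submitted task has finished
    for (Future<Integer> f : es.invokeAll(tasks)) {
      sum += f.get();  // cannot block further; all tasks are already complete
    }
    es.shutdown();
    return sum;
  }

  public static void main(String[] args) throws Exception {
    System.out.println(squareSum(4));  // 1 + 4 + 9 + 16 = 30
  }
}
```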
[jira] [Commented] (HADOOP-16054) Update Dockerfile to use Bionic
[ https://issues.apache.org/jira/browse/HADOOP-16054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17061753#comment-17061753 ] Hudson commented on HADOOP-16054: - SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18063 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/18063/]) HADOOP-16054. Update Dockerfile to use Bionic. (github: rev 367833cf41753f5b1c996f9a2a9c1f20e2011173) * (edit) dev-support/docker/Dockerfile > Update Dockerfile to use Bionic > --- > > Key: HADOOP-16054 > URL: https://issues.apache.org/jira/browse/HADOOP-16054 > Project: Hadoop Common > Issue Type: Improvement > Components: build, test >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka >Priority: Major > Fix For: 3.3.0 > > > Ubuntu xenial goes EoL in April 2021. Let's upgrade until the date.
[GitHub] [hadoop] steveloughran commented on a change in pull request #1890: HADOOP-16854 Fix to prevent OutOfMemoryException and Make the threadpool and bytebuffer pool common across all AbfsOutputSt
steveloughran commented on a change in pull request #1890: HADOOP-16854 Fix to prevent OutOfMemoryException and Make the threadpool and bytebuffer pool common across all AbfsOutputStream instances
URL: https://github.com/apache/hadoop/pull/1890#discussion_r394358730

## File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsOutputStream.java
## @@ -63,20 +70,53 @@ (same hunk as r394372894)

+          int availableProcessors = Runtime.getRuntime().availableProcessors();

Review comment: this is making some big assumptions about exclusive access to CPUs. Why the specific choice of 4 * core?
[jira] [Updated] (HADOOP-16054) Update Dockerfile to use Bionic
[ https://issues.apache.org/jira/browse/HADOOP-16054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-16054: Fix Version/s: 3.3.0 Resolution: Fixed Status: Resolved (was: Patch Available) > Update Dockerfile to use Bionic > --- > > Key: HADOOP-16054 > URL: https://issues.apache.org/jira/browse/HADOOP-16054 > Project: Hadoop Common > Issue Type: Improvement > Components: build, test >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka >Priority: Major > Fix For: 3.3.0 > > > Ubuntu xenial goes EoL in April 2021. Let's upgrade until the date.
[GitHub] [hadoop] steveloughran merged pull request #1862: HADOOP-16054. Update Dockerfile to use Bionic
steveloughran merged pull request #1862: HADOOP-16054. Update Dockerfile to use Bionic URL: https://github.com/apache/hadoop/pull/1862
[GitHub] [hadoop] steveloughran merged pull request #1879: HDFS-15208. Supress bogus AbstractWadlGeneratorGrammarGenerator in KMS stderr in hdfs
steveloughran merged pull request #1879: HDFS-15208. Supress bogus AbstractWadlGeneratorGrammarGenerator in KMS stderr in hdfs URL: https://github.com/apache/hadoop/pull/1879
[GitHub] [hadoop] steveloughran commented on issue #1879: HDFS-15208. Supress bogus AbstractWadlGeneratorGrammarGenerator in KMS stderr in hdfs
steveloughran commented on issue #1879: HDFS-15208. Supress bogus AbstractWadlGeneratorGrammarGenerator in KMS stderr in hdfs URL: https://github.com/apache/hadoop/pull/1879#issuecomment-600621222 +1
[GitHub] [hadoop] steveloughran commented on a change in pull request #1858: HDFS-15168: ABFS enhancement to translate AAD Object to Linux idenities
steveloughran commented on a change in pull request #1858: HDFS-15168: ABFS enhancement to translate AAD Object to Linux idenities
URL: https://github.com/apache/hadoop/pull/1858#discussion_r394338079

## File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/utils/TextFileBasedIdentityHandler.java
## @@ -0,0 +1,192 @@

+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.fs.azurebfs.utils;
+
+import com.google.common.base.Preconditions;
+import com.google.common.base.Strings;
+import java.io.File;
+import java.io.IOException;
+import java.util.HashMap;
+import org.apache.commons.io.FileUtils;
+import org.apache.commons.io.LineIterator;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.COLON;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.EMPTY_STRING;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HASH;
+
+/**
+ * {@code TextFileBasedIdentityHandler} is a {@link IdentityHandler} implements
+ * translation operation which returns identity mapped to AAD identity by
+ * loading the mapping file from the configured location. Location of the
+ * mapping file should be configured in {@code core-site.xml}
+ *
+ * User identity file should be delimited by colon in below format.
+ * OBJ_ID:USER_NAME:USER_ID:GROUP_ID:SPI_NAME:APP_ID
+ *
+ * Example:
+ * a2b27aec-77bd-46dd-8c8c-39611a31:user1:11000:21000:spi-user1:abcf86e9-5a5b-49e2-a253-f5c9e2afd4ec
+ *
+ * Group identity file should be delimited by colon in below format.
+ * OBJ_ID:GROUP_NAME:GROUP_ID:SGP_NAME
+ *
+ * Example:
+ * 1d23024d-957c-4456-aac1-a57f9e2de914:group1:21000:sgp-group1
+ */
+public class TextFileBasedIdentityHandler implements IdentityHandler {
+  private static final Logger LOG = LoggerFactory.getLogger(TextFileBasedIdentityHandler.class);
+
+  /**
+   * Expected no of fields in the user mapping file
+   */
+  private static final int NO_OF_FIELDS_USER_MAPPING = 6;
+  /**
+   * Expected no of fields in the group mapping file
+   */
+  private static final int NO_OF_FIELDS_GROUP_MAPPING = 4;
+  /**
+   * Array index for the local username.
+   * Example:
+   * a2b27aec-77bd-46dd-8c8c-39611a31:user1:11000:21000:spi-user1:abcf86e9-5a5b-49e2-a253-f5c9e2afd4ec
+   */
+  private static final int ARRAY_INDEX_FOR_LOCAL_USER_NAME = 1;
+  /**
+   * Array index for the security group name
+   * Example:
+   * 1d23024d-957c-4456-aac1-a57f9e2de914:group1:21000:sgp-group1
+   */
+  private static final int ARRAY_INDEX_FOR_LOCAL_GROUP_NAME = 1;
+  /**
+   * Array index for the AAD Service Principal's Object ID
+   */
+  private static final int ARRAY_INDEX_FOR_AAD_SP_OBJECT_ID = 0;
+  /**
+   * Array index for the AAD Security Group's Object ID
+   */
+  private static final int ARRAY_INDEX_FOR_AAD_SG_OBJECT_ID = 0;
+
+  private String userMappingFileLocation;
+  private String groupMappingFileLocation;
+  private HashMap userMap;
+  private HashMap groupMap;
+
+  public TextFileBasedIdentityHandler(String userMappingFilePath, String groupMappingFilePath) {
+    Preconditions.checkArgument(!Strings.isNullOrEmpty(userMappingFilePath),
+        "Local User to Service Principal mapping filePath cannot by Null or Empty");
+    Preconditions.checkArgument(!Strings.isNullOrEmpty(groupMappingFilePath),
+        "Local Group to Security Group mapping filePath cannot by Null or Empty");
+    this.userMappingFileLocation = userMappingFilePath;
+    this.groupMappingFileLocation = groupMappingFilePath;
+    // Lazy Loading
+    this.userMap = new HashMap<>();
+    this.groupMap = new HashMap<>();
+  }
+
+  /**
+   * Perform lookup from Service Principal's Object ID to Local Username
+   * @param originalIdentity AAD object ID
+   * @return Local User name, if no name found or on exception, returns empty string.
+   */
+  public synchronized String lookupForLocalUserIdentity(String originalIdentity) throws IOException {
+    if (originalIdentity == null || originalIdentity.isEmpty()) {
+      return EMPTY_STRING;
+    }
+
+    if
[GitHub] [hadoop] steveloughran commented on a change in pull request #1858: HDFS-15168: ABFS enhancement to translate AAD Object to Linux idenities
steveloughran commented on a change in pull request #1858: HDFS-15168: ABFS enhancement to translate AAD Object to Linux idenities
URL: https://github.com/apache/hadoop/pull/1858#discussion_r394340912

## File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/utils/TextFileBasedIdentityHandler.java
## @@ -0,0 +1,192 @@ (same new-file hunk as r394338079)

+ * User identity file should be delimited by colon in below format.
+ *
+ * OBJ_ID:USER_NAME:USER_ID:GROUP_ID:SPI_NAME:APP_ID

Review comment: + add use of "#" as comment
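The suggested "#" comment support amounts to skipping comment and blank lines before splitting each row on the colon delimiter. A sketch of that parsing logic; `parseUserMapping` is a hypothetical helper, not the class's actual method:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MappingFileParser {
  // Parses OBJ_ID:USER_NAME:USER_ID:GROUP_ID:SPI_NAME:APP_ID rows into an
  // object-ID -> local-user-name map, ignoring '#' comments and blank lines.
  static Map<String, String> parseUserMapping(List<String> lines) {
    Map<String, String> map = new HashMap<>();
    for (String line : lines) {
      String trimmed = line.trim();
      if (trimmed.isEmpty() || trimmed.startsWith("#")) {
        continue;  // skip blank lines and comments, per the review suggestion
      }
      String[] fields = trimmed.split(":");
      if (fields.length != 6) {
        continue;  // malformed row: expected exactly six colon-delimited fields
      }
      map.put(fields[0], fields[1]);  // object ID -> local user name
    }
    return map;
  }

  public static void main(String[] args) {
    List<String> lines = Arrays.asList(
        "# users mapped from AAD object IDs",
        "a2b27aec-77bd-46dd-8c8c-39611a31:user1:11000:21000:spi-user1:abcf86e9-5a5b-49e2-a253-f5c9e2afd4ec");
    System.out.println(MappingFileParser.parseUserMapping(lines));  // one entry; comment line skipped
  }
}
```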
[GitHub] [hadoop] steveloughran commented on a change in pull request #1858: HDFS-15168: ABFS enhancement to translate AAD Object to Linux idenities
steveloughran commented on a change in pull request #1858: HDFS-15168: ABFS enhancement to translate AAD Object to Linux idenities
URL: https://github.com/apache/hadoop/pull/1858#discussion_r394333622

## File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/utils/IdentityHandler.java
## @@ -0,0 +1,42 @@

+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.fs.azurebfs.utils;
+
+import java.io.IOException;
+
+/**
+ * {@code IdentityHandler} defines the set of methods to support various
+ * identity lookup services.
+ */
+public interface IdentityHandler {
+
+  /**
+   * Perform lookup from Service Principal's Object ID to Username

Review comment: nit: add trailing "."
[GitHub] [hadoop] steveloughran commented on a change in pull request #1858: HDFS-15168: ABFS enhancement to translate AAD Object to Linux idenities
steveloughran commented on a change in pull request #1858: HDFS-15168: ABFS enhancement to translate AAD Object to Linux idenities
URL: https://github.com/apache/hadoop/pull/1858#discussion_r394337426

## File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/utils/TextFileBasedIdentityHandler.java
## @@ -0,0 +1,192 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.fs.azurebfs.utils;
+
+import com.google.common.base.Preconditions;
+import com.google.common.base.Strings;
+import java.io.File;
+import java.io.IOException;
+import java.util.HashMap;
+import org.apache.commons.io.FileUtils;
+import org.apache.commons.io.LineIterator;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.COLON;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.EMPTY_STRING;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HASH;
+
+
+/**
+ * {@code TextFileBasedIdentityHandler} is a {@link IdentityHandler} implements
+ * translation operation which returns identity mapped to AAD identity by
+ * loading the mapping file from the configured location. Location of the
+ * mapping file should be configured in {@code core-site.xml}
+ *
+ * User identity file should be delimited by colon in below format.
+ *
+ * OBJ_ID:USER_NAME:USER_ID:GROUP_ID:SPI_NAME:APP_ID
+ *
+ * Example:
+ * a2b27aec-77bd-46dd-8c8c-39611a31:user1:11000:21000:spi-user1:abcf86e9-5a5b-49e2-a253-f5c9e2afd4ec
+ *
+ * Group identity file should be delimited by colon in below format.
+ *
+ * OBJ_ID:GROUP_NAME:GROUP_ID:SGP_NAME
+ *
+ * Example:
+ * 1d23024d-957c-4456-aac1-a57f9e2de914:group1:21000:sgp-group1
+ */
+public class TextFileBasedIdentityHandler implements IdentityHandler {
+  private static final Logger LOG = LoggerFactory.getLogger(TextFileBasedIdentityHandler.class);
+
+  /**
+   * Expected no of fields in the user mapping file
+   */
+  private static final int NO_OF_FIELDS_USER_MAPPING = 6;
+  /**
+   * Expected no of fields in the group mapping file
+   */
+  private static final int NO_OF_FIELDS_GROUP_MAPPING = 4;
+  /**
+   * Array index for the local username.
+   * Example:
+   * a2b27aec-77bd-46dd-8c8c-39611a31:user1:11000:21000:spi-user1:abcf86e9-5a5b-49e2-a253-f5c9e2afd4ec
+   */
+  private static final int ARRAY_INDEX_FOR_LOCAL_USER_NAME = 1;
+  /**
+   * Array index for the security group name
+   * Example:
+   * 1d23024d-957c-4456-aac1-a57f9e2de914:group1:21000:sgp-group1
+   */
+  private static final int ARRAY_INDEX_FOR_LOCAL_GROUP_NAME = 1;
+  /**
+   * Array index for the AAD Service Principal's Object ID
+   */
+  private static final int ARRAY_INDEX_FOR_AAD_SP_OBJECT_ID = 0;
+  /**
+   * Array index for the AAD Security Group's Object ID
+   */
+  private static final int ARRAY_INDEX_FOR_AAD_SG_OBJECT_ID = 0;
+  private String userMappingFileLocation;
+  private String groupMappingFileLocation;
+  private HashMap userMap;
+  private HashMap groupMap;
+
+  public TextFileBasedIdentityHandler(String userMappingFilePath, String groupMappingFilePath) {
+    Preconditions.checkArgument(!Strings.isNullOrEmpty(userMappingFilePath),
+        "Local User to Service Principal mapping filePath cannot by Null or Empty");
+    Preconditions.checkArgument(!Strings.isNullOrEmpty(groupMappingFilePath),
+        "Local Group to Security Group mapping filePath cannot by Null or Empty");
+    this.userMappingFileLocation = userMappingFilePath;
+    this.groupMappingFileLocation = groupMappingFilePath;
+    //Lazy Loading
+    this.userMap = new HashMap<>();
+    this.groupMap = new HashMap<>();
+  }
+
+  /**
+   * Perform lookup from Service Principal's Object ID to Local Username
+   * @param originalIdentity AAD object ID
+   * @return Local User name, if no name found or on exception, returns empty string.
+   * */
+  public synchronized String lookupForLocalUserIdentity(String originalIdentity) throws IOException {
+    if (originalIdentity == null || originalIdentity.isEmpty()) {
+      return EMPTY_STRING;
+    }
+
+    if
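The colon-delimited mapping format described in the quoted javadoc can be exercised with a short parser sketch. This is illustrative only: `parseUserName` is a hypothetical helper, not part of the patch; field positions follow the constants in the quoted class (object ID at index 0, local username at index 1, six fields per user line, `#` marking comment lines).

```java
// Sketch only: hypothetical parseUserName helper for the user mapping
// format quoted above. Field positions mirror the class constants:
// object ID at index 0, local username at index 1, six fields per line.
public class UserMappingLineSketch {
  private static final int NO_OF_FIELDS_USER_MAPPING = 6;
  private static final int ARRAY_INDEX_FOR_LOCAL_USER_NAME = 1;
  private static final String HASH = "#";
  private static final String COLON = ":";

  // Returns the local username, or "" for blank, commented-out,
  // or malformed lines (matching the class's empty-string contract).
  public static String parseUserName(String line) {
    if (line == null || line.trim().isEmpty() || line.startsWith(HASH)) {
      return "";
    }
    String[] fields = line.split(COLON);
    if (fields.length != NO_OF_FIELDS_USER_MAPPING) {
      return "";
    }
    return fields[ARRAY_INDEX_FOR_LOCAL_USER_NAME];
  }

  public static void main(String[] args) {
    String line = "a2b27aec-77bd-46dd-8c8c-39611a31:user1:11000:21000:"
        + "spi-user1:abcf86e9-5a5b-49e2-a253-f5c9e2afd4ec";
    System.out.println(parseUserName(line)); // user1
  }
}
```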
[GitHub] [hadoop] steveloughran commented on a change in pull request #1858: HDFS-15168: ABFS enhancement to translate AAD Object to Linux idenities
steveloughran commented on a change in pull request #1858: HDFS-15168: ABFS enhancement to translate AAD Object to Linux idenities
URL: https://github.com/apache/hadoop/pull/1858#discussion_r394339880

## File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/utils/TextFileBasedIdentityHandler.java
## @@ -0,0 +1,192 @@
(Quoted diff identical to the excerpt in comment r394337426 above; the review comment itself is truncated in the archive.)
[GitHub] [hadoop] steveloughran commented on a change in pull request #1858: HDFS-15168: ABFS enhancement to translate AAD Object to Linux idenities
steveloughran commented on a change in pull request #1858: HDFS-15168: ABFS enhancement to translate AAD Object to Linux idenities
URL: https://github.com/apache/hadoop/pull/1858#discussion_r394334105

## File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/utils/TextFileBasedIdentityHandler.java
## @@ -0,0 +1,192 @@
(ASF license header elided; identical to the excerpt above.)
+package org.apache.hadoop.fs.azurebfs.utils;
+
+import com.google.common.base.Preconditions;

Review comment: see the other comments about import ordering

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services
-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] steveloughran commented on a change in pull request #1858: HDFS-15168: ABFS enhancement to translate AAD Object to Linux idenities
steveloughran commented on a change in pull request #1858: HDFS-15168: ABFS enhancement to translate AAD Object to Linux idenities
URL: https://github.com/apache/hadoop/pull/1858#discussion_r394339067

## File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/utils/TextFileBasedIdentityHandler.java
## @@ -0,0 +1,192 @@
(Quoted diff identical to the excerpt in comment r394337426 above; the review comment itself is truncated in the archive.)
[GitHub] [hadoop] steveloughran commented on a change in pull request #1858: HDFS-15168: ABFS enhancement to translate AAD Object to Linux idenities
steveloughran commented on a change in pull request #1858: HDFS-15168: ABFS enhancement to translate AAD Object to Linux idenities
URL: https://github.com/apache/hadoop/pull/1858#discussion_r394337251

## File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/utils/TextFileBasedIdentityHandler.java
## @@ -0,0 +1,192 @@
(Quoted diff identical to the excerpt in comment r394337426 above, ending at the null/empty guard in lookupForLocalUserIdentity:)
+    if (originalIdentity == null || originalIdentity.isEmpty()) {

Review comment: or use `Strings.isNullOrEmpty()`
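The reviewer's suggestion replaces the two-clause guard with Guava's `Strings.isNullOrEmpty()`. Its semantics are exactly the one-liner below, reimplemented locally here so the sketch carries no Guava dependency:

```java
// Sketch: Guava's Strings.isNullOrEmpty() collapses the manual
// null/empty check. Behaves like com.google.common.base.Strings.
public class NullOrEmptySketch {
  public static boolean isNullOrEmpty(String s) {
    return s == null || s.isEmpty();
  }

  public static void main(String[] args) {
    // The guard in lookupForLocalUserIdentity would then read:
    //   if (Strings.isNullOrEmpty(originalIdentity)) { return EMPTY_STRING; }
    System.out.println(isNullOrEmpty(null));    // true
    System.out.println(isNullOrEmpty(""));      // true
    System.out.println(isNullOrEmpty("user1")); // false
  }
}
```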
[GitHub] [hadoop] steveloughran commented on a change in pull request #1858: HDFS-15168: ABFS enhancement to translate AAD Object to Linux idenities
steveloughran commented on a change in pull request #1858: HDFS-15168: ABFS enhancement to translate AAD Object to Linux idenities
URL: https://github.com/apache/hadoop/pull/1858#discussion_r394334417

## File path: hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/services/TestTextFileBasedIdentityHandler.java
## @@ -0,0 +1,141 @@
(ASF license header elided; identical to the excerpt above.)
+
+package org.apache.hadoop.fs.azurebfs.services;
+
+import java.io.File;

Review comment: see the other comments about import ordering
[GitHub] [hadoop] steveloughran commented on a change in pull request #1858: HDFS-15168: ABFS enhancement to translate AAD Object to Linux idenities
steveloughran commented on a change in pull request #1858: HDFS-15168: ABFS enhancement to translate AAD Object to Linux idenities
URL: https://github.com/apache/hadoop/pull/1858#discussion_r394333793

## File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/utils/IdentityHandler.java
## @@ -0,0 +1,42 @@
(ASF license header elided; identical to the excerpt above.)
+package org.apache.hadoop.fs.azurebfs.utils;
+
+import java.io.IOException;
+
+
+/**
+ * {@code IdentityHandler} defines the set of methods to support various
+ * identity lookup services.
+ */
+public interface IdentityHandler {
+
+  /**
+   * Perform lookup from Service Principal's Object ID to Username
+   * @param originalIdentity AAD object ID
+   * @return User name, if no name found returns empty string.
+   * */
+  String lookupForLocalUserIdentity(String originalIdentity) throws IOException;
+
+  /**
+   * Perform lookup from Security Group's Object ID to Security Group name

Review comment: nit: add trailing "."
[GitHub] [hadoop] steveloughran commented on a change in pull request #1858: HDFS-15168: ABFS enhancement to translate AAD Object to Linux idenities
steveloughran commented on a change in pull request #1858: HDFS-15168: ABFS enhancement to translate AAD Object to Linux idenities
URL: https://github.com/apache/hadoop/pull/1858#discussion_r394336524

## File path: hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/services/TestTextFileBasedIdentityHandler.java
## @@ -0,0 +1,141 @@
(ASF license header elided; identical to the excerpt above.)
+
+package org.apache.hadoop.fs.azurebfs.services;
+
+import java.io.File;
+import java.io.IOException;
+import java.nio.charset.Charset;
+import org.apache.commons.io.FileUtils;
+import org.apache.hadoop.fs.azurebfs.utils.TextFileBasedIdentityHandler;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.ClassRule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+
+
+public class TestTextFileBasedIdentityHandler {
+
+  @ClassRule
+  public static TemporaryFolder tempDir = new TemporaryFolder();
+  private static File userMappingFile = null;
+  private static File groupMappingFile = null;
+  private static final String NEW_LINE = "\n";
+  private static String testUserDataLine1 =
+      "a2b27aec-77bd-46dd-8c8c-39611a31:user1:11000:21000:spi-user1:abcf86e9-5a5b-49e2-a253-f5c9e2afd4ec"
+      + NEW_LINE;
+  private static String testUserDataLine2 =
+      "#i2j27aec-77bd-46dd-8c8c-39611a31:user2:41000:21000:spi-user2:mnof86e9-5a5b-49e2-a253-f5c9e2afd4ec"
+      + NEW_LINE;
+  private static String testUserDataLine3 =
+      "c2d27aec-77bd-46dd-8c8c-39611a31:user2:21000:21000:spi-user2:deff86e9-5a5b-49e2-a253-f5c9e2afd4ec"
+      + NEW_LINE;
+  private static String testUserDataLine4 = "e2f27aec-77bd-46dd-8c8c-39611a31c" + NEW_LINE;
+  private static String testUserDataLine5 =
+      "g2h27aec-77bd-46dd-8c8c-39611a31:user4:41000:21000:spi-user4:jklf86e9-5a5b-49e2-a253-f5c9e2afd4ec"
+      + NEW_LINE;
+  private static String testUserDataLine6 = " " + NEW_LINE;
+  private static String testUserDataLine7 =
+      "i2j27aec-77bd-46dd-8c8c-39611a31:user5:41000:21000:spi-user5:mknf86e9-5a5b-49e2-a253-f5c9e2afd4ec"
+      + NEW_LINE;
+
+  private static String testGroupDataLine1 = "1d23024d-957c-4456-aac1-a57f9e2de914:group1:21000:sgp-group1" + NEW_LINE;
+  private static String testGroupDataLine2 = "3d43024d-957c-4456-aac1-a57f9e2de914:group2:21000:sgp-group2" + NEW_LINE;
+  private static String testGroupDataLine3 = "5d63024d-957c-4456-aac1-a57f9e2de914" + NEW_LINE;
+  private static String testGroupDataLine4 = " " + NEW_LINE;
+  private static String testGroupDataLine5 = "7d83024d-957c-4456-aac1-a57f9e2de914:group4:21000:sgp-group4" + NEW_LINE;
+
+  @BeforeClass
+  public static void init() throws IOException {
+    userMappingFile = tempDir.newFile("user-mapping.conf");
+    groupMappingFile = tempDir.newFile("group-mapping.conf");
+
+    //Stage data for user mapping
+    FileUtils.writeStringToFile(userMappingFile, testUserDataLine1, Charset.forName("UTF-8"), true);
+    FileUtils.writeStringToFile(userMappingFile, testUserDataLine2, Charset.forName("UTF-8"), true);
+    FileUtils.writeStringToFile(userMappingFile, testUserDataLine3, Charset.forName("UTF-8"), true);
+    FileUtils.writeStringToFile(userMappingFile, testUserDataLine4, Charset.forName("UTF-8"), true);
+    FileUtils.writeStringToFile(userMappingFile, testUserDataLine5, Charset.forName("UTF-8"), true);
+    FileUtils.writeStringToFile(userMappingFile, testUserDataLine6, Charset.forName("UTF-8"), true);
+    FileUtils.writeStringToFile(userMappingFile, testUserDataLine7, Charset.forName("UTF-8"), true);
+    FileUtils.writeStringToFile(userMappingFile, NEW_LINE, Charset.forName("UTF-8"), true);
+
+    //Stage data for group mapping
+    FileUtils.writeStringToFile(groupMappingFile, testGroupDataLine1, Charset.forName("UTF-8"), true);
+    FileUtils.writeStringToFile(groupMappingFile, testGroupDataLine2, Charset.forName("UTF-8"), true);
+    FileUtils.writeStringToFile(groupMappingFile, testGroupDataLine3, Charset.forName("UTF-8"), true);
+    FileUtils.writeStringToFile(groupMappingFile, testGroupDataLine4, Charset.forName("UTF-8"), true);
+
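An aside on the staged test data above (not part of the reviewed patch): each `writeStringToFile` call looks the charset up by name with `Charset.forName("UTF-8")`. The `java.nio.charset.StandardCharsets.UTF_8` constant expresses the same thing without a by-name lookup, and commons-io's `writeStringToFile` overloads accept it directly. A stdlib-only sketch of the same UTF-8 write-and-read round trip:

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Sketch: UTF-8 file round trip using the StandardCharsets.UTF_8
// constant in place of Charset.forName("UTF-8").
public class Utf8ConstantSketch {
  public static String roundTrip(String s) throws IOException {
    File f = File.createTempFile("user-mapping", ".conf");
    f.deleteOnExit();
    // Write the staged line as UTF-8 bytes, then read it back.
    Files.write(f.toPath(), s.getBytes(StandardCharsets.UTF_8));
    return new String(Files.readAllBytes(f.toPath()), StandardCharsets.UTF_8);
  }

  public static void main(String[] args) throws IOException {
    String line = "1d23024d-957c-4456-aac1-a57f9e2de914:group1:21000:sgp-group1\n";
    System.out.print(roundTrip(line)); // prints the line back unchanged
  }
}
```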
[GitHub] [hadoop] steveloughran commented on a change in pull request #1858: HDFS-15168: ABFS enhancement to translate AAD Object to Linux idenities
steveloughran commented on a change in pull request #1858: HDFS-15168: ABFS enhancement to translate AAD Object to Linux idenities
URL: https://github.com/apache/hadoop/pull/1858#discussion_r394339436

## File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/utils/TextFileBasedIdentityHandler.java
## @@ -0,0 +1,192 @@
(Quoted diff identical to the excerpt in comment r394337426 above, ending at:)
+  /**
+   * Perform lookup from Service Principal's Object ID to Local Username

Review comment: nit "."
[jira] [Commented] (HADOOP-16858) S3Guard fsck: Add option to remove orphaned entries
[ https://issues.apache.org/jira/browse/HADOOP-16858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17061661#comment-17061661 ] Hudson commented on HADOOP-16858: - SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18061 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/18061/]) HADOOP-16858. S3Guard fsck: Add option to remove orphaned entries (github: rev c91ff8c18ffc070eeef22afeb2e519b184398e89) * (edit) hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md * (edit) hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/ITestS3GuardToolDynamoDB.java * (edit) hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardFsckViolationHandler.java * (edit) hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardFsck.java * (edit) hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardTool.java * (edit) hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/ITestS3GuardFsck.java > S3Guard fsck: Add option to remove orphaned entries > --- > > Key: HADOOP-16858 > URL: https://issues.apache.org/jira/browse/HADOOP-16858 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.3.0 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Major > Fix For: 3.3.0, 3.3.1 > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Resolved] (HADOOP-16858) S3Guard fsck: Add option to remove orphaned entries
[ https://issues.apache.org/jira/browse/HADOOP-16858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Bota resolved HADOOP-16858. - Resolution: Fixed > S3Guard fsck: Add option to remove orphaned entries > --- > > Key: HADOOP-16858 > URL: https://issues.apache.org/jira/browse/HADOOP-16858 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.3.0 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Major > Fix For: 3.3.1 > >
[jira] [Commented] (HADOOP-16858) S3Guard fsck: Add option to remove orphaned entries
[ https://issues.apache.org/jira/browse/HADOOP-16858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17061655#comment-17061655 ] Gabor Bota commented on HADOOP-16858: - Got +1 from @stevel on PR #1851, merging. > S3Guard fsck: Add option to remove orphaned entries > --- > > Key: HADOOP-16858 > URL: https://issues.apache.org/jira/browse/HADOOP-16858 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.3.0 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Major >
[GitHub] [hadoop] bgaborg merged pull request #1851: HADOOP-16858. S3Guard fsck: Add option to remove orphaned entries
bgaborg merged pull request #1851: HADOOP-16858. S3Guard fsck: Add option to remove orphaned entries URL: https://github.com/apache/hadoop/pull/1851
[jira] [Updated] (HADOOP-16858) S3Guard fsck: Add option to remove orphaned entries
[ https://issues.apache.org/jira/browse/HADOOP-16858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Bota updated HADOOP-16858: Fix Version/s: 3.3.1 > S3Guard fsck: Add option to remove orphaned entries > --- > > Key: HADOOP-16858 > URL: https://issues.apache.org/jira/browse/HADOOP-16858 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.3.0 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Major > Fix For: 3.3.1 > >
[GitHub] [hadoop] steveloughran commented on a change in pull request #1881: HADOOP-16910 Adding file system counters in ABFS
steveloughran commented on a change in pull request #1881: HADOOP-16910 Adding file system counters in ABFS URL: https://github.com/apache/hadoop/pull/1881#discussion_r394273761 ## File path: hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemOauth.java ## @@ -143,7 +143,7 @@ public void testBlobDataReader() throws Exception { // TEST WRITE FILE try { - abfsStore.openFileForWrite(EXISTED_FILE_PATH, true); + abfsStore.openFileForWrite(EXISTED_FILE_PATH, fs.getFsStatistics(), true); Review comment: add a .close() at the end so if something went wrong and the file was opened, we close the stream
[GitHub] [hadoop] steveloughran commented on a change in pull request #1881: HADOOP-16910 Adding file system counters in ABFS
steveloughran commented on a change in pull request #1881: HADOOP-16910 Adding file system counters in ABFS URL: https://github.com/apache/hadoop/pull/1881#discussion_r394273198 ## File path: hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAbfsStreamStatistics.java ## @@ -0,0 +1,147 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.fs.azurebfs; + +import org.junit.Assert; +import org.junit.Test; + +import org.apache.hadoop.fs.FSDataOutputStream; +import org.apache.hadoop.fs.FSDataInputStream; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.Path; + +/** + * Test Abfs Stream. + */ + +public class ITestAbfsStreamStatistics extends AbstractAbfsIntegrationTest { + public ITestAbfsStreamStatistics() throws Exception { + } + + /*** + * Testing {@code incrementReadOps()} in class {@code AbfsInputStream} and + * {@code incrementWriteOps()} in class {@code AbfsOutputStream}. 
+ * + * @throws Exception + */ + @Test + public void testAbfsStreamOps() throws Exception { +describe("Test to see correct population of read and write operations in " ++ "Abfs"); + +final AzureBlobFileSystem fs = getFileSystem(); +Path smallOperationsFile = new Path("testOneReadWriteOps"); +Path largeOperationsFile = new Path("testLargeReadWriteOps"); +FileSystem.Statistics statistics = fs.getFsStatistics(); +String testReadWriteOps = "test this"; +statistics.reset(); + +//Test for zero write operation +assertReadWriteOps("write", 0, statistics.getWriteOps()); + +//Test for zero read operation +assertReadWriteOps("read", 0, statistics.getReadOps()); + +FSDataOutputStream outForOneOperation = null; +FSDataInputStream inForOneOperation = null; +try { + outForOneOperation = fs.create(smallOperationsFile); + statistics.reset(); + outForOneOperation.write(testReadWriteOps.getBytes()); + + //Test for a single write operation + assertReadWriteOps("write", 1, statistics.getWriteOps()); + + inForOneOperation = fs.open(smallOperationsFile); + inForOneOperation.read(testReadWriteOps.getBytes(), 0, + testReadWriteOps.getBytes().length); + + //Test for a single read operation + assertReadWriteOps("read", 1, statistics.getReadOps()); + +} finally { + if (inForOneOperation != null) { +inForOneOperation.close(); + } + if (outForOneOperation != null) { +outForOneOperation.close(); + } +} + +//Validating if content is being written in the smallOperationsFile +Assert.assertTrue("Mismatch in content validation", +validateContent(fs, smallOperationsFile, +testReadWriteOps.getBytes())); + +FSDataOutputStream outForLargeOperations = null; +FSDataInputStream inForLargeOperations = null; +StringBuilder largeOperationsValidationString = new StringBuilder(); +try { + outForLargeOperations = fs.create(largeOperationsFile); + statistics.reset(); + int largeValue = 100; + for (int i = 0; i < largeValue; i++) { +outForLargeOperations.write(testReadWriteOps.getBytes()); + +//Creating the String 
for content Validation +largeOperationsValidationString.append(testReadWriteOps); + } + + //Test for 100 write operations + assertReadWriteOps("write", largeValue, statistics.getWriteOps()); + + inForLargeOperations = fs.open(largeOperationsFile); + for (int i = 0; i < largeValue; i++) +inForLargeOperations +.read(testReadWriteOps.getBytes(), 0, +testReadWriteOps.getBytes().length); + + //Test for 100 read operations + assertReadWriteOps("read", largeValue, statistics.getReadOps()); + +} finally { + if (inForLargeOperations != null) { Review comment: ``` IOUtils.cleanupWithLogger(LOG, inForLargeOperations, outForLargeOperations) ``` that does close on all non-null arguments, catches failures so they don't get in the way of whatever was thrown earlier. We use this throughout the hadoop codebase to clean up robustly, so get used to it
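For readers unfamiliar with the helper named in the review comment above: `IOUtils.cleanupWithLogger` closes each non-null argument and logs close failures instead of rethrowing them. A rough, stdlib-only sketch of that behaviour (`QuietCloser` is a hypothetical stand-in, not the Hadoop class):

```java
import java.io.Closeable;
import java.io.IOException;

/**
 * Sketch of cleanup-style closing: close every non-null resource,
 * swallowing (here: printing) any exception so it cannot mask an
 * exception thrown earlier in the try block.
 */
public class QuietCloser {
  public static void closeAll(Closeable... resources) {
    for (Closeable c : resources) {
      if (c == null) {
        continue; // e.g. a stream that was never opened
      }
      try {
        c.close();
      } catch (IOException e) {
        // the real helper logs via SLF4J; never rethrow from cleanup
        System.err.println("Failed to close resource: " + e);
      }
    }
  }
}
```

This is why a single `closeAll(in, out)` call in a finally block can replace the pair of null-checked close calls in the quoted test.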
[GitHub] [hadoop] steveloughran commented on a change in pull request #1881: HADOOP-16910 Adding file system counters in ABFS
steveloughran commented on a change in pull request #1881: HADOOP-16910 Adding file system counters in ABFS URL: https://github.com/apache/hadoop/pull/1881#discussion_r394271627 ## File path: hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/AbstractAbfsTestWithTimeout.java ## @@ -67,4 +77,46 @@ public void nameThread() { protected int getTestTimeoutMillis() { return TEST_TIMEOUT; } + + /** + * Describe a test in the logs. + * + * @param text text to print + * @param args arguments to format in the printing + */ + protected void describe(String text, Object... args) { +LOG.info("\n\n{}: {}\n", +methodName.getMethodName(), +String.format(text, args)); + } + + /** + * Validate Contents written on a file in Abfs. + * + * @param fsAzureBlobFileSystem + * @param path Path of the file + * @param originalByteArray original byte array + * @return + * @throws IOException + */ + protected boolean validateContent(AzureBlobFileSystem fs, Path path, + byte[] originalByteArray) + throws IOException { +FSDataInputStream in = fs.open(path); Review comment: needs to be closed. you can use `try (FSDataInputStream ...) { }` to manage this automatically This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
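The try-with-resources pattern the reviewer suggests looks like this. The sketch substitutes a `ByteArrayInputStream` for `fs.open(path)` so it stays self-contained; the real helper would open the ABFS path instead:

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

/**
 * Sketch of validateContent with try-with-resources: the stream is
 * closed automatically on every exit path, even if reading throws.
 */
public class ValidateContentSketch {
  static boolean validateContent(InputStream source, byte[] originalByteArray)
      throws IOException {
    // 'in' is closed automatically when the block exits
    try (InputStream in = source) {
      byte[] actual = new byte[originalByteArray.length];
      int read = in.read(actual, 0, actual.length);
      return read == originalByteArray.length
          && Arrays.equals(actual, originalByteArray);
    }
  }
}
```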
[GitHub] [hadoop] hadoop-yetus removed a comment on issue #1881: HADOOP-16910 Adding file system counters in ABFS
hadoop-yetus removed a comment on issue #1881: HADOOP-16910 Adding file system counters in ABFS URL: https://github.com/apache/hadoop/pull/1881#issuecomment-595688964 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | +0 :ok: | reexec | 25m 39s | Docker mode activated. | ||| _ Prechecks _ | | +1 :green_heart: | dupname | 0m 1s | No case conflicting files found. | | +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. | ||| _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 19m 4s | trunk passed | | +1 :green_heart: | compile | 0m 31s | trunk passed | | +1 :green_heart: | checkstyle | 0m 25s | trunk passed | | +1 :green_heart: | mvnsite | 0m 34s | trunk passed | | +1 :green_heart: | shadedclient | 15m 4s | branch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 0m 26s | trunk passed | | +0 :ok: | spotbugs | 0m 51s | Used deprecated FindBugs config; considering switching to SpotBugs. | | +1 :green_heart: | findbugs | 0m 49s | trunk passed | ||| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 0m 27s | the patch passed | | +1 :green_heart: | compile | 0m 24s | the patch passed | | +1 :green_heart: | javac | 0m 24s | the patch passed | | -0 :warning: | checkstyle | 0m 17s | hadoop-tools/hadoop-azure: The patch generated 2 new + 1 unchanged - 0 fixed = 3 total (was 1) | | +1 :green_heart: | mvnsite | 0m 26s | the patch passed | | +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. | | +1 :green_heart: | shadedclient | 13m 42s | patch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 0m 23s | the patch passed | | +1 :green_heart: | findbugs | 0m 53s | the patch passed | ||| _ Other Tests _ | | +1 :green_heart: | unit | 1m 20s | hadoop-azure in the patch passed. 
| | +1 :green_heart: | asflicense | 0m 32s | The patch does not generate ASF License warnings. | | | | 82m 33s | | | Subsystem | Report/Notes | |--:|:-| | Docker | Client=19.03.7 Server=19.03.7 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1881/1/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/1881 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 6db0065e9119 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | personality/hadoop.sh | | git revision | trunk / 004e955 | | Default Java | 1.8.0_242 | | checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-1881/1/artifact/out/diff-checkstyle-hadoop-tools_hadoop-azure.txt | | Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1881/1/testReport/ | | Max. process+thread count | 422 (vs. ulimit of 5500) | | modules | C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure | | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1881/1/console | | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 | | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org | This message was automatically generated. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] hadoop-yetus commented on issue #1899: HADOOP-16914 Adding Output Stream Counters in ABFS
hadoop-yetus commented on issue #1899: HADOOP-16914 Adding Output Stream Counters in ABFS URL: https://github.com/apache/hadoop/pull/1899#issuecomment-600547500 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | +0 :ok: | reexec | 0m 28s | Docker mode activated. | ||| _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. | | +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. | ||| _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 21m 45s | trunk passed | | +1 :green_heart: | compile | 0m 27s | trunk passed | | +1 :green_heart: | checkstyle | 0m 20s | trunk passed | | +1 :green_heart: | mvnsite | 0m 30s | trunk passed | | +1 :green_heart: | shadedclient | 16m 13s | branch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 0m 23s | trunk passed | | +0 :ok: | spotbugs | 0m 49s | Used deprecated FindBugs config; considering switching to SpotBugs. | | +1 :green_heart: | findbugs | 0m 48s | trunk passed | | -0 :warning: | patch | 1m 5s | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. | ||| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 0m 25s | the patch passed | | +1 :green_heart: | compile | 0m 21s | the patch passed | | +1 :green_heart: | javac | 0m 21s | the patch passed | | -0 :warning: | checkstyle | 0m 14s | hadoop-tools/hadoop-azure: The patch generated 16 new + 1 unchanged - 0 fixed = 17 total (was 1) | | +1 :green_heart: | mvnsite | 0m 24s | the patch passed | | +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. | | +1 :green_heart: | shadedclient | 15m 16s | patch has no errors when building and testing our client artifacts. 
| | -1 :x: | javadoc | 0m 19s | hadoop-tools_hadoop-azure generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) | | -1 :x: | findbugs | 0m 53s | hadoop-tools/hadoop-azure generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) | ||| _ Other Tests _ | | +1 :green_heart: | unit | 1m 8s | hadoop-azure in the patch passed. | | -1 :x: | asflicense | 0m 27s | The patch generated 3 ASF License warnings. | | | | 61m 55s | | | Reason | Tests | |---:|:--| | FindBugs | module:hadoop-tools/hadoop-azure | | | Increment of volatile field org.apache.hadoop.fs.azurebfs.services.AbfsOutputStreamStatisticsImpl.queueShrink in org.apache.hadoop.fs.azurebfs.services.AbfsOutputStreamStatisticsImpl.queueShrinked() At AbfsOutputStreamStatisticsImpl.java:in org.apache.hadoop.fs.azurebfs.services.AbfsOutputStreamStatisticsImpl.queueShrinked() At AbfsOutputStreamStatisticsImpl.java:[line 70] | | | Increment of volatile field org.apache.hadoop.fs.azurebfs.services.AbfsOutputStreamStatisticsImpl.writeCurrentBufferOperations in org.apache.hadoop.fs.azurebfs.services.AbfsOutputStreamStatisticsImpl.writeCurrentBuffer() At AbfsOutputStreamStatisticsImpl.java:in org.apache.hadoop.fs.azurebfs.services.AbfsOutputStreamStatisticsImpl.writeCurrentBuffer() At AbfsOutputStreamStatisticsImpl.java:[line 78] | | Subsystem | Report/Notes | |--:|:-| | Docker | Client=19.03.8 Server=19.03.8 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1899/2/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/1899 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux ff2694ac7a1e 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | personality/hadoop.sh | | git revision | trunk / 8d63734 | | Default Java | 1.8.0_242 | | checkstyle | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1899/2/artifact/out/diff-checkstyle-hadoop-tools_hadoop-azure.txt | | javadoc | https://builds.apache.org/job/hadoop-multibranch/job/PR-1899/2/artifact/out/diff-javadoc-javadoc-hadoop-tools_hadoop-azure.txt | | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-1899/2/artifact/out/new-findbugs-hadoop-tools_hadoop-azure.html | | Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1899/2/testReport/ | | asflicense | https://builds.apache.org/job/hadoop-multibranch/job/PR-1899/2/artifact/out/patch-asflicense-problems.txt | | Max. process+thread count | 308 (vs. ulimit of 5500) | |
[GitHub] [hadoop] mehakmeet commented on a change in pull request #1899: HADOOP-16914 Adding Output Stream Counters in ABFS
mehakmeet commented on a change in pull request #1899: HADOOP-16914 Adding Output Stream Counters in ABFS URL: https://github.com/apache/hadoop/pull/1899#discussion_r394230246 ## File path: hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAbfsOutputStream.java ## @@ -0,0 +1,278 @@ +package org.apache.hadoop.fs.azurebfs; + +import java.io.IOException; + +import org.junit.Test; + +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.azurebfs.services.AbfsOutputStream; +import org.apache.hadoop.fs.permission.FsPermission; + +/** + * Test AbfsOutputStream statistics. + */ +public class ITestAbfsOutputStream extends AbstractAbfsIntegrationTest { + + public ITestAbfsOutputStream() throws Exception { + } + + /** + * Tests to check bytes Uploading in {@link AbfsOutputStream}. + * + * @throws IOException + */ + @Test + public void testAbfsOutputStreamUploadingBytes() throws IOException { +describe("Testing Bytes uploaded in AbfsOutputSteam"); +final AzureBlobFileSystem fs = getFileSystem(); +Path uploadBytesFilePath = new Path("AbfsOutputStreamStatsPath"); +AzureBlobFileSystemStore abfss = fs.getAbfsStore(); +FileSystem.Statistics statistics = fs.getFsStatistics(); +abfss.getAbfsConfiguration().setDisableOutputStreamFlush(false); +String testBytesToUpload = "bytes"; + +AbfsOutputStream outForSomeBytes = null; +try { + outForSomeBytes = (AbfsOutputStream) abfss.createFile(uploadBytesFilePath, + statistics, + true, + FsPermission.getDefault(), FsPermission.getUMask(fs.getConf())); + + //Test for zero bytes To upload + assertValues("bytes to upload", 0, + outForSomeBytes.getOutputStreamStatistics().bytesToUpload); + + outForSomeBytes.write(testBytesToUpload.getBytes()); + outForSomeBytes.flush(); + + //Test for some bytes to upload + assertValues("bytes to upload", testBytesToUpload.getBytes().length, + outForSomeBytes.getOutputStreamStatistics().bytesToUpload); + + //Test for relation between 
bytesUploadSuccessful, bytesUploadFailed + // and bytesToUpload + assertValues("bytesUploadSuccessful equal to difference between " + + "bytesToUpload and bytesUploadFailed", + outForSomeBytes.getOutputStreamStatistics().bytesUploadSuccessful, + outForSomeBytes.getOutputStreamStatistics().bytesToUpload - + outForSomeBytes.getOutputStreamStatistics().bytesUploadFailed); + +} finally { + if (outForSomeBytes != null) { +outForSomeBytes.close(); + } +} + +AbfsOutputStream outForLargeBytes = null; +try { + outForLargeBytes = + (AbfsOutputStream) abfss.createFile(uploadBytesFilePath, + statistics + , true, FsPermission.getDefault(), + FsPermission.getUMask(fs.getConf())); + + int largeValue = 10; + for (int i = 0; i < largeValue; i++) { +outForLargeBytes.write(testBytesToUpload.getBytes()); + } + outForLargeBytes.flush(); + + //Test for large bytes to upload + assertValues("bytes to upload", + largeValue * (testBytesToUpload.getBytes().length), + outForLargeBytes.getOutputStreamStatistics().bytesToUpload); + + //Test for relation between bytesUploadSuccessful, bytesUploadFailed + // and bytesToUpload + assertValues("bytesUploadSuccessful equal to difference between " + + "bytesToUpload and bytesUploadFailed", + outForSomeBytes.getOutputStreamStatistics().bytesUploadSuccessful, + outForSomeBytes.getOutputStreamStatistics().bytesToUpload - + outForSomeBytes.getOutputStreamStatistics().bytesUploadFailed); + +} finally { + if (outForLargeBytes != null) { +outForLargeBytes.close(); + } +} + + } + + /** + * Tests to check time spend on waiting for tasks to be complete on a + * blocking queue in {@link AbfsOutputStream}. + * + * @throws IOException + */ + @Test + public void testAbfsOutputStreamTimeSpendOnWaitTask() throws IOException { Review comment: Need help on how to write tests for this counter. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
[GitHub] [hadoop] mehakmeet commented on a change in pull request #1899: HADOOP-16914 Adding Output Stream Counters in ABFS
mehakmeet commented on a change in pull request #1899: HADOOP-16914 Adding Output Stream Counters in ABFS URL: https://github.com/apache/hadoop/pull/1899#discussion_r394227017 ## File path: hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAbfsOutputStream.java ## @@ -133,58 +133,60 @@ public void testAbfsOutputStreamTimeSpendOnWaitTask() throws IOException { public void testAbfsOutputStreamQueueShrink() throws IOException { describe("Testing Queue Shrink calls in AbfsOutputStream"); Review comment: Need help in writing test for this counter.
[GitHub] [hadoop] mehakmeet commented on a change in pull request #1899: HADOOP-16914 Adding Output Stream Counters in ABFS
mehakmeet commented on a change in pull request #1899: HADOOP-16914 Adding Output Stream Counters in ABFS URL: https://github.com/apache/hadoop/pull/1899#discussion_r394228752 ## File path: hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAbfsOutputStream.java ## @@ -133,58 +133,60 @@ public void testAbfsOutputStreamTimeSpendOnWaitTask() throws IOException { public void testAbfsOutputStreamQueueShrink() throws IOException { describe("Testing Queue Shrink calls in AbfsOutputStream"); final AzureBlobFileSystem fs = getFileSystem(); -Path TEST_PATH = new Path("AbfsOutputStreamStatsPath"); +Path queueShrinkFilePath = new Path("AbfsOutputStreamStatsPath"); AzureBlobFileSystemStore abfss = fs.getAbfsStore(); abfss.getAbfsConfiguration().setDisableOutputStreamFlush(false); FileSystem.Statistics statistics = fs.getFsStatistics(); String testQueueShrink = "testQueue"; - AbfsOutputStream outForOneOp = null; try { - outForOneOp = (AbfsOutputStream) abfss.createFile(TEST_PATH, statistics, -true, - FsPermission.getDefault(), FsPermission.getUMask(fs.getConf())); + outForOneOp = + (AbfsOutputStream) abfss.createFile(queueShrinkFilePath, statistics, + true, + FsPermission.getDefault(), FsPermission.getUMask(fs.getConf())); //Test for shrinking Queue zero time - Assert.assertEquals("Mismatch in number of queueShrink() Calls", 0, + assertValues("number of queueShrink() Calls", 0, outForOneOp.getOutputStreamStatistics().queueShrink); outForOneOp.write(testQueueShrink.getBytes()); // Queue is shrunk 2 times when outStream is flushed outForOneOp.flush(); //Test for shrinking Queue 2 times - Assert.assertEquals("Mismatch in number of queueShrink() Calls", 2, + assertValues("number of queueShrink() Calls", 2, outForOneOp.getOutputStreamStatistics().queueShrink); } finally { - if(outForOneOp != null){ + if (outForOneOp != null) { outForOneOp.close(); } } AbfsOutputStream outForLargeOps = null; try { - outForLargeOps = (AbfsOutputStream) 
abfss.createFile(TEST_PATH, + outForLargeOps = (AbfsOutputStream) abfss.createFile(queueShrinkFilePath, statistics, true, FsPermission.getDefault(), FsPermission.getUMask(fs.getConf())); + int largeValue = 1000; //QueueShrink is called 2 times in 1 flush(), hence 1000 flushes must // give 2000 QueueShrink calls - for (int i = 0; i < 1000; i++) { + for (int i = 0; i < largeValue; i++) { outForLargeOps.write(testQueueShrink.getBytes()); //Flush is quite expensive so 1000 calls only which takes 1 min+ outForLargeOps.flush(); Review comment: any way around flush() to get queueShrink() calls after writing ? flush() is quite expensive as it takes some time even at 1000 calls to test. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] mehakmeet commented on a change in pull request #1899: HADOOP-16914 Adding Output Stream Counters in ABFS
mehakmeet commented on a change in pull request #1899: HADOOP-16914 Adding Output Stream Counters in ABFS URL: https://github.com/apache/hadoop/pull/1899#discussion_r394227894 ## File path: hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAbfsOutputStream.java ## @@ -28,28 +26,38 @@ public ITestAbfsOutputStream() throws Exception { public void testAbfsOutputStreamUploadingBytes() throws IOException { Review comment: Need help simulating upload failures in this test to get some values for the bytesUploadFailed counter.
[GitHub] [hadoop] mukund-thakur commented on a change in pull request #1897: HADOOP-16319 skip invalid tests when default encryption enabled
mukund-thakur commented on a change in pull request #1897: HADOOP-16319 skip invalid tests when default encryption enabled URL: https://github.com/apache/hadoop/pull/1897#discussion_r394126196 ## File path: hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3AMiscOperations.java ## @@ -256,6 +277,7 @@ public void testS3AToStringUnitialized() throws Throwable { } /** +<<< ours Review comment: Looks like this came during merging. Can be removed later.
[GitHub] [hadoop] mukund-thakur commented on a change in pull request #1881: HADOOP-16910 Adding file system counters in ABFS
mukund-thakur commented on a change in pull request #1881: HADOOP-16910 Adding file system counters in ABFS URL: https://github.com/apache/hadoop/pull/1881#discussion_r394122012 ## File path: hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/AbstractAbfsTestWithTimeout.java ## @@ -6,9 +6,9 @@ * to you under the Apache License, Version 2.0 (the * "License"); you may not use this file except in compliance * with the License. You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * + * Review comment: remove