[jira] [Updated] (HADOOP-13837) Always get unable to kill error message even the hadoop process was successfully killed

2017-12-03 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated HADOOP-13837:
-
Attachment: HADOOP-13837.05.patch

> Always get unable to kill error message even the hadoop process was 
> successfully killed
> ---
>
> Key: HADOOP-13837
> URL: https://issues.apache.org/jira/browse/HADOOP-13837
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: scripts
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Critical
> Attachments: HADOOP-13837.01.patch, HADOOP-13837.02.patch, 
> HADOOP-13837.03.patch, HADOOP-13837.04.patch, HADOOP-13837.05.patch, 
> check_proc.sh
>
>
> *Reproduce steps*
> # Setup a hadoop cluster
> # Stop resource manager : yarn --daemon stop resourcemanager
> # Stop node manager : yarn --daemon stop nodemanager
> WARNING: nodemanager did not stop gracefully after 5 seconds: Trying to kill 
> with kill -9
> ERROR: Unable to kill 20325
> It always prints an "Unable to kill" error message, which gives the user the 
> impression that something is wrong with the node manager process because it 
> could not be forcibly killed. But in fact, the kill command works as expected.
> This happens because hadoop-functions.sh does not properly check process 
> existence after the kill. Currently it checks process liveness immediately 
> after the kill command:
> {code}
> ...
> kill -9 "${pid}" >/dev/null 2>&1
> if ps -p "${pid}" > /dev/null 2>&1; then
>   hadoop_error "ERROR: Unable to kill ${pid}"
> ...
> {code}
> When the resource manager is stopped before the node managers, it always takes 
> some additional time for the process to terminate completely. I printed the 
> output of {{ps -p}} in a while loop after kill -9 and observed the 
> following
> {noformat}
>   PID TTY          TIME CMD
> 16212 ?        00:00:11 java
> 0
>   PID TTY          TIME CMD
> 16212 ?        00:00:11 java
> 0
>   PID TTY          TIME CMD
> 16212 ?        00:00:11 java
> 0
>   PID TTY          TIME CMD
> 1
>   PID TTY          TIME CMD
> 1
>   PID TTY          TIME CMD
> 1
> ...
> {noformat}
> In the first three iterations of the loop the process had not yet terminated, 
> so the exit code of {{ps -p}} was still {{0}}.
> *Proposal of a fix*
> At first I was thinking of adding a more comprehensive pid check that polls 
> pid liveness until HADOOP_STOP_TIMEOUT is reached, but this seemed to add too 
> much complexity. The second fix was to simply add a {{sleep 3}} after {{kill 
> -9}}; it should fix the error in most cases with relatively small changes to 
> the script.
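The first option mentioned in the proposal (polling pid liveness up to a bound instead of checking once) could be sketched roughly as follows. This is an illustrative sketch, not the actual hadoop-functions.sh patch: {{kill_and_wait}} and {{wait_limit}} are made-up names, and a real change would presumably bound the loop with HADOOP_STOP_TIMEOUT.

```shell
# Sketch: retry the liveness check after kill -9 instead of checking once.
# kill_and_wait and wait_limit are illustrative names, not Hadoop's.
kill_and_wait() {
  pid="$1"
  wait_limit="${2:-5}"   # seconds to poll before declaring failure
  kill -9 "${pid}" >/dev/null 2>&1
  i=0
  while ps -p "${pid}" >/dev/null 2>&1; do
    if [ "${i}" -ge "${wait_limit}" ]; then
      echo "ERROR: Unable to kill ${pid}" >&2
      return 1
    fi
    sleep 1
    i=$((i + 1))
  done
  return 0
}
```

This keeps the error path but tolerates the short window in which the kernel is still tearing the process down.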



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13837) Always get unable to kill error message even the hadoop process was successfully killed

2017-12-03 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16276404#comment-16276404
 ] 

Weiwei Yang commented on HADOOP-13837:
--

I am still seeing this issue on the latest trunk. I am raising the priority to 
critical, as this happens almost every time and creates a bad user experience.

> Always get unable to kill error message even the hadoop process was 
> successfully killed
> ---
>
> Key: HADOOP-13837
> URL: https://issues.apache.org/jira/browse/HADOOP-13837
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: scripts
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
> Attachments: HADOOP-13837.01.patch, HADOOP-13837.02.patch, 
> HADOOP-13837.03.patch, HADOOP-13837.04.patch, check_proc.sh
>
>






[jira] [Updated] (HADOOP-13837) Always get unable to kill error message even the hadoop process was successfully killed

2017-12-03 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated HADOOP-13837:
-
Priority: Critical  (was: Major)

> Always get unable to kill error message even the hadoop process was 
> successfully killed
> ---
>
> Key: HADOOP-13837
> URL: https://issues.apache.org/jira/browse/HADOOP-13837
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: scripts
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Critical
> Attachments: HADOOP-13837.01.patch, HADOOP-13837.02.patch, 
> HADOOP-13837.03.patch, HADOOP-13837.04.patch, check_proc.sh
>
>






[jira] [Updated] (HADOOP-15088) BufferedInputStream.skip function can return 0 when the file is corrupted, causing an infinite loop

2017-12-03 Thread John Doe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Doe updated HADOOP-15088:
--
Description: 
When a file is corrupted, the skip function can return 0, causing an infinite 
loop.
Here is the code:

{code:java}
  private boolean slowReadUntilMatch(Pattern markPattern, boolean includePat,
 DataOutputBuffer outBufOrNull) throws 
IOException {
  ...
  for (long skiplen = endPos; skiplen > 0; ) {
skiplen -= bin_.skip(skiplen); 
  }
  ...
  }
{code}
Similar bugs are 
[Hadoop-8614|https://issues.apache.org/jira/browse/HADOOP-8614], 
[Yarn-2905|https://issues.apache.org/jira/browse/YARN-2905], 
[Yarn-163|https://issues.apache.org/jira/browse/YARN-163], 
[MAPREDUCE-6990|https://issues.apache.org/jira/browse/MAPREDUCE-6990]

  was:
When a file is corrupted, the skip function can return 0, causing an infinite 
loop.
Here is the code:

{code:java}
  private boolean slowReadUntilMatch(Pattern markPattern, boolean includePat,
 DataOutputBuffer outBufOrNull) throws 
IOException {
  ...
  for (long skiplen = endPos; skiplen > 0; ) {
skiplen -= bin_.skip(skiplen); 
  }
  ...
  }
{code}



> BufferedInputStream.skip function can return 0 when the file is corrupted, 
> causing an infinite loop
> ---
>
> Key: HADOOP-15088
> URL: https://issues.apache.org/jira/browse/HADOOP-15088
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: streaming
>Affects Versions: 2.5.0
>Reporter: John Doe
>






[jira] [Created] (HADOOP-15088) BufferedInputStream.skip function can return 0 when the file is corrupted, causing an infinite loop

2017-12-03 Thread John Doe (JIRA)
John Doe created HADOOP-15088:
-

 Summary: BufferedInputStream.skip function can return 0 when the 
file is corrupted, causing an infinite loop
 Key: HADOOP-15088
 URL: https://issues.apache.org/jira/browse/HADOOP-15088
 Project: Hadoop Common
  Issue Type: Bug
  Components: streaming
Affects Versions: 2.5.0
Reporter: John Doe


When a file is corrupted, the skip function can return 0, causing an infinite 
loop.
Here is the code:

{code:java}
  private boolean slowReadUntilMatch(Pattern markPattern, boolean includePat,
 DataOutputBuffer outBufOrNull) throws 
IOException {
  ...
  for (long skiplen = endPos; skiplen > 0; ) {
skiplen -= bin_.skip(skiplen); 
  }
  ...
  }
{code}
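Because {{skip()}} may return 0 without having reached end of stream, the loop above needs a progress check. Below is a defensive sketch; the class and method names are illustrative, not Hadoop's actual code. It probes with a single-byte read when {{skip()}} makes no progress, and fails fast at end of stream instead of spinning.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

public class SafeSkip {
    // Skip exactly n bytes, failing fast instead of looping forever when the
    // stream stops making progress (skip() may return 0 before EOF).
    public static void skipFully(InputStream in, long n) throws IOException {
        long remaining = n;
        while (remaining > 0) {
            long skipped = in.skip(remaining);
            if (skipped > 0) {
                remaining -= skipped;
                continue;
            }
            // skip() returned 0: check for end-of-stream with a 1-byte read.
            if (in.read() == -1) {
                throw new EOFException(
                    "EOF with " + remaining + " of " + n + " bytes unskipped");
            }
            remaining--; // the probe read consumed one byte
        }
    }
}
```

On a truncated or corrupted file this turns the silent infinite loop into an immediate EOFException.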







[jira] [Commented] (HADOOP-15039) Move SemaphoredDelegatingExecutor to hadoop-common

2017-12-03 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16276251#comment-16276251
 ] 

Kai Zheng commented on HADOOP-15039:


Thanks [~uncleGen] for the work. LGTM, with one minor comment: would you clean 
up SemaphoredDelegatingExecutor a bit? The comments in the class header should 
be generic now that it is moved to the common module as a utility. 

+1 pending the minor fix.

> Move SemaphoredDelegatingExecutor to hadoop-common
> --
>
> Key: HADOOP-15039
> URL: https://issues.apache.org/jira/browse/HADOOP-15039
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs, fs/oss, fs/s3
>Affects Versions: 3.0.0-beta1
>Reporter: Genmao Yu
>Assignee: Genmao Yu
>Priority: Minor
> Attachments: HADOOP-15039.001.patch, HADOOP-15039.002.patch, 
> HADOOP-15039.003.patch
>
>
> Detailed discussions in HADOOP-14999 and HADOOP-15027.
> share {{SemaphoredDelegatingExecutor}} and move it to {{hadoop-common}}.
> cc [~ste...@apache.org] 






[jira] [Created] (HADOOP-15087) Write directly without creating temp directory to avoid rename

2017-12-03 Thread Yonger (JIRA)
Yonger created HADOOP-15087:
---

 Summary: Write directly without creating temp directory to avoid 
rename 
 Key: HADOOP-15087
 URL: https://issues.apache.org/jira/browse/HADOOP-15087
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs/s3
Reporter: Yonger


Renames in workloads like Teragen/Terasort, which use Hadoop's default output 
committers, really hurt performance. 
Stocator announces that it does not create the temporary directories at all, 
yet still preserves Hadoop's fault tolerance. I added a switch on file creation 
by integrating its code into s3a, and got a 5x performance gain in Teragen and 
a 15% performance improvement in Terasort.

 






[jira] [Updated] (HADOOP-15039) Move SemaphoredDelegatingExecutor to hadoop-common

2017-12-03 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HADOOP-15039:
---
Summary: Move SemaphoredDelegatingExecutor to hadoop-common  (was: move 
SemaphoredDelegatingExecutor to hadoop-common)

> Move SemaphoredDelegatingExecutor to hadoop-common
> --
>
> Key: HADOOP-15039
> URL: https://issues.apache.org/jira/browse/HADOOP-15039
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs, fs/oss, fs/s3
>Affects Versions: 3.0.0-beta1
>Reporter: Genmao Yu
>Assignee: Genmao Yu
>Priority: Minor
> Attachments: HADOOP-15039.001.patch, HADOOP-15039.002.patch, 
> HADOOP-15039.003.patch
>
>
> Detailed discussions in HADOOP-14999 and HADOOP-15027.
> share {{SemaphoredDelegatingExecutor}} and move it to {{hadoop-common}}.
> cc [~ste...@apache.org] 






[jira] [Commented] (HADOOP-15072) Upgrade Apache Kerby version to 1.1.0

2017-12-03 Thread Jiajia Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16276239#comment-16276239
 ] 

Jiajia Li commented on HADOOP-15072:


bq. What's "includes a GSSAPI module"?
Apache Kerby 1.1.0 includes a GSSAPI module. Kerby provides its own GSS 
implementation because the JDK hides the GSSAPI layer; this feature allows more 
flexibility, for example supporting sending a JWT AccessToken via the GSS API.

bq. Does this replace the one in the JDK?
Not for now; it just provides a way to support new authentication mechanisms 
that cannot be implemented with the built-in Java GSS implementation.

> Upgrade Apache Kerby version to 1.1.0
> -
>
> Key: HADOOP-15072
> URL: https://issues.apache.org/jira/browse/HADOOP-15072
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Attachments: HADOOP-15072-001.patch
>
>
> Apache Kerby 1.1.0 implements cross-realm support, and also includes a GSSAPI 
> module.






[jira] [Comment Edited] (HADOOP-15038) Abstract MetadataStore in S3Guard into a common module.

2017-12-03 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16276234#comment-16276234
 ] 

Kai Zheng edited comment on HADOOP-15038 at 12/4/17 2:56 AM:
-

Thanks for the discussion, folks.

The idea of extracting MetadataStore into a Hadoop common module sounds good to 
me, but when doing it we should be careful not to introduce non-trivial 
dependencies.

[~ste...@apache.org], it looks to me like the cloudup package would be a good 
fit for the new module. Could you introduce the new module while doing the work 
for HADOOP-14766? If so, how should the module be named (under 
hadoop-common-project)? IMO hadoop-cloud-common might be better, though 
hadoop-cloud-core also works for me.

[~fabbri], do you have any concerns if the main MetadataStore work is extracted 
and put into hadoop-common-project/hadoop-cloud-common? 

Not sure if [~chris.douglas] has more thoughts on this.

Thanks!


was (Author: drankye):
Thanks for the discussion, folks.

The idea to extract MetadataStore into a Hadoop common module sounds good to 
me, but when do it, we should be careful not to introduce non-trivial 
dependencies.

[~ste...@apache.org] it looks to me the cloudup package would be a good fit for 
the new module. Could you introduce the new module when do the work of 
HADOOP-14766? If nice, how to name the module (under hadoop-common-project)? 
IMO hadoop-cloud-common might be better, sure hadoop-cloud-core is also good to 
me.

[~ajfabbri] do you have any concerns if the main work of MetadataStore is 
extracted and put into hadoop-common-project/hadoop-cloud-common? 

Not sure if [~chris.douglas] would have more thoughts on this.

Thanks!

> Abstract MetadataStore in S3Guard into a common module.
> ---
>
> Key: HADOOP-15038
> URL: https://issues.apache.org/jira/browse/HADOOP-15038
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs
>Affects Versions: 3.0.0-beta1
>Reporter: Genmao Yu
>
> Opening this JIRA to discuss whether we should move {{MetadataStore}} in 
> {{S3Guard}} into a common module. 
> Based on this work, other filesystems or object stores can implement their own 
> metastore for optimization (for known issues like the consistency problem and 
> metadata operation performance). [~ste...@apache.org] and others have done a 
> lot of great groundwork in {{S3Guard}}, which is very helpful for getting 
> started. I did some perf tests in HADOOP-14098 and started related work for 
> Aliyun OSS. Indeed there is still work to do for {{S3Guard}}, like the 
> metadata cache becoming inconsistent with S3; this will also be a problem for 
> other object stores. However, we can do these tasks in parallel.
> [~ste...@apache.org] [~fabbri] [~drankye] Any suggestion is appreciated.






[jira] [Commented] (HADOOP-15038) Abstract MetadataStore in S3Guard into a common module.

2017-12-03 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16276234#comment-16276234
 ] 

Kai Zheng commented on HADOOP-15038:


Thanks for the discussion, folks.

The idea of extracting MetadataStore into a Hadoop common module sounds good to 
me, but when doing it we should be careful not to introduce non-trivial 
dependencies.

[~ste...@apache.org], it looks to me like the cloudup package would be a good 
fit for the new module. Could you introduce the new module while doing the work 
for HADOOP-14766? If so, how should the module be named (under 
hadoop-common-project)? IMO hadoop-cloud-common might be better, though 
hadoop-cloud-core also works for me.

[~ajfabbri], do you have any concerns if the main MetadataStore work is 
extracted and put into hadoop-common-project/hadoop-cloud-common? 

Not sure if [~chris.douglas] has more thoughts on this.

Thanks!

> Abstract MetadataStore in S3Guard into a common module.
> ---
>
> Key: HADOOP-15038
> URL: https://issues.apache.org/jira/browse/HADOOP-15038
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs
>Affects Versions: 3.0.0-beta1
>Reporter: Genmao Yu
>






[jira] [Commented] (HADOOP-13192) org.apache.hadoop.util.LineReader cannot handle multibyte delimiters correctly

2017-12-03 Thread David Jou (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16276229#comment-16276229
 ] 

David Jou commented on HADOOP-13192:


I want to report a test case showing that a multibyte delimiter split across 
buffers is still handled incorrectly. If there is more than one ambiguous 
character, the match processing runs only once and emits all of the ambiguous 
characters as data when the match fails.

Delimiter = "***|";
String CurrentBufferTailToken = "***|data***";
String NextBufferHeadToken = "*|";

> org.apache.hadoop.util.LineReader cannot handle multibyte delimiters correctly
> --
>
> Key: HADOOP-13192
> URL: https://issues.apache.org/jira/browse/HADOOP-13192
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: util
>Affects Versions: 2.6.2
>Reporter: binde
>Assignee: binde
>Priority: Critical
> Fix For: 2.8.0, 2.7.3, 2.6.5, 3.0.0-alpha1
>
> Attachments: 
> 0001-HADOOP-13192-org.apache.hadoop.util.LineReader-match.patch, 
> 0002-fix-bug-hadoop-1392-add-test-case-for-LineReader.patch, 
> HADOOP-13192.final.patch
>
>   Original Estimate: 5m
>  Remaining Estimate: 5m
>
> org.apache.hadoop.util.LineReader.readCustomLine() has a bug:
> when the line is bccc and the recordDelimiter is aaab, the result should be 
> a,ccc. The code at line 310 reads:
>   for (; bufferPosn < bufferLength; ++bufferPosn) {
> if (buffer[bufferPosn] == recordDelimiterBytes[delPosn]) {
>   delPosn++;
>   if (delPosn >= recordDelimiterBytes.length) {
> bufferPosn++;
> break;
>   }
> } else if (delPosn != 0) {
>   bufferPosn--;
>   delPosn = 0;
> }
>   }
> should be:
>   for (; bufferPosn < bufferLength; ++bufferPosn) {
> if (buffer[bufferPosn] == recordDelimiterBytes[delPosn]) {
>   delPosn++;
>   if (delPosn >= recordDelimiterBytes.length) {
> bufferPosn++;
> break;
>   }
> } else if (delPosn != 0) {
>  // - change here - start 
>   bufferPosn -= delPosn;
>  // - change here - end 
>   
>   delPosn = 0;
> }
>   }
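The corrected backtracking above can be exercised by extracting the matching loop into a standalone helper. This is an illustrative extraction, not LineReader's real signature; it returns the index where the first delimiter match starts. The example line {{aaaabccc}} used below is an assumption consistent with the quoted expected result {{a,ccc}}.

```java
public class DelimiterScan {
    // Corrected matching loop from readCustomLine, extracted for illustration.
    // Returns the index where the delimiter starts in buffer, or -1 if absent.
    public static int findDelimiter(byte[] buffer, byte[] delim) {
        int delPosn = 0;
        for (int bufferPosn = 0; bufferPosn < buffer.length; ++bufferPosn) {
            if (buffer[bufferPosn] == delim[delPosn]) {
                delPosn++;
                if (delPosn >= delim.length) {
                    return bufferPosn - delim.length + 1; // start of the match
                }
            } else if (delPosn != 0) {
                // Corrected backtrack: rewind to just after where the failed
                // partial match began (the bug only did bufferPosn--).
                bufferPosn -= delPosn;
                delPosn = 0;
            }
        }
        return -1;
    }
}
```

With delimiter {{aaab}}, a match at index 1 splits an {{aaaabccc}} line (assumed input) into the record {{a}} and the remainder {{ccc}}; the buggy single-step backtrack misses the match entirely.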






[jira] [Commented] (HADOOP-14294) Rename ADLS mountpoint properties

2017-12-03 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16276069#comment-16276069
 ] 

John Zhuge commented on HADOOP-14294:
-

Excellent! I have added you as a Hadoop Common contributor and assigned this 
JIRA to you.

> Rename ADLS mountpoint properties
> -
>
> Key: HADOOP-14294
> URL: https://issues.apache.org/jira/browse/HADOOP-14294
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/adl
>Affects Versions: 2.8.0
>Reporter: John Zhuge
>Assignee: Rohan Garg
>
> Follow up to HADOOP-14038. Rename the prefix of 
> {{dfs.adls..mountpoint}} and {{dfs.adls..hostname}} to 
> {{fs.adl.}}.
> Borrow code from 
> https://issues.apache.org/jira/secure/attachment/12857500/HADOOP-14038.006.patch
>  and add a few unit tests.






[jira] [Assigned] (HADOOP-14294) Rename ADLS mountpoint properties

2017-12-03 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge reassigned HADOOP-14294:
---

Assignee: Rohan Garg  (was: John Zhuge)

> Rename ADLS mountpoint properties
> -
>
> Key: HADOOP-14294
> URL: https://issues.apache.org/jira/browse/HADOOP-14294
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/adl
>Affects Versions: 2.8.0
>Reporter: John Zhuge
>Assignee: Rohan Garg
>
> Follow up to HADOOP-14038. Rename the prefix of 
> {{dfs.adls..mountpoint}} and {{dfs.adls..hostname}} to 
> {{fs.adl.}}.
> Borrow code from 
> https://issues.apache.org/jira/secure/attachment/12857500/HADOOP-14038.006.patch
>  and add a few unit tests.






[jira] [Commented] (HADOOP-14294) Rename ADLS mountpoint properties

2017-12-03 Thread Rohan Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16276030#comment-16276030
 ] 

Rohan Garg commented on HADOOP-14294:
-

[~jzhuge] : Hi John, can I take this up? I have created a patch draft for it : 
https://gist.github.com/rohangarg/7aa44c4292704eddfe3f03d6c38867e1

> Rename ADLS mountpoint properties
> -
>
> Key: HADOOP-14294
> URL: https://issues.apache.org/jira/browse/HADOOP-14294
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/adl
>Affects Versions: 2.8.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>
> Follow up to HADOOP-14038. Rename the prefix of 
> {{dfs.adls..mountpoint}} and {{dfs.adls..hostname}} to 
> {{fs.adl.}}.
> Borrow code from 
> https://issues.apache.org/jira/secure/attachment/12857500/HADOOP-14038.006.patch
>  and add a few unit tests.


