[jira] [Work logged] (HADOOP-15880) WASB doesn't honor fs.trash.interval and this fails to auto purge trash folder

2021-03-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15880?focusedWorklogId=562148&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-562148
 ]

ASF GitHub Bot logged work on HADOOP-15880:
---

Author: ASF GitHub Bot
Created on: 08/Mar/21 06:35
Start Date: 08/Mar/21 06:35
Worklog Time Spent: 10m 
  Work Description: lamber-ken closed pull request #2750:
URL: https://github.com/apache/hadoop/pull/2750


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 562148)
Time Spent: 20m  (was: 10m)

> WASB doesn't honor fs.trash.interval and this fails to auto purge trash folder
> --
>
> Key: HADOOP-15880
> URL: https://issues.apache.org/jira/browse/HADOOP-15880
> Project: Hadoop Common
> Issue Type: Bug
> Components: documentation, fs/azure
> Affects Versions: 2.7.3
> Environment: Any HDInsight cluster pointing to WASB.
> Reporter: Sunil Kumar Chakrapani
> Priority: Minor
> Labels: WASB, pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h
>
> When "fs.trash.interval" is set, trash on the local HDFS is cleared, whereas
> the trash folder on WASB is not deleted and files pile up in the WASB store.
> WASB does not pick up the fs.trash.interval value, so the trash folder on the
> WASB store is never purged automatically.
>
> *Issue: WASB doesn't honor fs.trash.interval and this fails to auto purge
> trash folder*
> *Steps to reproduce:*
> *Delete any file stored on HDFS*
> hdfs dfs -D "fs.default.name=hdfs://mycluster/" -rm /hivestore.txt
> 18/10/23 06:18:05 INFO fs.TrashPolicyDefault: Moved: 'hdfs://mycluster/hivestore.txt' to trash at: hdfs://mycluster/user/sshuser/.Trash/Current/hivestore.txt
> *When deleted, the file is moved to the trash folder. The same happens for a file on WASB:*
> hdfs dfs -rm wasb:///hivestore.txt
> 18/10/23 06:19:13 INFO fs.TrashPolicyDefault: Moved: 'wasb://kcspark-2018-10-18t17-07-40-5...@kcdnsproxy.blob.core.windows.net/hivestore.txt' to trash at: wasb://kcspark-2018-10-18t17-07-40-5...@kcdnsproxy.blob.core.windows.net/user/sshuser/.Trash/Current/hivestore.txt
> *Reduce fs.trash.interval from 360 to 1 and restart all related services.*
> *Trash on the local HDFS is cleared, honoring the "fs.trash.interval" value.*
> hdfs dfs -D "fs.default.name=hdfs://mycluster/" -ls hdfs://mycluster/user/sshuser/.Trash/Current/
> ls: File hdfs://mycluster/user/sshuser/.Trash/Current does not exist.
> *Whereas the trash on WASB is not cleared.*
> hdfs dfs -ls wasb://kcspark-2018-10-18t17-07-40-5...@kcdnsproxy.blob.core.windows.net/user/sshuser/.Trash/Current/
> Found 1 items
> -rw-r--r-- 1 sshuser supergroup 1084 2018-10-23 06:19 wasb://kcspark-2018-10-18t17-07-40-5...@kcdnsproxy.blob.core.windows.net/user/sshuser/.Trash/Current/hivestore.txt
>  
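
For reference, the expiry rule that fs.trash.interval is meant to drive can be
sketched as follows. This is a simplified Python model, not the Hadoop or WASB
code path. It assumes trash checkpoints are directories named with a
yyMMddHHmmss timestamp (the format TrashPolicyDefault uses for rolled-over
checkpoints); the function name expired_checkpoints and the sample checkpoint
name are hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: checkpoint directories carry a yyMMddHHmmss timestamp as their name.
CHECKPOINT_FORMAT = "%y%m%d%H%M%S"


def expired_checkpoints(names, now, interval_minutes):
    """Return the checkpoint names that are older than the trash interval.

    An interval of 0 (or less) means trash is disabled, so nothing is selected.
    Names that do not parse as timestamps, such as "Current", are skipped.
    """
    if interval_minutes <= 0:
        return []
    cutoff = now - timedelta(minutes=interval_minutes)
    expired = []
    for name in names:
        try:
            created = datetime.strptime(name, CHECKPOINT_FORMAT)
        except ValueError:
            continue  # not a checkpoint directory
        if created < cutoff:
            expired.append(name)
    return expired
```

Under this model, a checkpoint taken at 06:19:13 is selected for deletion at
06:30 when the interval is 1 minute, and is kept when the interval is still
360 minutes. The report above describes exactly this purge not happening on
the WASB store.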



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-15880) WASB doesn't honor fs.trash.interval and this fails to auto purge trash folder

2021-03-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15880?focusedWorklogId=561824&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-561824
 ]

ASF GitHub Bot logged work on HADOOP-15880:
---

Author: ASF GitHub Bot
Created on: 06/Mar/21 17:21
Start Date: 06/Mar/21 17:21
Worklog Time Spent: 10m 
  Work Description: lamber-ken opened a new pull request #2750:
URL: https://github.com/apache/hadoop/pull/2750


   
   ## ISSUE
   https://issues.apache.org/jira/browse/HDFS-15880
   
   ## OUTLINE
   
   The standby NameNode (SNN) always sends an end-log-segment RPC on a fixed cycle (dfs.ha.log-roll.period=120s).
   - If no edit request occurs during this period, the active NameNode (ANN) still sends the RPC request to the journal node, which is redundant.
   - `FSEditLog#endCurrentLogSegment` is synchronized, so it can affect other RPC requests.
   
   
   The edit logs contain only two operations:
   ```
   LogSegmentOp [opCode=OP_START_LOG_SEGMENT, txid=-12345]
   LogSegmentOp [opCode=OP_END_LOG_SEGMENT, txid=-12345]
   ```
   
   ```
   [dcadmin@dw-work-006 ~]$ ll /work/data/hadoop/hdfs/journalnode/nn-work/current/
   total 1100
   -rw-r- 1 dcadmin dcadmin   8 Mar  7 01:02 committed-txid
   -rw-r- 1 dcadmin dcadmin  42 Mar  7 00:35 edits_001-002
   -rw-r- 1 dcadmin dcadmin  42 Mar  7 00:37 edits_003-004
   -rw-r- 1 dcadmin dcadmin  42 Mar  7 00:39 edits_005-006
   -rw-r- 1 dcadmin dcadmin  42 Mar  7 00:41 edits_007-008
   -rw-r- 1 dcadmin dcadmin  42 Mar  7 00:43 edits_009-010
   -rw-r- 1 dcadmin dcadmin  42 Mar  7 00:45 edits_011-012
   -rw-r- 1 dcadmin dcadmin  42 Mar  7 00:48 edits_013-014
   -rw-r- 1 dcadmin dcadmin  42 Mar  7 00:50 edits_015-016
   -rw-r- 1 dcadmin dcadmin  42 Mar  7 00:52 edits_017-018
   -rw-r- 1 dcadmin dcadmin  42 Mar  7 00:54 edits_019-020
   -rw-r- 1 dcadmin dcadmin  42 Mar  7 00:56 edits_021-022
   -rw-r- 1 dcadmin dcadmin  42 Mar  7 00:58 edits_023-024
   -rw-r- 1 dcadmin dcadmin  42 Mar  7 01:00 edits_025-026
   -rw-r- 1 dcadmin dcadmin  42 Mar  7 01:02 edits_027-028
   ```
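
The listing shows one finalized segment per roll, each spanning two
transaction IDs, which is what an idle segment containing only the start and
end operations would produce. The outline does not say how the PR changes
this. As an illustration only, the sketch below is a toy Python model (not
Hadoop's `FSEditLog`) in which a hypothetical `skip_if_idle` flag suppresses
the roll when no edits were logged since the segment was opened. The class
name, method names, and flag are all assumptions made for this example.

```python
class EditLogModel:
    """Toy model of edit log segment rolling; not Hadoop's FSEditLog."""

    def __init__(self):
        self.next_txid = 1
        self.finalized = []  # list of (first_txid, last_txid) per finalized segment
        self._open_segment()

    def _open_segment(self):
        self.segment_start = self.next_txid
        self.next_txid += 1  # the start-segment op consumes one txid
        self.user_ops = 0

    def log_edit(self):
        """Record one ordinary edit in the current segment."""
        self.next_txid += 1
        self.user_ops += 1

    def roll(self, skip_if_idle):
        """Finalize the current segment and open a new one.

        Returns True if a roll happened, False if it was skipped because
        the segment holds no edits (hypothetical behaviour).
        """
        if skip_if_idle and self.user_ops == 0:
            return False
        last = self.next_txid
        self.next_txid += 1  # the end-segment op consumes one txid
        self.finalized.append((self.segment_start, last))
        self._open_segment()
        return True
```

Rolling an idle log three times without the flag yields segments (1, 2),
(3, 4), (5, 6), mirroring the two-transaction files in the listing. With the
flag set, the idle rolls are skipped and a segment is finalized only after an
edit has been logged.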
   
   
   
   





Issue Time Tracking
---

Worklog Id: (was: 561824)
Remaining Estimate: 0h
Time Spent: 10m

> WASB doesn't honor fs.trash.interval and this fails to auto purge trash folder
> --
>
> Key: HADOOP-15880
> URL: https://issues.apache.org/jira/browse/HADOOP-15880
> Project: Hadoop Common
> Issue Type: Bug
> Components: documentation, fs/azure
> Affects Versions: 2.7.3
> Environment: Any HDInsight cluster pointing to WASB.
> Reporter: Sunil Kumar Chakrapani
> Priority: Minor
> Labels: WASB
> Time Spent: 10m
> Remaining Estimate: 0h