[ 
https://issues.apache.org/jira/browse/GOBBLIN-1669?focusedWorklogId=793468&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-793468
 ]

ASF GitHub Bot logged work on GOBBLIN-1669:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 20/Jul/22 23:13
            Start Date: 20/Jul/22 23:13
    Worklog Time Spent: 10m 
      Work Description: Will-Lo commented on code in PR #3528:
URL: https://github.com/apache/gobblin/pull/3528#discussion_r926123411


##########
gobblin-data-management/src/test/java/org/apache/gobblin/data/management/copy/TimeAwareRecursiveCopyableDatasetTest.java:
##########
@@ -216,6 +223,40 @@ public void testGetFilesAtPath() throws IOException {
       Assert.assertTrue(candidateFiles.contains(PathUtils.getPathWithoutSchemeAndAuthority(fileStatus.getPath()).toString()));
     }
 
+    // test ds of daily/yyyy-MM-dd-HH-mm-ss
+    datePattern = "yyyy-MM-dd-HH-mm-ss";
+    formatter = DateTimeFormat.forPattern(datePattern);
+    endDate = LocalDateTime.now(DateTimeZone.forID(TimeAwareRecursiveCopyableDataset.DEFAULT_DATE_PATTERN_TIMEZONE));
+
+    candidateFiles = new HashSet<>();
+    for (int i = 0; i < MAX_NUM_DAILY_DIRS; i++) {
+      String startDate = endDate.minusDays(i).withMinuteOfHour(random.nextInt(60)).withSecondOfMinute(random.nextInt(60)).toString(formatter);

Review Comment:
   This shouldn't affect the test: it uses a 2-day lookback, so the randomized 
minute and second don't change the outcome. We just want to make sure that a 
directory partitioned down to the second is actually picked up. Previously it 
wasn't, because the iterator assumed a path exists for every second but only 
looked for paths in increments of minutes.





Issue Time Tracking
-------------------

    Worklog Id:     (was: 793468)
    Time Spent: 1h 40m  (was: 1.5h)

> Support seconds with TimeAwareRecursiveCopyableDataset
> ------------------------------------------------------
>
>                 Key: GOBBLIN-1669
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1669
>             Project: Apache Gobblin
>          Issue Type: Improvement
>          Components: gobblin-service
>            Reporter: William Lo
>            Assignee: Abhishek Tiwari
>            Priority: Major
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> # Support seconds with the time iterator
> # Optimize non-nested timestamp representations, e.g. yyyy-mm-dd-hh-mm-ss, 
> to not use an iterator and instead list the top-level directory, reducing 
> the number of FS calls.
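
The second item in the description can be sketched roughly as follows, assuming a flat layout where each directory name is a full timestamp in the pattern used by the test (yyyy-MM-dd-HH-mm-ss). This is not the implementation in the PR: `FlatTimestampListing` and `selectInWindow` are hypothetical names, the list of names stands in for the result of a single directory listing, and `java.time` is used in place of Joda-Time to keep the example dependency-free.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; names here do not come from Gobblin.
public class FlatTimestampListing {
  static final DateTimeFormatter FORMATTER =
      DateTimeFormatter.ofPattern("yyyy-MM-dd-HH-mm-ss");

  /**
   * Given the names returned by one listing of the top-level directory,
   * keeps those that parse as timestamps within [start, end].
   */
  static List<String> selectInWindow(List<String> listedNames,
      LocalDateTime start, LocalDateTime end) {
    List<String> selected = new ArrayList<>();
    for (String name : listedNames) {
      try {
        LocalDateTime ts = LocalDateTime.parse(name, FORMATTER);
        if (!ts.isBefore(start) && !ts.isAfter(end)) {
          selected.add(name);
        }
      } catch (DateTimeParseException e) {
        // Not a timestamp-named directory; skip it.
      }
    }
    return selected;
  }

  public static void main(String[] args) {
    List<String> listed =
        List.of("2022-07-19-10-15-42", "2022-07-01-00-00-00", "_SUCCESS");
    LocalDateTime end = LocalDateTime.of(2022, 7, 20, 0, 0, 0);
    System.out.println(selectInWindow(listed, end.minusDays(2), end));
    // prints [2022-07-19-10-15-42]
  }
}
```

With this approach the work scales with the number of entries in the directory rather than with the number of seconds in the lookback window.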



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
