[ https://issues.apache.org/jira/browse/GOBBLIN-1001?focusedWorklogId=358097&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-358097 ]
ASF GitHub Bot logged work on GOBBLIN-1001:
-------------------------------------------

Author: ASF GitHub Bot
Created on: 11/Dec/19 21:28
Start Date: 11/Dec/19 21:28
Worklog Time Spent: 10m
Work Description: zxcware commented on issue #2846: [GOBBLIN-1001] Implement TimePartitionGlobFinder
URL: https://github.com/apache/incubator-gobblin/pull/2846#issuecomment-564740742

@autumnust Yeah, `yesterdayPartition` is really specific. I'm thinking about generalizing it to `enforcePreviousN` partitions (open to better name suggestions). Its main responsibility is to create an `EmptyFileSystemDataset` if any of the previous N partitions doesn't exist, signaling quiet time. In addition, it focuses on time partitions and supports different time formats (not limited to `yyyy/MM/dd`), unlike the vanilla `DefaultFileSystemGlobFinder`. (I'm adding comments about its usage.)

With `enforcePreviousN`, it is tied even less to company requirements, which makes it more justifiable to open-source. In our use case, we capture the quiet time signal to publish the compaction watermark. Others can capture it to do different operations.

Another consideration is that we would have to make internal copies of the open source compaction constructs (`MRTask`, `Verifier`, `CompactionAction`) if `EmptyFileSystemDataset` were made internal. Compared to making `EmptyFileSystemDataset` a first-class citizen of the open source compaction flow, the implementation and maintenance cost of internalization is high, given that most of our pipelines use open source compaction constructs.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id: (was: 358097)
    Time Spent: 50m (was: 40m)

> Implement TimePartitionGlobFinder
> ---------------------------------
>
> Key: GOBBLIN-1001
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1001
> Project: Apache Gobblin
> Issue Type: Task
> Reporter: Zhixiong Chen
> Assignee: Zhixiong Chen
> Priority: Major
> Time Spent: 50m
> Remaining Estimate: 0h
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
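
The `enforcePreviousN` idea discussed in the comment above (check the previous N time partitions and flag any that do not exist) can be illustrated with a rough sketch. This is not the code from the pull request: the class name, the method `missingPreviousPartitions`, the daily granularity, and the use of an in-memory set in place of a file system listing are all assumptions made for illustration. In the real flow, each missing partition would presumably be represented by an `EmptyFileSystemDataset`.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; names and structure are not taken from Gobblin.
final class PreviousPartitionsSketch {

    /**
     * Returns the partitions among the previous n days (relative to today)
     * that are absent from the set of existing partitions.
     * Assumes daily partitions; pattern is any DateTimeFormatter date pattern,
     * so the layout is not limited to yyyy/MM/dd.
     */
    static List<String> missingPreviousPartitions(
            LocalDate today, int n, String pattern, Set<String> existing) {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern(pattern);
        List<String> missing = new ArrayList<>();
        for (int daysBack = 1; daysBack <= n; daysBack++) {
            String partition = today.minusDays(daysBack).format(formatter);
            if (!existing.contains(partition)) {
                // A missing partition is what would signal "quiet time".
                missing.add(partition);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Stand-in for listing partition directories on the file system.
        Set<String> existing =
                new HashSet<>(Arrays.asList("2019/12/10", "2019/12/08"));
        List<String> missing = missingPreviousPartitions(
                LocalDate.of(2019, 12, 11), 3, "yyyy/MM/dd", existing);
        System.out.println(missing); // prints [2019/12/09]
    }
}
```

Passing a different pattern such as `yyyy-MM-dd` changes only the partition layout, not the lookback logic, which matches the comment's point about supporting more than one time format.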