[ 
https://issues.apache.org/jira/browse/GOBBLIN-1602?focusedWorklogId=721350&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-721350
 ]

ASF GitHub Bot logged work on GOBBLIN-1602:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 05/Feb/22 02:25
            Start Date: 05/Feb/22 02:25
    Worklog Time Spent: 10m 
      Work Description: phet commented on a change in pull request #3459:
URL: https://github.com/apache/gobblin/pull/3459#discussion_r799924710



##########
File path: 
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/hive/HiveCopyEntityHelper.java
##########
@@ -750,9 +751,14 @@ else if (desiredTargetExistingPaths.size() > 0) {
 
   private void checkPartitionedTableCompatibility(Table desiredTargetTable, Table existingTargetTable)
       throws IOException {
-    if (!desiredTargetTable.getDataLocation().equals(existingTargetTable.getDataLocation())) {
-      throw new HiveTableLocationNotMatchException(desiredTargetTable.getDataLocation(),
-          existingTargetTable.getDataLocation());
+    try {
+      if (!this.targetFs.resolvePath(desiredTargetTable.getDataLocation())
+          .equals(this.targetFs.resolvePath(existingTargetTable.getDataLocation()))) {
+        throw new HiveTableLocationNotMatchException(desiredTargetTable.getDataLocation(), existingTargetTable.getDataLocation());

Review comment:
       the exception args no longer reflect the equality operands.  could this 
message become confusing (e.g. if the two locations match as written, but 
differ after mapping through `this.targetFs.resolvePath()`)?

##########
File path: 
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/hive/UnpartitionedTableFileSet.java
##########
@@ -64,11 +65,21 @@ public UnpartitionedTableFileSet(String name, HiveDataset dataset, HiveCopyEntit
 
     Optional<Table> existingTargetTable = this.helper.getExistingTargetTable();
     if (existingTargetTable.isPresent()) {
-      if (!this.helper.getTargetTable().getDataLocation().equals(existingTargetTable.get().getDataLocation())) {
+      boolean path_mismatch = false;
+      try {
+        if (!this.helper.getTargetFs().resolvePath(this.helper.getTargetTable().getDataLocation())
+            .equals(this.helper.getTargetFs().resolvePath(existingTargetTable.get().getDataLocation()))) {
+          path_mismatch = true;
+        }
+      } catch (FileNotFoundException e) {
+        // If desired path does not exist, then user is defining a different snapshot path so check policy
+        path_mismatch = true;
+      }
+      if (path_mismatch) {

Review comment:
       I was actually contemplating something similar (a boolean flag) above, when I 
saw two code paths throwing the same exception.  since java has no widespread, 
canonical lib for converting exception control flow into values (like scala's 
https://www.scala-lang.org/api/2.12.4/scala/util/control/Exception$.html ) 
there's no simple way to phrase this.
   
   as this pattern already occurs twice, I'd seek a utility abstraction.  maybe 
just a static method taking a `FileSystem` and two 'locations' (what we call 
`.getDataLocation()` on).  it would be an equality predicate (catching any 
exception within and converting it to `false`).
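
   a minimal sketch of what such a predicate could look like.  everything named 
here is hypothetical: `PathResolver` is a stand-in for the one `FileSystem` call 
the comparison needs (`resolvePath`), so that the sketch compiles without 
Hadoop, and `isSameResolvedLocation` / `aliasResolver` are illustrative names, 
not existing Gobblin code.  it also assumes that catching `IOException` is broad 
enough for "any exception".

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.Map;

public class LocationEquality {

  /** Hypothetical stand-in for FileSystem#resolvePath, so the sketch needs no Hadoop dependency. */
  @FunctionalInterface
  public interface PathResolver {
    String resolve(String location) throws IOException;
  }

  /**
   * Returns true only if both locations resolve to the same path.
   * Any IOException (including FileNotFoundException) is converted to false.
   */
  public static boolean isSameResolvedLocation(PathResolver fs, String desired, String existing) {
    try {
      return fs.resolve(desired).equals(fs.resolve(existing));
    } catch (IOException e) {
      // Assumption: a location that cannot be resolved counts as a mismatch.
      return false;
    }
  }

  /** Toy resolver backed by an alias table; unknown locations are reported as missing. */
  public static PathResolver aliasResolver(Map<String, String> aliases) {
    return location -> {
      String resolved = aliases.get(location);
      if (resolved == null) {
        throw new FileNotFoundException(location);
      }
      return resolved;
    };
  }

  public static void main(String[] args) {
    PathResolver fs = aliasResolver(Map.of(
        "/data/db/table", "/data/db/table",
        "/symlink/table", "/data/db/table"));
    // Two spellings of the same underlying location.
    System.out.println(isSameResolvedLocation(fs, "/symlink/table", "/data/db/table"));
    // One location does not exist, so the predicate reports a mismatch.
    System.out.println(isSameResolvedLocation(fs, "/data/db/table", "/missing/table"));
  }
}
```

   with something like this, each call site would reduce to a single 
`if (!isSameResolvedLocation(...))`, and the `try`/`catch` plus boolean flag 
would live in one place.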

##########
File path: 
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/hive/HivePartitionFileSet.java
##########
@@ -194,9 +194,10 @@ private Partition getTargetPartition(Partition originPartition, Path targetLocat
     }
   }
 
-  private static void checkPartitionCompatibility(Partition desiredTargetPartition, Partition existingTargetPartition)
+  private void checkPartitionCompatibility(Partition desiredTargetPartition, Partition existingTargetPartition)
       throws IOException {
-    if (!desiredTargetPartition.getDataLocation().equals(existingTargetPartition.getDataLocation())) {
+    if (!hiveCopyEntityHelper.getTargetFs().resolvePath(desiredTargetPartition.getDataLocation())
+        .equals(hiveCopyEntityHelper.getTargetFs().resolvePath(existingTargetPartition.getDataLocation()))) {

Review comment:
       same question here about the exception args no longer paralleling the 
comparison




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 721350)
    Time Spent: 1h  (was: 50m)

> Handle hive table mismatch when paths are equivalent in the underlying FS
> -------------------------------------------------------------------------
>
>                 Key: GOBBLIN-1602
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1602
>             Project: Apache Gobblin
>          Issue Type: Task
>          Components: gobblin-core
>            Reporter: William Lo
>            Assignee: Abhishek Tiwari
>            Priority: Major
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> In scenarios where the paths are equivalent in the underlying FS, hive copy 
> should not treat these paths separately if the user provided URI does not 
> match the hive registered URI



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
