[jira] [Updated] (HIVE-18696) The partition folders might not get cleaned up properly in the HiveMetaStore.add_partitions_core method if an exception occurs

2018-05-22 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-18696:
---
Fix Version/s: 4.0.0  (was: 3.1.0)

> The partition folders might not get cleaned up properly in the 
> HiveMetaStore.add_partitions_core method if an exception occurs
> --
>
> Key: HIVE-18696
> URL: https://issues.apache.org/jira/browse/HIVE-18696
> Project: Hive
> Issue Type: Bug
> Components: Metastore
> Reporter: Marta Kuczora
> Assignee: Marta Kuczora
> Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-18696.1.patch, HIVE-18696.2.patch, HIVE-18696.3.patch, 
> HIVE-18696.4.patch, HIVE-18696.5.patch, HIVE-18696.6.patch
>
>
> When adding multiple partitions, if one of them cannot be created 
> successfully, none of the partitions are created, but the folders might not 
> be cleaned up properly. See the test case "testAddPartitionsOneInvalid" in 
> the TestAddPartitions test.
> This is the problematic code in the HiveMetaStore.add_partitions_core method:
> {code:java}
> for (final Partition part : parts) {
>   if (!part.getTableName().equals(tblName) || !part.getDbName().equals(dbName)) {
>     throw new MetaException("Partition does not belong to target table "
>         + dbName + "." + tblName + ": " + part);
>   }
>   boolean shouldAdd = startAddPartition(ms, part, ifNotExists);
>   if (!shouldAdd) {
>     existingParts.add(part);
>     LOG.info("Not adding partition " + part + " as it already exists");
>     continue;
>   }
>   final UserGroupInformation ugi;
>   try {
>     ugi = UserGroupInformation.getCurrentUser();
>   } catch (IOException e) {
>     throw new RuntimeException(e);
>   }
>   partFutures.add(threadPool.submit(new Callable() {
>     @Override
>     public Partition call() throws Exception {
>       ugi.doAs(new PrivilegedExceptionAction() {
>         @Override
>         public Object run() throws Exception {
>           try {
>             boolean madeDir = createLocationForAddedPartition(table, part);
>             if (addedPartitions.put(new PartValEqWrapper(part), madeDir) != null) {
>               // Technically, for ifNotExists case, we could insert one and discard the other
>               // because the first one now "exists", but it seems better to report the problem
>               // upstream as such a command doesn't make sense.
>               throw new MetaException("Duplicate partitions in the list: " + part);
>             }
>             initializeAddedPartition(table, part, madeDir);
>           } catch (MetaException e) {
>             throw new IOException(e.getMessage(), e);
>           }
>           return null;
>         }
>       });
>       return part;
>     }
>   }));
> }
> {code}
> Suppose that, while iterating over the partitions, the tasks that create the 
> folders are submitted successfully for the first two partitions, but an 
> exception occurs for the third partition before its task is submitted. (This 
> can happen if the partition has a different table or db name than the others, 
> or if it has an invalid value.)
> In this case, execution jumps to the finally block, where the folders 
> in the "addedPartitions" map are cleaned up. However, the tasks for the first 
> two partitions may not have finished creating their folders yet, so the map 
> can be empty or contain only one of the partitions, and the folders created 
> afterwards are never removed.
> This issue also occurs in the HiveMetaStore.add_partitions_pspec_core 
> method, because this part of the code is the same as in add_partitions_core.
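
The race described above can be modelled outside Hive. The following is a minimal sketch, not the code from the attached patches: the class AddPartitionsCleanupSketch, the method addPartitions, the waitForTasks flag, and the slowDisk latch are all hypothetical names introduced for illustration, and an in-memory set stands in for the file system. The latch holds the folder-creation tasks back so that, in the racy variant, they always finish after cleanup has run. Waiting on the submitted futures before cleanup is one possible way to close the gap.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-alone model of the race; none of these names are Hive APIs.
class AddPartitionsCleanupSketch {

  /**
   * Simulates adding partitions where an empty name counts as invalid.
   * Returns the folders that still exist after the failed call.
   */
  static Set<String> addPartitions(List<String> parts, boolean waitForTasks) throws Exception {
    Set<String> fileSystem = ConcurrentHashMap.newKeySet();           // folders on "disk"
    Map<String, Boolean> addedPartitions = new ConcurrentHashMap<>(); // cleanup bookkeeping
    List<Future<?>> partFutures = new ArrayList<>();
    CountDownLatch slowDisk = new CountDownLatch(1);                  // closed = creation still pending
    ExecutorService threadPool = Executors.newFixedThreadPool(2);
    try {
      for (String part : parts) {
        if (part.isEmpty()) {
          // Validation fails before the task for this partition is submitted.
          throw new IllegalArgumentException("Invalid partition");
        }
        partFutures.add(threadPool.submit(() -> {
          slowDisk.await();                 // the folder is created "later"
          fileSystem.add(part);
          addedPartitions.put(part, true);
          return null;
        }));
      }
    } catch (IllegalArgumentException e) {
      // Swallowed only so that the sketch can report the leftover folders.
    } finally {
      if (waitForTasks) {
        slowDisk.countDown();               // let the pending tasks run ...
        for (Future<?> f : partFutures) {
          f.get();                          // ... and wait until each one has finished
        }
      }
      // Cleanup: remove every folder recorded in the map so far.
      for (String part : addedPartitions.keySet()) {
        fileSystem.remove(part);
      }
      slowDisk.countDown();                 // in the racy variant the tasks only finish now
      threadPool.shutdown();
      threadPool.awaitTermination(10, TimeUnit.SECONDS);
    }
    return new TreeSet<>(fileSystem);
  }

  public static void main(String[] args) throws Exception {
    List<String> parts = Arrays.asList("year=2017", "year=2018", "");
    System.out.println("cleanup without waiting: " + addPartitions(parts, false));
    System.out.println("cleanup after waiting:   " + addPartitions(parts, true));
  }
}
```

With waitForTasks set to false, cleanup sees an empty map and both folders are left behind. With it set to true, the map is complete by the time cleanup runs and no folders remain. Validating all partitions before submitting any task would be another way to avoid the problem.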



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18696) The partition folders might not get cleaned up properly in the HiveMetaStore.add_partitions_core method if an exception occurs

2018-04-10 Thread Peter Vary (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-18696:
--
   Resolution: Fixed
Fix Version/s: 3.1.0
       Status: Resolved  (was: Patch Available)

Pushed to master.
Thanks for the patch [~kuczoram]!



[jira] [Updated] (HIVE-18696) The partition folders might not get cleaned up properly in the HiveMetaStore.add_partitions_core method if an exception occurs

2018-04-05 Thread Marta Kuczora (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marta Kuczora updated HIVE-18696:
-
Attachment: HIVE-18696.6.patch



[jira] [Updated] (HIVE-18696) The partition folders might not get cleaned up properly in the HiveMetaStore.add_partitions_core method if an exception occurs

2018-03-27 Thread Marta Kuczora (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marta Kuczora updated HIVE-18696:
-
Attachment: HIVE-18696.5.patch



[jira] [Updated] (HIVE-18696) The partition folders might not get cleaned up properly in the HiveMetaStore.add_partitions_core method if an exception occurs

2018-03-27 Thread Marta Kuczora (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marta Kuczora updated HIVE-18696:
-
Attachment: HIVE-18696.4.patch



[jira] [Updated] (HIVE-18696) The partition folders might not get cleaned up properly in the HiveMetaStore.add_partitions_core method if an exception occurs

2018-03-26 Thread Marta Kuczora (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marta Kuczora updated HIVE-18696:
-
Attachment: HIVE-18696.3.patch



[jira] [Updated] (HIVE-18696) The partition folders might not get cleaned up properly in the HiveMetaStore.add_partitions_core method if an exception occurs

2018-03-08 Thread Marta Kuczora (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marta Kuczora updated HIVE-18696:
-
Attachment: HIVE-18696.2.patch



[jira] [Updated] (HIVE-18696) The partition folders might not get cleaned up properly in the HiveMetaStore.add_partitions_core method if an exception occurs

2018-02-20 Thread Marta Kuczora (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marta Kuczora updated HIVE-18696:
-
Status: Patch Available  (was: Open)



[jira] [Updated] (HIVE-18696) The partition folders might not get cleaned up properly in the HiveMetaStore.add_partitions_core method if an exception occurs

2018-02-20 Thread Marta Kuczora (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marta Kuczora updated HIVE-18696:
-
Attachment: HIVE-18696.1.patch

> The partition folders might not get cleaned up properly in the 
> HiveMetaStore.add_partitions_core method if an exception occurs
> --
>
> Key: HIVE-18696
> URL: https://issues.apache.org/jira/browse/HIVE-18696
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Major
> Attachments: HIVE-18696.1.patch
>
>
> When trying to add multiple partitions, but one of them cannot be created 
> successfully, none of the partitions are created, but the folders might not 
> be cleaned up properly. See the test case "testAddPartitionsOneInvalid" in 
> the TestAddPartitions test.
> This is the problematic code in the HiveMetaStore.add_partitions_core method:
> {code:java}
> for (final Partition part : parts) {
>   if (!part.getTableName().equals(tblName) || 
> !part.getDbName().equals(dbName)) {
> throw new MetaException("Partition does not belong to target 
> table "
> + dbName + "." + tblName + ": " + part);
>   }
>   boolean shouldAdd = startAddPartition(ms, part, ifNotExists);
>   if (!shouldAdd) {
> existingParts.add(part);
> LOG.info("Not adding partition " + part + " as it already 
> exists");
> continue;
>   }
>   final UserGroupInformation ugi;
>   try {
> ugi = UserGroupInformation.getCurrentUser();
>   } catch (IOException e) {
> throw new RuntimeException(e);
>   }
>   partFutures.add(threadPool.submit(new Callable() {
> @Override
> public Partition call() throws Exception {
>   ugi.doAs(new PrivilegedExceptionAction() {
> @Override
> public Object run() throws Exception {
>   try {
> boolean madeDir = createLocationForAddedPartition(table, 
> part);
> if (addedPartitions.put(new PartValEqWrapper(part), 
> madeDir) != null) {
>   // Technically, for ifNotExists case, we could insert 
> one and discard the other
>   // because the first one now "exists", but it seems 
> better to report the problem
>   // upstream as such a command doesn't make sense.
>   throw new MetaException("Duplicate partitions in the 
> list: " + part);
> }
> initializeAddedPartition(table, part, madeDir);
>   } catch (MetaException e) {
> throw new IOException(e.getMessage(), e);
>   }
>   return null;
> }
>   });
>   return part;
> }
>   }));
> }
> {code}
> When iterating over the partitions, suppose the tasks that create the folders 
> for the first two partitions are submitted successfully, but an exception 
> occurs for the third partition before its task is submitted. (This can happen 
> if the partition has a different table or database name than the others, or 
> if it has an invalid value.)
> In this case execution jumps to the finally block, where the folders recorded 
> in the "addedPartitions" map are cleaned up. However, the tasks for the first 
> two partitions may not have finished creating their folders yet, so the map 
> can be empty or contain only one of the partitions. Any folder created after 
> the cleanup has run is then left behind.
> This issue also occurs in the HiveMetaStore.add_partitions_pspec_core 
> method, because this part of the code is the same as in add_partitions_core.





[jira] [Updated] (HIVE-18696) The partition folders might not get cleaned up properly in the HiveMetaStore.add_partitions_core method if an exception occurs

2018-02-13 Thread Marta Kuczora (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marta Kuczora updated HIVE-18696:
-
Description: 
When adding multiple partitions, if one of them cannot be created 
successfully, none of the partitions are created, but the folders might not be 
cleaned up properly. See the test case "testAddPartitionsOneInvalid" in the 
TestAddPartitions test.

This is the problematic code in the HiveMetaStore.add_partitions_core method:
{code:java}
for (final Partition part : parts) {
  if (!part.getTableName().equals(tblName) || !part.getDbName().equals(dbName)) {
    throw new MetaException("Partition does not belong to target table "
        + dbName + "." + tblName + ": " + part);
  }

  boolean shouldAdd = startAddPartition(ms, part, ifNotExists);
  if (!shouldAdd) {
    existingParts.add(part);
    LOG.info("Not adding partition " + part + " as it already exists");
    continue;
  }

  final UserGroupInformation ugi;
  try {
    ugi = UserGroupInformation.getCurrentUser();
  } catch (IOException e) {
    throw new RuntimeException(e);
  }

  partFutures.add(threadPool.submit(new Callable<Partition>() {
    @Override
    public Partition call() throws Exception {
      ugi.doAs(new PrivilegedExceptionAction<Object>() {
        @Override
        public Object run() throws Exception {
          try {
            boolean madeDir = createLocationForAddedPartition(table, part);
            if (addedPartitions.put(new PartValEqWrapper(part), madeDir) != null) {
              // Technically, for ifNotExists case, we could insert one and discard the other
              // because the first one now "exists", but it seems better to report the problem
              // upstream as such a command doesn't make sense.
              throw new MetaException("Duplicate partitions in the list: " + part);
            }
            initializeAddedPartition(table, part, madeDir);
          } catch (MetaException e) {
            throw new IOException(e.getMessage(), e);
          }
          return null;
        }
      });
      return part;
    }
  }));
}
{code}
When iterating over the partitions, suppose the tasks that create the folders 
for the first two partitions are submitted successfully, but an exception 
occurs for the third partition before its task is submitted. (This can happen 
if the partition has a different table or database name than the others, or if 
it has an invalid value.)
In this case execution jumps to the finally block, where the folders recorded 
in the "addedPartitions" map are cleaned up. However, the tasks for the first 
two partitions may not have finished creating their folders yet, so the map 
can be empty or contain only one of the partitions. Any folder created after 
the cleanup has run is then left behind.

This issue also occurs in the HiveMetaStore.add_partitions_pspec_core method, 
because this part of the code is the same as in add_partitions_core.
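
The interleaving described above can be reproduced in isolation. The following is a minimal sketch, not the Hive code: the class and method names are hypothetical, plain strings stand in for Partition objects, an in-memory set stands in for the file system, and a CountDownLatch is used only to force the unlucky ordering in which the cleanup runs before the submitted tasks have created their folders. Waiting for the already submitted futures before cleaning up is shown as one possible remedy; it is an assumption for illustration and not necessarily what the attached patches do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the add_partitions_core loop; not Hive code.
class AddPartitionsRaceSketch {

  /** Returns the folders that still exist after the cleanup (ideally none). */
  static List<String> addAll(List<String> parts, boolean waitBeforeCleanup) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    Map<String, Boolean> addedPartitions = new ConcurrentHashMap<>();
    Set<String> foldersOnDisk = ConcurrentHashMap.newKeySet(); // fake file system
    List<Future<String>> futures = new ArrayList<>();
    CountDownLatch cleanupDone = new CountDownLatch(1);
    try {
      for (String part : parts) {
        // Validation fails on the calling thread, before the task is submitted.
        if (part.startsWith("invalid")) {
          throw new IllegalArgumentException("Bad partition: " + part);
        }
        futures.add(pool.submit(() -> {
          if (!waitBeforeCleanup) {
            cleanupDone.await(); // force the case where cleanup runs first
          }
          foldersOnDisk.add(part);         // "create" the folder
          addedPartitions.put(part, true); // record it for cleanup
          return part;
        }));
      }
    } catch (IllegalArgumentException e) {
      // Swallowed only so the sketch can report what is left behind.
    } finally {
      if (waitBeforeCleanup) {
        for (Future<String> f : futures) {
          f.get(); // possible remedy: let submitted tasks finish first
        }
      }
      for (String p : addedPartitions.keySet()) {
        foldersOnDisk.remove(p); // cleanup only sees what is in the map
      }
      cleanupDone.countDown();
      pool.shutdown();
      pool.awaitTermination(10, TimeUnit.SECONDS);
    }
    List<String> left = new ArrayList<>(foldersOnDisk);
    Collections.sort(left);
    return left;
  }

  public static void main(String[] args) throws Exception {
    List<String> parts = Arrays.asList("p1", "p2", "invalid");
    System.out.println("cleanup without waiting: " + addAll(parts, false)); // [p1, p2]
    System.out.println("cleanup after waiting:   " + addAll(parts, true));  // []
  }
}
```

In the first call the map is still empty when the cleanup runs, so both folders survive it; in the second call the cleanup sees both entries and removes them.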

  was:
When adding multiple partitions, if one of them cannot be created 
successfully, none of the partitions are created, but the folders might not be 
cleaned up properly. See the test case "testAddPartitionsOneInvalid" in the 
TestAddPartitions test.

This is the problematic code in the HiveMetaStore.add_partitions_core method:
{code:java}
for (final Partition part : parts) {
  if (!part.getTableName().equals(tblName) || !part.getDbName().equals(dbName)) {
    throw new MetaException("Partition does not belong to target table "
        + dbName + "." + tblName + ": " + part);
  }

  boolean shouldAdd = startAddPartition(ms, part, ifNotExists);
  if (!shouldAdd) {
    existingParts.add(part);
    LOG.info("Not adding partition " + part + " as it already exists");
    continue;
  }

  final UserGroupInformation ugi;
  try {
    ugi = UserGroupInformation.getCurrentUser();
  } catch (IOException e) {
    throw new RuntimeException(e);
  }

  partFutures.add(threadPool.submit(new Callable<Partition>() {
    @Override
    public Partition call() throws Exception {
      ugi.doAs(new PrivilegedExceptionAction<Object>() {
        @Override
        public Object run() throws Exception {
          try {
            boolean madeDir = createLocationForAddedPartition(table, part);
            if (addedPartitions.put(new PartValEqWrapper(part), madeDir) != null) {
              // Technically, for ifNotExists case, we could insert one and discard the other