[jira] [Commented] (HIVE-6809) Support bulk deleting directories for partition drop with partial spec

2014-09-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118073#comment-14118073
 ] 

Hive QA commented on HIVE-6809:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12665716/HIVE-6809.6.patch.txt

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 6135 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes
org.apache.hive.hcatalog.pig.TestOrcHCatLoader.testReadDataPrimitiveTypes
org.apache.hive.service.TestHS2ImpersonationWithRemoteMS.testImpersonation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/599/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/599/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-599/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12665716

> Support bulk deleting directories for partition drop with partial spec
> --
>
> Key: HIVE-6809
> URL: https://issues.apache.org/jira/browse/HIVE-6809
> Project: Hive
> Issue Type: Improvement
> Components: Query Processor
> Reporter: Navis
> Assignee: Navis
> Attachments: HIVE-6809.1.patch.txt, HIVE-6809.2.patch.txt, 
> HIVE-6809.3.patch.txt, HIVE-6809.4.patch.txt, HIVE-6809.5.patch.txt, 
> HIVE-6809.6.patch.txt
>
>
> On a busy Hadoop system, dropping many partitions takes much longer than 
> expected. In hive-0.11.0, removing 1700 partitions with a single partial spec 
> took 90 minutes, which dropped to 3 minutes when deleteData was set to false. 
> I couldn't test this on a recent Hive, which has HIVE-6256, but if most of 
> the time is spent removing directories, that change seems unlikely to 
> reduce the overall processing time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6809) Support bulk deleting directories for partition drop with partial spec

2014-05-26 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14009186#comment-14009186
 ] 

Navis commented on HIVE-6809:
-

1. In DDLTask(3942), multiple simple specs can be given, and they are 
executed one at a time in a loop. I didn't think that was a common case, though. 
{noformat}
for (Map spec : simpleSpecs) {
  List dropped =
      db.dropPartitions(tbl.getDbName(), tbl.getTableName(), toPartValues(tbl, spec), true);
  droppedParts.addAll(dropped);
}
{noformat}

2. In HiveMetaStore(2247), this loop 
{noformat}
for (Partition part : parts) {
  // copy values, which would be removed after drop
  part.setValues(new ArrayList<String>(part.getValues()));
  if (!ms.dropPartition(db_name, tbl_name, part.getValues())) {
    throw new MetaException("Unable to drop partition");
  }
}
{noformat}
could simply be changed to 
{noformat}
for (Partition part : parts) {
  // copy values, which would be removed after drop
  part.setValues(new ArrayList<String>(part.getValues()));
}
ms.dropPartitions(db_name, tbl_name, Arrays.asList(partName));
{noformat}

3. I tried to use the new API with DropPartitionsRequest, but it's too 
complicated, and I wasn't fully convinced that a simple List should be 
converted into ExprDescs, then into binary, and back again. It might be 
possible to use that API, but I felt uncomfortable with it.
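
To make the difference in point 2 concrete, here is a minimal sketch that counts calls to the store for a one-by-one drop versus a single bulk drop. The class BulkDropSketch and its methods are hypothetical stand-ins written for illustration; they are not Hive's RawStore or the code in the patch, and the store is just an in-memory set of partition names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory stand-in for the metastore's backing store.
// None of these names are Hive APIs; they only illustrate call counts.
class BulkDropSketch {
    private final Set<String> partNames = new LinkedHashSet<>();
    private int storeCalls = 0; // number of round trips made to the store

    BulkDropSketch(List<String> initial) {
        partNames.addAll(initial);
    }

    // Drops one partition per call, like the loop in the first snippet.
    boolean dropPartition(String partName) {
        storeCalls++;
        return partNames.remove(partName);
    }

    // Drops every listed partition in a single call and reports how many existed.
    int dropPartitions(List<String> names) {
        storeCalls++;
        int dropped = 0;
        for (String name : names) {
            if (partNames.remove(name)) {
                dropped++;
            }
        }
        return dropped;
    }

    int storeCalls() {
        return storeCalls;
    }

    int remaining() {
        return partNames.size();
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("ds=1", "ds=2", "ds=3");

        BulkDropSketch oneByOne = new BulkDropSketch(names);
        for (String name : names) {
            if (!oneByOne.dropPartition(name)) {
                throw new IllegalStateException("Unable to drop partition " + name);
            }
        }

        BulkDropSketch bulk = new BulkDropSketch(names);
        bulk.dropPartitions(new ArrayList<>(names));

        System.out.println("one-by-one calls: " + oneByOne.storeCalls());
        System.out.println("bulk calls: " + bulk.storeCalls());
    }
}
```

In this toy model the only saving is the number of calls; whether the real metastore saves time the same way depends on how much of the cost is per call versus per directory removed.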



[jira] [Commented] (HIVE-6809) Support bulk deleting directories for partition drop with partial spec

2014-05-26 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14008977#comment-14008977
 ] 

Ashutosh Chauhan commented on HIVE-6809:


API design is one concern, but my previous question was about another issue. 
The patch currently checks whether the partitions to be dropped have a 
'simpleSpec' (i.e. only string partition columns with equality). If so, it 
drops all those partitions in one API call; if not, it uses the same API but 
drops them one by one in a for-loop. I would have assumed that we can do a 
bulk drop in both cases. Can you explain why it's better to drop one 
partition at a time if it is not a 'simpleSpec'?
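
As a rough illustration of the 'simpleSpec' condition described above (only string partition columns, compared with equality), a check along these lines could classify a spec. The Condition class and isSimpleSpec method are hypothetical names invented for this sketch; the patch's actual check is not shown in this thread.

```java
import java.util.Arrays;
import java.util.List;

class SimpleSpecSketch {
    // Hypothetical description of one condition on a partition column.
    static final class Condition {
        final String column;
        final String columnType; // e.g. "string", "int"
        final String operator;   // e.g. "=", "<", ">="
        final String value;

        Condition(String column, String columnType, String operator, String value) {
            this.column = column;
            this.columnType = columnType;
            this.operator = operator;
            this.value = value;
        }
    }

    // A spec counts as "simple" here when every condition is an equality
    // on a string-typed partition column. An empty spec is treated as simple.
    static boolean isSimpleSpec(List<Condition> spec) {
        for (Condition c : spec) {
            if (!"string".equals(c.columnType) || !"=".equals(c.operator)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Condition> simple = Arrays.asList(
                new Condition("ds", "string", "=", "2014-05-26"));
        List<Condition> notSimple = Arrays.asList(
                new Condition("ds", "string", "=", "2014-05-26"),
                new Condition("hr", "int", "<", "12"));

        System.out.println("simple: " + isSimpleSpec(simple));
        System.out.println("notSimple: " + isSimpleSpec(notSimple));
    }
}
```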



[jira] [Commented] (HIVE-6809) Support bulk deleting directories for partition drop with partial spec

2014-05-26 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14008639#comment-14008639
 ] 

Navis commented on HIVE-6809:
-

Because it's way simpler. DropPartitionsRequest/Result felt too complicated to 
me for something as simple as this issue.

For direct thrift users, I would suggest using "bool 
drop_partition_by_name()" instead of "bool drop_partition()". For users of 
IMetaStoreClient, nothing changes.
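
For readers unfamiliar with the by-name variant: it identifies a partition by a name string such as ds=2014-05-26/hr=00 rather than by a list of values. The sketch below shows how such a name can be assembled from column names and values. The helper joinPartName is a hypothetical local function, and it assumes values contain no special characters; Hive escapes such characters, which this sketch deliberately leaves out.

```java
import java.util.Arrays;
import java.util.List;

class PartNameSketch {
    // Joins partition columns and values into "col1=val1/col2=val2".
    // Assumption: no escaping is applied, so values must not contain '/' or '='.
    static String joinPartName(List<String> cols, List<String> vals) {
        if (cols.size() != vals.size()) {
            throw new IllegalArgumentException("columns and values differ in length");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < cols.size(); i++) {
            if (i > 0) {
                sb.append('/');
            }
            sb.append(cols.get(i)).append('=').append(vals.get(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String name = joinPartName(
                Arrays.asList("ds", "hr"),
                Arrays.asList("2014-05-26", "00"));
        System.out.println(name);
    }
}
```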



[jira] [Commented] (HIVE-6809) Support bulk deleting directories for partition drop with partial spec

2014-05-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13992839#comment-13992839
 ] 

Hive QA commented on HIVE-6809:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12643703/HIVE-6809.5.patch.txt

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5429 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_partscan_1_23
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/145/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/145/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12643703



[jira] [Commented] (HIVE-6809) Support bulk deleting directories for partition drop with partial spec

2014-04-28 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983594#comment-13983594
 ] 

Sergey Shelukhin commented on HIVE-6809:


This appears to have stalled. I still wonder about breaking APIs and why the 
two optimizations in the metastore have to be separate. 



[jira] [Commented] (HIVE-6809) Support bulk deleting directories for partition drop with partial spec

2014-04-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1396#comment-1396
 ] 

Hive QA commented on HIVE-6809:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12639100/HIVE-6809.4.patch.txt

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5551 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_dyn_part
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2176/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2176/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12639100



[jira] [Commented] (HIVE-6809) Support bulk deleting directories for partition drop with partial spec

2014-04-08 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963194#comment-13963194
 ] 

Sergey Shelukhin commented on HIVE-6809:


Can you update RB also?



[jira] [Commented] (HIVE-6809) Support bulk deleting directories for partition drop with partial spec

2014-04-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13961767#comment-13961767
 ] 

Hive QA commented on HIVE-6809:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12638949/HIVE-6809.3.patch.txt

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5549 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_reduce_deduplicate
org.apache.hcatalog.security.TestHdfsAuthorizationProvider.testTableOps
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2160/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2160/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12638949



[jira] [Commented] (HIVE-6809) Support bulk deleting directories for partition drop with partial spec

2014-04-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13959849#comment-13959849
 ] 

Hive QA commented on HIVE-6809:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12638396/HIVE-6809.2.patch.txt

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5547 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_drop_partitions_partialspec
org.apache.hcatalog.security.TestHdfsAuthorizationProvider.testDropPartitionFail1
org.apache.hcatalog.security.TestHdfsAuthorizationProvider.testDropPartitionFail2
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2106/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2106/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12638396



[jira] [Commented] (HIVE-6809) Support bulk deleting directories for partition drop with partial spec

2014-04-03 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13959169#comment-13959169
 ] 

Sergey Shelukhin commented on HIVE-6809:


At the code level, the patch looks OK.

1) I am not sure how acceptable it is to break thrift APIs.
[~ashutoshc] can you comment?

I wonder again whether you can augment the existing dropPartitions req/resp-based API.
Or, if not, maybe convert the existing ones to req/resp if they are changing anyway, 
so that it would be easier to change them later.

2) Can this also use the bulk delete path from the database? 







[jira] [Commented] (HIVE-6809) Support bulk deleting directories for partition drop with partial spec

2014-04-02 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13958170#comment-13958170
 ] 

Sergey Shelukhin commented on HIVE-6809:


Left some comments on RB. My main concern is the breaking change to APIs that 
hcat and the like might be using. 
It's also a bit concerning that, depending on the partition spec, the path will 
be different and a different optimization will apply.
Can we do this on the "new" path, and just keep the old path as it is, as if 
it were deprecated?



[jira] [Commented] (HIVE-6809) Support bulk deleting directories for partition drop with partial spec

2014-04-02 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13958146#comment-13958146
 ] 

Sergey Shelukhin commented on HIVE-6809:


looking



[jira] [Commented] (HIVE-6809) Support bulk deleting directories for partition drop with partial spec

2014-04-02 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13958005#comment-13958005
 ] 

Ashutosh Chauhan commented on HIVE-6809:


cc: [~sershe] Can you take a look?



[jira] [Commented] (HIVE-6809) Support bulk deleting directories for partition drop with partial spec

2014-04-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13957816#comment-13957816
 ] 

Hive QA commented on HIVE-6809:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12638200/HIVE-6809.1.patch.txt

{color:red}ERROR:{color} -1 due to 130 failed/errored test(s), 5540 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_view_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_drop_multi_partitions
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_protectmode
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_alter_partition_nodrop
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_drop_partition_failure
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_drop_partition_filter_failure
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_protectmode_part_no_drop
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_protectmode_tbl7
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_protectmode_tbl8
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_sa_fail_hook3
org.apache.hadoop.hive.metastore.TestHiveMetaStoreWithEnvironmentContext.testEnvironmentContext
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testAlterPartition
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testAlterTable
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testAlterViewParititon
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testColumnStatistics
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testComplexTable
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testComplexTypeApi
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testConcurrentMetastores
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testDBOwner
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testDBOwnerChange
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testDatabase
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testDatabaseLocation
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testDatabaseLocationWithPermissionProblems
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testDropTable
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testFilterLastPartition
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testFilterSinglePartition
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testFunctionWithResources
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testGetConfigValue
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testListPartitionNames
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testListPartitions
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testPartition
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testPartitionFilter
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testRenamePartition
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testSimpleFunction
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testSimpleTable
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testSimpleTypeApi
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testSynchronized
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testTableDatabase
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testTableFilter
org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testAlterPartition
org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testAlterTable
org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testAlterViewParititon
org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testColumnStatistics
org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testComplexTable
org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testComplexTypeApi
org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testConcurrentMetastores
org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testDBOwner
org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testDBOwnerChange
org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testDatabase
org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testDatabaseLocation
org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testDatabaseLocationWithPermissionProblems
org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testDropTable
org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testFilterLastPartition
org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testFilterSinglePartition
org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testFunctionWithResources
org.apache