[ https://issues.apache.org/jira/browse/HIVE-17232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16111781#comment-16111781 ]

Eugene Koifman commented on HIVE-17232:
---------------------------------------

In the case of this test, at least, the failure is due to compaction being invoked via
    txnHandler.compact(new CompactionRequest("default", Table.ACIDTBLPART.name(), CompactionType.MAJOR));
while the target table is partitioned.  Thus, in CompactorMR.run(),
    AcidUtils.Directory dir = AcidUtils.getAcidState(new Path(sd.getLocation()), conf, txns, false, true);
ends up finding the delta.../bucket.. files but treats them as "original" files,
because they are not at the directory level where they are expected.

Compacting all partitions of a partitioned table in one command is not supported.
If compaction is invoked via ALTER TABLE (as it should be), DDLTask.compact()
checks that a partition spec is supplied when the table is partitioned.
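For reference, the supported route is to compact each partition explicitly via the documented ALTER TABLE ... COMPACT syntax; for the table in this test that would look like the following (the partition value p=1 is taken from the log below):

```sql
-- Compact one partition at a time; a partition spec is mandatory for
-- partitioned tables, which is what DDLTask.compact() enforces.
ALTER TABLE acidtblpart PARTITION (p = 1) COMPACT 'major';
```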

TODO: add sanity checks to getAcidState() so that it raises an error if it
finds an unexpected directory layout.
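As a rough sketch of the kind of sanity check proposed above (class and method names here are hypothetical, not existing Hive APIs): a file about to be classified as "original" could be rejected when one of its ancestor directories already matches the delta naming pattern, which is exactly the layout from the log below.

```java
import java.util.regex.Pattern;

// Hypothetical sketch, not actual Hive code: reject "original" files that
// sit inside a delta directory, which indicates getAcidState() was pointed
// at the table root of a partitioned table instead of a partition.
public class AcidLayoutCheck {
  // Matches delta_<minTxn>_<maxTxn> and delta_<minTxn>_<maxTxn>_<stmtId>.
  private static final Pattern DELTA_DIR =
      Pattern.compile("delta_\\d+_\\d+(_\\d+)?");

  /** True if any ancestor directory of this table-relative path is a delta dir. */
  public static boolean isInsideDeltaDir(String relativePath) {
    String[] parts = relativePath.split("/");
    for (int i = 0; i < parts.length - 1; i++) { // skip the file name itself
      if (DELTA_DIR.matcher(parts[i]).matches()) {
        return true;
      }
    }
    return false;
  }

  /** Raise instead of silently treating a misplaced bucket file as "original". */
  public static void checkOriginalFile(String relativePath) {
    if (isInsideDeltaDir(relativePath)) {
      throw new IllegalStateException(
          "Unexpected directory layout: 'original' file " + relativePath
          + " is inside a delta directory");
    }
  }

  public static void main(String[] args) {
    // The layout from the log below: compaction ran against the table root,
    // so the bucket file sits one level deeper (under p=1) than expected.
    String misplaced = "p=1/delta_0000013_0000013_0000/bucket_00001";
    String trueOriginal = "000000_0"; // a genuine pre-ACID file at the root
    System.out.println(isInsideDeltaDir(misplaced));    // true
    System.out.println(isInsideDeltaDir(trueOriginal)); // false
    try {
      checkOriginalFile(misplaced);
    } catch (IllegalStateException expected) {
      System.out.println("rejected: " + expected.getMessage());
    }
  }
}
```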



>  "No match found"  Compactor finds a bucket file thinking it's a directory
> --------------------------------------------------------------------------
>
>                 Key: HIVE-17232
>                 URL: https://issues.apache.org/jira/browse/HIVE-17232
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>
> {noformat}
> 2017-08-02T12:38:11,996  WARN [main] compactor.CompactorMR: Found a non-bucket file that we thought matched the bucket pattern! file:/Users/ekoifman/dev/hiverwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands2-1501702264311/warehouse/acidtblpart/p=1/delta_0000013_0000013_0000/bucket_00001 Matcher=java.util.regex.Matcher[pattern=^[0-9]{6} region=0,12 lastmatch=]
> 2017-08-02T12:38:11,996  INFO [main] mapreduce.JobSubmitter: Cleaning up the staging area file:/tmp/hadoop/mapred/staging/ekoifman1723152463/.staging/job_local1723152463_0183
> 2017-08-02T12:38:11,997 ERROR [main] compactor.Worker: Caught exception while trying to compact id:1,dbname:default,tableName:ACIDTBLPART,partName:null,state:^@,type:MAJOR,properties:null,runAs:null,tooManyAborts:false,highestTxnId:0.  Marking failed to avoid repeated failures, java.lang.IllegalStateException: No match found
>         at java.util.regex.Matcher.group(Matcher.java:536)
>         at java.util.regex.Matcher.group(Matcher.java:496)
>         at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorInputFormat.addFileToMap(CompactorMR.java:577)
>         at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorInputFormat.getSplits(CompactorMR.java:549)
>         at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:330)
>         at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:322)
>         at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:198)
>         at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>         at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1338)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
>         at org.apache.hadoop.mapreduce.Job.submit(Job.java:1338)
>         at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:575)
>         at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:570)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
>         at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:570)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:561)
>         at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.launchCompactionJob(CompactorMR.java:320)
>         at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:275)
>         at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:166)
>         at org.apache.hadoop.hive.ql.TestTxnCommands2.runWorker(TestTxnCommands2.java:1138)
>         at org.apache.hadoop.hive.ql.TestTxnCommands2.updateDeletePartitioned(TestTxnCommands2.java:894)
> {noformat}
> the stack trace points to the 1st runWorker() in updateDeletePartitioned(), though the test run was TestTxnCommands2WithSplitUpdateAndVectorization



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
