[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2015-01-30 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14298910#comment-14298910
 ] 

Alan Gates commented on HIVE-8966:
--

I confirmed that it is already in 1.1, based on the git logs.

> Delta files created by hive hcatalog streaming cannot be compacted
> --
>
> Key: HIVE-8966
> URL: https://issues.apache.org/jira/browse/HIVE-8966
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 0.14.0
> Environment: hive
>Reporter: Jihong Liu
>Assignee: Alan Gates
>Priority: Critical
> Fix For: 1.0.0
>
> Attachments: HIVE-8966-branch-1.patch, HIVE-8966.2.patch, 
> HIVE-8966.3.patch, HIVE-8966.4.patch, HIVE-8966.5.patch, HIVE-8966.6.patch, 
> HIVE-8966.patch
>
>
> HCatalog streaming also creates a file named bucket_n_flush_length in each 
> delta directory, where "n" is the bucket number. But compactor.CompactorMR 
> thinks this file also needs to be compacted. Of course this file cannot be 
> compacted, so compactor.CompactorMR does not proceed with the compaction. 
> In a test, after the bucket_n_flush_length file was removed, "alter table 
> partition compact" finished successfully. If that file is not deleted, 
> nothing is compacted. 
> This is probably a very severe bug. Both 0.13 and 0.14 have this issue.
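The fix described above amounts to teaching the compactor to treat the side file as metadata rather than data. As a rough illustration only (the class and method names below are hypothetical, not the actual CompactorMR internals), the file listing could filter out the flush-length side files like this:

```java
// Hypothetical sketch of filtering out hcatalog streaming side files
// before handing bucket files to the compactor. Names are illustrative,
// not taken from Hive's actual CompactorMR code.
public class SideFileFilter {

    // A real bucket data file looks like "bucket_00001"; the streaming
    // writer's side file looks like "bucket_00001_flush_length".
    static boolean isCompactableBucketFile(String name) {
        return name.startsWith("bucket_") && !name.endsWith("_flush_length");
    }

    public static void main(String[] args) {
        System.out.println(isCompactableBucketFile("bucket_00001"));              // true
        System.out.println(isCompactableBucketFile("bucket_00001_flush_length")); // false
    }
}
```

With a filter like this, the side file never reaches the compaction job's input splits, so its presence no longer aborts the whole compaction.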



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2015-01-29 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297960#comment-14297960
 ] 

Lefty Leverenz commented on HIVE-8966:
--

Does this also need to be checked into branch-1.1 (formerly known as 0.15)?



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2015-01-29 Thread Jihong Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297958#comment-14297958
 ] 

Jihong Liu commented on HIVE-8966:
--

Thanks Alan.



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2015-01-28 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295328#comment-14295328
 ] 

Alan Gates commented on HIVE-8966:
--

[~leftylev] no, we just made what should have worked before work properly.



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2015-01-28 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294911#comment-14294911
 ] 

Lefty Leverenz commented on HIVE-8966:
--

Any documentation needed?



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2015-01-26 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292622#comment-14292622
 ] 

Alan Gates commented on HIVE-8966:
--

Fixed.



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2015-01-26 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292615#comment-14292615
 ] 

Brock Noland commented on HIVE-8966:


thx



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2015-01-26 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292601#comment-14292601
 ] 

Alan Gates commented on HIVE-8966:
--

I did svn add instead of svn rm on a couple of files that moved.  I'll fix it.



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2015-01-26 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292593#comment-14292593
 ] 

Brock Noland commented on HIVE-8966:


Looks like this was committed but I am seeing:

{noformat}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on 
project hive-common: Compilation failure: Compilation failure:
[ERROR] 
/Users/noland/workspaces/hive-apache/hive/common/src/java/org/apache/hadoop/hive/common/ValidTxnListImpl.java:[23,8]
 org.apache.hadoop.hive.common.ValidTxnListImpl is not abstract and does not 
override abstract method getInvalidTransactions() in 
org.apache.hadoop.hive.common.ValidTxnList
[ERROR] 
/Users/noland/workspaces/hive-apache/hive/common/src/java/org/apache/hadoop/hive/common/ValidTxnListImpl.java:[46,3]
 method does not override or implement a method from a supertype
[ERROR] 
/Users/noland/workspaces/hive-apache/hive/common/src/java/org/apache/hadoop/hive/common/ValidTxnListImpl.java:[54,3]
 method does not override or implement a method from a supertype
[ERROR] 
/Users/noland/workspaces/hive-apache/hive/common/src/java/org/apache/hadoop/hive/common/ValidTxnListImpl.java:[121,3]
 method does not override or implement a method from a supertype
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn  -rf :hive-common
{noformat}



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2015-01-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14290583#comment-14290583
 ] 

Hive QA commented on HIVE-8966:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12694321/HIVE-8966.6.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 7370 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver-parallel_join1.q-avro_joins.q-groupby_ppr.q-and-12-more - 
did not produce a TEST-*.xml file
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2506/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2506/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2506/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12694321 - PreCommit-HIVE-TRUNK-Build



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2015-01-21 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286267#comment-14286267
 ] 

Vikram Dixit K commented on HIVE-8966:
--

+1 for a branch 1.0.



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2015-01-20 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284935#comment-14284935
 ] 

Owen O'Malley commented on HIVE-8966:
-

After a little more thought, I'm worried that someone will accidentally create 
a ValidCompactorTxnList and get confused by the different behavior. I think it 
would make sense to move it into the compactor package to minimize the chance 
that someone uses it by mistake.



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2015-01-20 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284927#comment-14284927
 ] 

Owen O'Malley commented on HIVE-8966:
-

This looks good, Alan. +1

One minor nit is that the class javadoc for ValidReadTxnList has "And" instead 
of the intended "An".




[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2015-01-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278515#comment-14278515
 ] 

Hive QA commented on HIVE-8966:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12692048/HIVE-8966.5.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 7330 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.TestMTQueries.testMTQueries1
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2369/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2369/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2369/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12692048 - PreCommit-HIVE-TRUNK-Build



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2015-01-10 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14272443#comment-14272443
 ] 

Hive QA commented on HIVE-8966:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12691437/HIVE-8966.4.patch

{color:green}SUCCESS:{color} +1 6764 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2322/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2322/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2322/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12691437 - PreCommit-HIVE-TRUNK-Build



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2015-01-09 Thread Jihong Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14271601#comment-14271601
 ] 

Jihong Liu commented on HIVE-8966:
--

Makes sense. It would be great if that solution could be implemented. Thanks.



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2015-01-08 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14270282#comment-14270282
 ] 

Alan Gates commented on HIVE-8966:
--

The issue is that since the writer died with an unclosed batch, it left the ORC 
file in a state where it cannot be read without the length file.  So removing 
the length file means any reader will fail when reading it.

The proper solution is for the compactor to stop at that partition until it has 
determined that all transactions in that file have committed or aborted.  Then it 
should compact the file, using the length file to determine how much of the data 
is valid, while excluding the length file itself from the compacted output.  
I'll work on the fix.
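The role of the side file here can be shown with a small sketch. This is an illustration of the idea only (class and method names are hypothetical, and the real ORC side-file format may differ): the writer appends an 8-byte length after each flush, and a reader takes the last fully written value as the number of bytes of the delta that are safe to read.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical sketch of reading a "_flush_length" side file: each flush
// appends one 8-byte long; the last complete value is the usable length
// of the (possibly unclosed) ORC delta file.
public class FlushLengthReader {

    static long usableLength(File sideFile) throws IOException {
        long len = 0;
        try (DataInputStream in = new DataInputStream(new FileInputStream(sideFile))) {
            // Keep reading 8-byte values; the last complete one wins.
            while (in.available() >= 8) {
                len = in.readLong();
            }
        }
        return len;
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("bucket_00000", "_flush_length");
        try (DataOutputStream out = new DataOutputStream(new FileOutputStream(f))) {
            out.writeLong(100);  // first flush: 100 bytes readable
            out.writeLong(250);  // second flush: 250 bytes readable
        }
        System.out.println(usableLength(f));  // prints 250
        f.delete();
    }
}
```

Under this scheme, deleting the side file (as in the workaround above) discards the only record of where valid data ends, which is why readers then fail on the unclosed ORC file.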



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2015-01-06 Thread Jihong Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266854#comment-14266854
 ] 

Jihong Liu commented on HIVE-8966:
--

The error occurs while running the MapReduce job. The following is the log from 
hivemetastore.log:

2015-01-06 16:42:22,506 INFO  [sfdmgctmn003.gid.gap.com-32]: compactor.Worker 
(Worker.java:run(137)) - Starting MAJOR compaction for 
ds_infra.event_metrics.date=2014-12-24
2015-01-06 16:42:22,564 INFO  [sfdmgctmn003.gid.gap.com-32]: 
impl.TimelineClientImpl (TimelineClientImpl.java:serviceInit(285)) - Timeline 
service address: http://sfdmgctmn003.gid.gap.com:8188/ws/v1/timeline/
2015-01-06 16:42:22,622 INFO  [sfdmgctmn003.gid.gap.com-32]: 
impl.TimelineClientImpl (TimelineClientImpl.java:serviceInit(285)) - Timeline 
service address: http://sfdmgctmn003.gid.gap.com:8188/ws/v1/timeline/
2015-01-06 16:42:22,628 WARN  [sfdmgctmn003.gid.gap.com-32]: 
mapreduce.JobSubmitter (JobSubmitter.java:copyAndConfigureFiles(153)) - Hadoop 
command-line option parsing not performed. Implement the Tool interface and 
execute your application with ToolRunner to remedy this.
2015-01-06 16:42:22,753 WARN  [sfdmgctmn003.gid.gap.com-32]: 
split.JobSplitWriter (JobSplitWriter.java:writeOldSplits(168)) - Max block 
location exceeded for split: CompactorInputSplit{base: 
hdfs://sfdmgct/apps/hive/warehouse/ds_infra/event_metrics/date=2014-12-24/base_0035304,
 bucket: 1, length: 292280, deltas: [delta_0035311_0035313, 
delta_0035479_0035481, delta_0035491_0035493, delta_0035515_0035517, 
delta_0035533_0035535, delta_0035548_0035550, delta_0035563_0035565, 
delta_0035578_0035580, delta_0035593_0035595, delta_0035599_0035601, 
delta_0035656_0035658, delta_0035671_0035673, delta_0035686_0035688, 
delta_0035701_0035703, delta_0035716_0035718, delta_0035731_0035733, 
delta_0035746_0035748, delta_0035761_0035763, delta_0035776_0035778, 
delta_0035791_0035793, delta_0035806_0035808, delta_0035821_0035823, 
delta_0035830_0035832, delta_0035842_0035844, delta_0035854_0035856, 
delta_0035866_0035868, delta_0035878_0035880]} splitsize: 27 maxsize: 10
2015-01-06 16:42:22,753 WARN  [sfdmgctmn003.gid.gap.com-32]: 
split.JobSplitWriter (JobSplitWriter.java:writeOldSplits(168)) - Max block 
location exceeded for split: CompactorInputSplit{base: null, bucket: 3, length: 
199770, deltas: [delta_0035311_0035313, delta_0035479_0035481, 
delta_0035491_0035493, delta_0035515_0035517, delta_0035533_0035535, 
delta_0035548_0035550, delta_0035563_0035565, delta_0035578_0035580, 
delta_0035593_0035595, delta_0035599_0035601, delta_0035656_0035658, 
delta_0035671_0035673, delta_0035686_0035688, delta_0035701_0035703, 
delta_0035716_0035718, delta_0035731_0035733, delta_0035746_0035748, 
delta_0035761_0035763, delta_0035776_0035778, delta_0035791_0035793, 
delta_0035806_0035808, delta_0035821_0035823, delta_0035830_0035832, 
delta_0035842_0035844, delta_0035854_0035856, delta_0035866_0035868, 
delta_0035878_0035880]} splitsize: 21 maxsize: 10
2015-01-06 16:42:22,753 WARN  [sfdmgctmn003.gid.gap.com-32]: 
split.JobSplitWriter (JobSplitWriter.java:writeOldSplits(168)) - Max block 
location exceeded for split: CompactorInputSplit{base: 
hdfs://sfdmgct/apps/hive/warehouse/ds_infra/event_metrics/date=2014-12-24/base_0035304,
 bucket: 0, length: 172391, deltas: [delta_0035311_0035313, 
delta_0035479_0035481, delta_0035491_0035493, delta_0035515_0035517, 
delta_0035533_0035535, delta_0035548_0035550, delta_0035563_0035565, 
delta_0035578_0035580, delta_0035593_0035595, delta_0035599_0035601, 
delta_0035656_0035658, delta_0035671_0035673, delta_0035686_0035688, 
delta_0035701_0035703, delta_0035716_0035718, delta_0035731_0035733, 
delta_0035746_0035748, delta_0035761_0035763, delta_0035776_0035778, 
delta_0035791_0035793, delta_0035806_0035808, delta_0035821_0035823, 
delta_0035830_0035832, delta_0035842_0035844, delta_0035854_0035856, 
delta_0035866_0035868, delta_0035878_0035880]} splitsize: 30 maxsize: 10
2015-01-06 16:42:22,777 INFO  [sfdmgctmn003.gid.gap.com-32]: 
mapreduce.JobSubmitter (JobSubmitter.java:submitJobInternal(494)) - number of 
splits:4
2015-01-06 16:42:22,793 INFO  [sfdmgctmn003.gid.gap.com-32]: 
mapreduce.JobSubmitter (JobSubmitter.java:printTokens(583)) - Submitting tokens 
for job: job_1419291043936_1639
2015-01-06 16:42:23,000 INFO  [sfdmgctmn003.gid.gap.com-32]: 
impl.YarnClientImpl (YarnClientImpl.java:submitApplication(251)) - Submitted 
application application_1419291043936_1639
2015-01-06 16:42:23,001 INFO  [sfdmgctmn003.gid.gap.com-32]: mapreduce.Job 
(Job.java:submit(1300)) - The url to track the job: 
http://sfdmgctmn002.gid.gap.com:8088/proxy/application_1419291043936_1639/
2015-01-06 16:42:23,001 INFO  [sfdmgctmn003.gid.gap.com-32]: mapreduce.Job 
(Job.java:monitorAndPrintJob(1345)) - Running job: job_1419291043936_1639
2015-01-06 16:42:30,042 INFO  [sfdmgctmn003.gid.gap.com-32]: mapreduce.Job 
(Job.java:mon

[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2015-01-05 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265257#comment-14265257
 ] 

Alan Gates commented on HIVE-8966:
--

What error message does it give when it fails?  I would expect this to work.



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2015-01-04 Thread Jihong Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14264177#comment-14264177
 ] 

Jihong Liu commented on HIVE-8966:
--

I ran a test. Generally the new version works as expected, but in the 
following case the compaction always fails:

1. For whatever reason, the writer exits without closing a batch, so the 
"length" file is left behind. This can happen when, for example, the program 
is killed or hive/server restarts.
2. The program is restarted, so a new writer and a new batch are created and 
continue writing into the same partition. The data goes into a new delta.
3. We then manually delete the "length" file in the previous delta and run 
compaction, but it fails. Even if we exit the program entirely, so that there 
is no open batch and no "length" file at all, the compaction never succeeds 
for this partition.

However, the current Hive 0.14.0 works fine in this case.



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2014-12-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14256700#comment-14256700
 ] 

Hive QA commented on HIVE-8966:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12688699/HIVE-8966.3.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6724 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_lvj_mapjoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2168/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2168/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2168/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12688699 - PreCommit-HIVE-TRUNK-Build



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2014-12-09 Thread Jihong Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14240701#comment-14240701
 ] 

Jihong Liu commented on HIVE-8966:
--

Alan,
Your idea is very good, but there is an issue here: we should only do this 
"compactability" test on the most recent delta, not on all deltas. Here is an 
example of why:
Assume there are two deltas:
   1. delta_00011_00020 (this delta has an open transaction batch)
   2. delta_00021_00030 (this delta has no open transaction batch; all are closed)

The first delta has an open transaction batch and the second does not, and the 
second delta is the most recent one. This case is possible, especially when 
multiple threads write to the same partition. If we ignore the first delta, 
the compaction will succeed and create a base such as base_00030. The cleaner 
will then delete both deltas, since their transaction ids are less than or 
equal to the base transaction id, and the data in delta 1 will be lost. 
This is why we should only test the most recent delta; all other deltas are 
automatically included in the list. In this case the compaction will fail, 
since the "flush_length" file is there, and it will only succeed once all 
transaction batches are closed. Although this is not perfect, at least no data 
is lost. Since the delta files and transaction ids used for a compaction are 
not saved anywhere, this is probably the only solution for now. 
In my removeNotCompactableDeltas() method, we first sort the deltas and then 
check only the last one. But the name "removeNotCompactableDeltas" is not good 
and easily causes confusion; it would be clearer named 
"removeLastDeltaIfNotCompactable". 
Thanks
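The rule described above can be sketched as follows. The Delta type and field names are illustrative stand-ins, assuming deltas are ordered by their maximum transaction id and that an open batch is detectable from a flush_length side file:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Sketch of "removeLastDeltaIfNotCompactable": sort deltas by transaction id
// and drop only the most recent one when it still has an open batch, so a
// successful compaction can never silently skip over an earlier open delta.
public class DeltaPruner {
    static final class Delta {
        final long maxTxn;
        final boolean hasOpenBatch; // e.g. a bucket_n_flush_length file is present
        Delta(long maxTxn, boolean hasOpenBatch) {
            this.maxTxn = maxTxn;
            this.hasOpenBatch = hasOpenBatch;
        }
    }

    static List<Delta> removeLastDeltaIfNotCompactable(List<Delta> deltas) {
        List<Delta> sorted = new ArrayList<>(deltas);
        sorted.sort(Comparator.comparingLong(d -> d.maxTxn));
        // Only the most recent delta is tested; an earlier open delta stays in
        // the list, so the compaction fails instead of losing its data.
        if (!sorted.isEmpty() && sorted.get(sorted.size() - 1).hasOpenBatch) {
            sorted.remove(sorted.size() - 1);
        }
        return sorted;
    }
}
```

With the two deltas from the example, the most recent one (delta_00021_00030) is closed, so nothing is removed; the open delta stays in the list and the compaction fails rather than producing a base_00030 that would let the cleaner drop uncompacted data.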



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2014-12-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14240482#comment-14240482
 ] 

Hive QA commented on HIVE-8966:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12686124/HIVE-8966.2.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6704 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_1
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2013/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2013/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2013/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12686124 - PreCommit-HIVE-TRUNK-Build



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2014-12-09 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14240415#comment-14240415
 ] 

Owen O'Malley commented on HIVE-8966:
-

Alan, your patch looks good +1



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2014-12-09 Thread Jihong Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14240004#comment-14240004
 ] 

Jihong Liu commented on HIVE-8966:
--

I see. Basically there are two solutions. One is that when we get the delta 
list, we do not include the current delta if it has an open transaction, i.e. 
we update AcidUtil.getAcidState() directly. The other is what I posted here: 
we first get the delta list, and then during compaction we skip the last delta 
if it has an open transaction. The first solution is better, as long as 
changing getAcidState() does not affect other existing code, since it is a 
public static method. 
By the way, we should only do this for the current delta (the delta with the 
largest transaction id), not for all deltas that have open transactions. If I 
am correct, the base file is named after the largest transaction id among the 
deltas. So if the latest delta is closed but an earlier delta has an open 
transaction, we should not do anything and should simply let the compaction 
fail. Otherwise, the base will be named after the last transaction id and all 
earlier deltas will be removed, which would cause data loss. This is my 
understanding; please correct me if it is wrong. Thanks



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2014-12-09 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14239750#comment-14239750
 ] 

Alan Gates commented on HIVE-8966:
--

Rather than removing these directories from the list of deltas, I think it 
makes more sense to change Directory.getAcidState to not include them. We 
obviously can't do that in all cases, as readers need to see these deltas, but 
we can make it detect that the caller is the compactor and exclude them in 
that case. I'll post a patch with this change.
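A rough sketch of what this proposal could look like; the method, flag, and types are hypothetical and not the real AcidUtils/Directory API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sketch of the proposal: the directory-state call takes a flag so that only
// the compactor excludes deltas that still have an open transaction batch,
// while ordinary readers continue to see every delta.
public class AcidStateSketch {
    // deltaToOpenBatch maps each delta directory name to whether it still has
    // an open transaction batch (e.g. a flush_length side file is present).
    static List<String> listDeltas(Map<String, Boolean> deltaToOpenBatch,
                                   boolean forCompactor) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Boolean> e : deltaToOpenBatch.entrySet()) {
            boolean open = e.getValue();
            if (forCompactor && open) {
                continue; // the compactor skips deltas with open batches
            }
            result.add(e.getKey());
        }
        return result;
    }
}
```

A reader call (forCompactor=false) sees every delta, while the compactor's call skips those that still have an open batch.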



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2014-12-09 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14239636#comment-14239636
 ] 

Alan Gates commented on HIVE-8966:
--

Don't worry about the results from testing, those tests are flaky.  I'll review 
the patch.



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2014-12-07 Thread Jihong Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14237257#comment-14237257
 ] 

Jihong Liu commented on HIVE-8966:
--

I am confused about the QA test. The error does not look related to 
HIVE-8966.patch. First, was this patch really included in the build? Also, 
this patch is for 0.14.1, not for trunk.



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2014-12-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14237080#comment-14237080
 ] 

Hive QA commented on HIVE-8966:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12685590/HIVE-8966.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6696 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_aggregate
org.apache.hive.hcatalog.streaming.TestStreaming.testEndpointConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1986/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1986/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1986/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12685590 - PreCommit-HIVE-TRUNK-Build



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2014-12-06 Thread Jihong Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14237067#comment-14237067
 ] 

Jihong Liu commented on HIVE-8966:
--

Alan,
I created a wrong patch about an hour ago; QA automatically ran the above test 
before I removed it. Please ignore it and look at the currently attached 
patch. I think it really solves the issue.



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2014-12-06 Thread Jihong Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14237062#comment-14237062
 ] 

Jihong Liu commented on HIVE-8966:
--

Hi Alan, I have created a new patch. It works fine. The patch is posted in 
that jira, and I also added a comment about the logic. Please have a look. 
Thanks, and have a good day.
Jihong

  From: Alan Gates (JIRA)
  To: jhli...@yahoo.com
  Sent: Friday, December 5, 2014 7:41 AM
  Subject: [jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog 
streaming cannot be compacted
   

    [ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14235645#comment-14235645
 ] 

Alan Gates commented on HIVE-8966:
--

Jihong, thanks for doing the testing on this.  

We could change this to not compact the current delta file, or we could change 
the cleaner to not remove the delta file that was still open during compaction. 
 I'll try to look at this in the next couple of days.  We need to get this 
fixed for 0.14.1.










[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2014-12-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14237057#comment-14237057
 ] 

Hive QA commented on HIVE-8966:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12685584/HIVE-8966.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1985/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1985/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1985/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-1985/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ svn = \s\v\n ]]
+ [[ -n '' ]]
+ [[ -d apache-svn-trunk-source ]]
+ [[ ! -d apache-svn-trunk-source/.svn ]]
+ [[ ! -d apache-svn-trunk-source ]]
+ cd apache-svn-trunk-source
+ svn revert -R .
Reverted 
'metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java'
Reverted 'common/src/java/org/apache/hadoop/hive/conf/HiveConf.java'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java'
Reverted 
'ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java'
++ awk '{print $2}'
++ egrep -v '^X|^Performing status on external'
++ svn status --no-ignore
+ rm -rf target datanucleus.log ant/target shims/target shims/0.20S/target 
shims/0.23/target shims/aggregator/target shims/common/target 
shims/scheduler/target packaging/target hbase-handler/target testutils/target 
jdbc/target metastore/target itests/target itests/hcatalog-unit/target 
itests/test-serde/target itests/qtest/target itests/hive-unit-hadoop2/target 
itests/hive-minikdc/target itests/hive-unit/target itests/custom-serde/target 
itests/util/target hcatalog/target hcatalog/core/target 
hcatalog/streaming/target hcatalog/server-extensions/target 
hcatalog/hcatalog-pig-adapter/target hcatalog/webhcat/svr/target 
hcatalog/webhcat/java-client/target accumulo-handler/target hwi/target 
common/target common/src/gen 
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java.orig contrib/target 
service/target serde/target beeline/target odbc/target cli/target 
ql/dependency-reduced-pom.xml ql/target 
ql/src/test/results/clientpositive/parquet_array_of_multi_field_struct_gen_schema.q.out
 ql/src/test/results/clientpositive/parquet_decimal_gen_schema.q.out 
ql/src/test/results/clientpositive/parquet_array_of_unannotated_groups_gen_schema.q.out
 
ql/src/test/results/clientpositive/parquet_array_of_single_field_struct_gen_schema.q.out
 
ql/src/test/results/clientpositive/parquet_array_of_unannotated_primitives_gen_schema.q.out
 ql/src/test/results/clientpositive/parquet_array_of_structs_gen_schema.q.out 
ql/src/test/results/clientpositive/parquet_avro_array_of_primitives_gen_schema.q.out
 
ql/src/test/results/clientpositive/parquet_thrift_array_of_single_field_struct_gen_schema.q.out
 
ql/src/test/results/clientpositive/parquet_array_of_optional_elements_gen_schema.q.out
 
ql/src/test/results/clientpositive/parquet_avro_array_of_single_field_struct_gen_schema.q.out
 
ql/src/test/results/clientpositive/parquet_thrift_array_of_primitives_gen_schema.q.out
 
ql/src/test/results/clientpositive/parquet_array_of_structs_gen_schema_ext.q.out
 
ql/src/test/results/clientpositive/parquet_array_of_required_elements_gen_schema.q.out
 
ql/src/test/queries/clientpositive/parquet_avro_array_of_single_field_struct_gen_schema.q
 
ql/src/test/queries/clientpositive/parquet_array_of_single_field_struct_gen_schema.q
ql/src/test/queri
{noformat}

[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2014-12-06 Thread Jihong Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14237048#comment-14237048
 ] 

Jihong Liu commented on HIVE-8966:
--

By the way, Hive may need another cleanup process that automatically removes the 
bucket_n_flush_length file once the connection that created it is gone. A program 
may fail to close a transaction batch for many reasons, for example a network 
disconnection, a server shutdown, or the application being killed. So if the 
connection that created a batch has been closed, its bucket_n_flush_length file 
needs to be removed. Otherwise that delta, and every delta after it, can never 
be compacted unless the file is removed manually.
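A cleanup pass like the one suggested above could be sketched as follows. This is a hypothetical illustration, not Hive code: it only shows how leftover bucket_n_flush_length side files could be located and removed in a delta directory. Deciding whether the owning writer is really gone would need a metastore transaction/lock lookup that is omitted here, and a real implementation would use org.apache.hadoop.fs.FileSystem rather than the local filesystem used below for self-containment.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class OrphanSideFileCleaner {

    // Collect bucket_N_flush_length side files in a delta directory.
    // A real cleaner would only delete these after confirming, via the
    // metastore, that no writer still holds the transaction batch open.
    static List<Path> findSideFiles(Path deltaDir) throws IOException {
        List<Path> sideFiles = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(deltaDir)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString();
                if (name.matches("bucket_\\d+_flush_length")) {
                    sideFiles.add(entry);
                }
            }
        }
        return sideFiles;
    }

    // Delete the side files found above (to be called only once the
    // writer is known to be dead).
    static void removeSideFiles(Path deltaDir) throws IOException {
        for (Path sideFile : findSideFiles(deltaDir)) {
            Files.delete(sideFile);
        }
    }
}
```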



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2014-12-06 Thread Jihong Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14237047#comment-14237047
 ] 

Jihong Liu commented on HIVE-8966:
--

Solution: 
If the last delta contains any file that matches the bucket file pattern but is 
not actually a bucket file, do not compact that delta. While a transaction batch 
is still open, its delta contains a file like bucket_n_flush_length, which is not 
a bucket file. More generally, if the last delta holds a file that matches the 
bucket pattern but cannot be compacted, we should skip the whole delta: 
compaction removes the deltas it processes, so if a delta cannot be compacted in 
full it should be left as it is. In the scenario above, the second delta is 
therefore not compacted, and the cleaner will not remove it because its 
transaction IDs are higher than those of the newly created compaction output 
(base or delta). 
We apply this check only to the last delta in order to handle the case where two 
or more transaction batches are open and the last one is closed first. If the 
last delta were compacted, the transaction ID in the resulting base would be 
high, so the cleaner would remove all earlier deltas and data could be lost. In 
that case, at least one delta in the compaction list still contains a 
bucket_n_flush_length file; since we do not skip it, the compaction fails 
automatically, nothing happens, and no data is lost. Compaction can then only 
succeed once all transaction batches are closed. That is not ideal, but at least 
no data is lost.
The patch is attached. It adds one method that checks whether the last delta 
should be removed from the delta list, and runs that method before the delta 
list is processed. With this patch applied, no data is lost, and we can run 
either major or minor compaction while continuing to load data at the same time.
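The check described above can be illustrated with a small standalone sketch (illustrative names, not the patch itself): drop the newest delta from the compaction list when it still contains a bucket_N_flush_length side file, and leave older deltas in the list. The local java.io.File API stands in for Hive's filesystem layer here.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class DeltaListFilter {

    // True if this delta directory still contains a side file, i.e. a
    // writer may still have the transaction batch open.
    static boolean hasOpenSideFile(File deltaDir) {
        String[] names = deltaDir.list();
        if (names == null) {
            return false;
        }
        for (String name : names) {
            if (name.matches("bucket_\\d+_flush_length")) {
                return true;
            }
        }
        return false;
    }

    // Only the newest (last) delta is ever dropped, for the reason given
    // above: skipping an older delta while compacting newer ones would
    // let the cleaner delete it afterwards and lose data.
    static List<File> dropOpenLastDelta(List<File> deltasOldestFirst) {
        List<File> result = new ArrayList<>(deltasOldestFirst);
        if (!result.isEmpty() && hasOpenSideFile(result.get(result.size() - 1))) {
            result.remove(result.size() - 1);
        }
        return result;
    }
}
```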




[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2014-12-06 Thread Jihong Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14237046#comment-14237046
 ] 

Jihong Liu commented on HIVE-8966:
--

The data-loss scenario:
Assume that when compaction starts there are two deltas, delta_00011_00020 and 
delta_00021_00030, where the transaction batch in the first is closed and the 
second still has an open transaction batch. After compaction finishes, the 
status in COMPACTION_QUEUE becomes "ready for cleaning" and the cleaner process 
is triggered. The cleaner removes every delta whose transaction IDs are below 
those of the newly created base, provided there is no lock on it. Meanwhile, we 
are still loading data into the second delta. When loading finishes and the 
transaction batch is closed, the cleaner detects no lock on that delta and 
deletes it, so the data added after compaction is lost. 




[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2014-12-05 Thread Jihong Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14235923#comment-14235923
 ] 

Jihong Liu commented on HIVE-8966:
--

Great. I am working on that now. I will update you once the testing is finished.



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2014-12-05 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14235645#comment-14235645
 ] 

Alan Gates commented on HIVE-8966:
--

Jihong, thanks for doing the testing on this.  

We could change this to not compact the current delta file, or we could change 
the cleaner to not remove the delta file that was still open during compaction. 
 I'll try to look at this in the next couple of days.  We need to get this 
fixed for 0.14.1.



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2014-12-04 Thread Jihong Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234769#comment-14234769
 ] 

Jihong Liu commented on HIVE-8966:
--

I think we may have to withdraw this patch for now. It looks like Hive currently 
cannot support compacting and loading a partition at the same time. 
Without this patch, if loading into a partition has not completely finished, 
compaction always fails, so nothing happens. With this patch applied, compaction 
goes through and finishes, but we may lose data. I did a test: data can be lost 
if we compact while loading is still in progress. 
But keeping the current behavior is a real limitation for Hive: if we stream 
into a partition for a long period and cannot compact it, performance suffers. 

To solve this completely, my initial thinking is that delta files with open 
transactions should not be compacted. Currently they must be included, which is 
probably the cause of the data loss. Other, closed delta files should still be 
compactable, so that compaction and loading can run at the same time.




[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2014-12-03 Thread Jihong Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14233790#comment-14233790
 ] 

Jihong Liu commented on HIVE-8966:
--

The patch is attached. Please review. Thanks



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2014-12-02 Thread Jihong Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14232306#comment-14232306
 ] 

Jihong Liu commented on HIVE-8966:
--

Thanks. So now the fix is in 0.14.1?



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2014-11-26 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14227045#comment-14227045
 ] 

Gunther Hagleitner commented on HIVE-8966:
--

+1 for 0.14.1



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2014-11-26 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226943#comment-14226943
 ] 

Alan Gates commented on HIVE-8966:
--

Ok, that makes sense.  Your current delta has the file because it's still open 
and being written to.  It also explains why my tests don't see it: they don't 
run long enough, so the streaming is always done by the time the compactor kicks 
in.  Why don't you post a patch to this JIRA with the change for option 1, and I 
can get that committed.

[~hagleitn], I'd like to put this in 0.14.1 as well as trunk if you're ok with 
it, since it blocks compaction for users using the streaming interface.



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2014-11-26 Thread Jihong Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226925#comment-14226925
 ] 

Jihong Liu commented on HIVE-8966:
--

That flush_length file is only in the most recent delta. By the way, for 
streaming loads, a transaction batch is probably always open since data keeps 
coming. Is it possible to run compaction in a streaming-load environment? 
Thanks 



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2014-11-26 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226890#comment-14226890
 ] 

Alan Gates commented on HIVE-8966:
--

Option 1 might be the right thing to do; option 2 breaks backward compatibility. 
Before we do that, though, I'd like to understand why you still see the flush 
length files hanging around.  In my tests I don't see this issue because the 
flush length file is properly cleaned up.  I want to make sure that its 
existence doesn't mean something else is wrong.

Do you see the flush length files in all delta directories or only the most 
recent?  



[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2014-11-26 Thread Jihong Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226872#comment-14226872
 ] 

Jihong Liu commented on HIVE-8966:
--

Yes, I closed the transaction batch. I suggest doing either of the following two 
updates, or both:

1. If a file is a non-bucket file, don't try to compact it. In 
org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.java, change:

{noformat}
private void addFileToMap(Matcher matcher, Path file, boolean sawBase,
                          Map<Integer, BucketTracker> splitToBucketMap) {
  if (!matcher.find()) {
    LOG.warn("Found a non-bucket file that we thought matched the bucket pattern! " +
        file.toString());
  }
  ...
{noformat}

to:

{noformat}
private void addFileToMap(Matcher matcher, Path file, boolean sawBase,
                          Map<Integer, BucketTracker> splitToBucketMap) {
  if (!matcher.find()) {
    LOG.warn("Found a non-bucket file that we thought matched the bucket pattern! " +
        file.toString());
    return;
  }
  ...
{noformat}

2. Don't use the bucket file pattern to name the "flush_length" file. In 
org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.java, change:

{noformat}
static Path getSideFile(Path main) {
  return new Path(main + "_flush_length");
}
{noformat}

to:

{noformat}
static Path getSideFile(Path main) {
  if (main.getName().startsWith("bucket_")) {
    return new Path(main.getParent(),
        "bkt" + main.getName().substring("bucket".length()) + "_flush_length");
  }
  return new Path(main + "_flush_length");
}
{noformat}

After making these updates and recompiling hive-exec.jar, compaction works fine.




[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2014-11-26 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226794#comment-14226794
 ] 

Alan Gates commented on HIVE-8966:
--

This flush length file should be removed when the batch is closed.  Are you 
closing the transaction batch on a regular basis?
