[jira] [Updated] (HIVE-2304) Support PreparedStatement.setObject

2013-06-04 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-2304:
---

   Resolution: Fixed
Fix Version/s: 0.12.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Ido!

> Support PreparedStatement.setObject
> ---
>
> Key: HIVE-2304
> URL: https://issues.apache.org/jira/browse/HIVE-2304
> Project: Hive
>  Issue Type: Sub-task
>  Components: JDBC
>Affects Versions: 0.7.1
>Reporter: Ido Hadanny
>Assignee: Ido Hadanny
>Priority: Minor
> Fix For: 0.12.0
>
> Attachments: HIVE-0.8-SetObject.2.patch.txt
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> PreparedStatement.setObject is important for Spring's JdbcTemplate support

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4526) auto_sortmerge_join_9.q throws NPE but test is succeeded

2013-06-04 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4526:
---

   Resolution: Fixed
Fix Version/s: 0.12.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Navis!

> auto_sortmerge_join_9.q throws NPE but test is succeeded
> 
>
> Key: HIVE-4526
> URL: https://issues.apache.org/jira/browse/HIVE-4526
> Project: Hive
>  Issue Type: Test
>  Components: Tests
>Reporter: Navis
>Assignee: Navis
> Fix For: 0.12.0
>
> Attachments: HIVE-4526.D10725.1.patch
>
>
> auto_sortmerge_join_9.q
> {noformat}
> [junit] Running org.apache.hadoop.hive.cli.TestCliDriver
> [junit] Begin query: auto_sortmerge_join_9.q
> [junit] Deleted 
> file:/home/navis/apache/oss-hive/build/ql/test/data/warehouse/tbl1
> [junit] Deleted 
> file:/home/navis/apache/oss-hive/build/ql/test/data/warehouse/tbl2
> [junit] org.apache.hadoop.hive.ql.metadata.HiveException: Failed with 
> exception nulljava.lang.NullPointerException
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getRowInspectorFromPartitionedTable(FetchOperator.java:252)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:605)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676)
> [junit]   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> [junit]   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> [junit]   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> [junit]   at java.lang.reflect.Method.invoke(Method.java:597)
> [junit]   at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> [junit] 
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:631)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676)
> [junit]   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> [junit]   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> [junit]   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> [junit]   at java.lang.reflect.Method.invoke(Method.java:597)
> [junit]   at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> [junit] org.apache.hadoop.hive.ql.metadata.HiveException: Failed with 
> exception nulljava.lang.NullPointerException
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getRowInspectorFromPartitionedTable(FetchOperator.java:252)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:605)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676)
> [junit]   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> [junit]   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> [junit]   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> [junit]   at java.lang.reflect.Method.invoke(Method.java:597)
> [junit]   at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> [junit] 
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:631)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676)
> [junit]   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> [junit]   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> [junit]   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> [junit] 

[jira] [Commented] (HIVE-2206) add a new optimizer for query correlation discovery and optimization

2013-06-04 Thread Yin Huai (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675569#comment-13675569
 ] 

Yin Huai commented on HIVE-2206:


HIVE-2206.D11097.1.patch is the latest patch for trunk. I have heavily 
refactored my code. Here are the major changes.
# If multiple operation paths share the same input table, I use a single 
TableScanOperator and add the bottom operators of these paths as children of 
this common TableScanOperator. I do not deduplicate common columns because 
deduplication would make the code significantly more complicated and may 
introduce more problems. If we want deduplication, I suggest tackling it 
later in follow-up work.
# Without column deduplication, the dispatcher on the reduce side has less 
work to do, and some queries involving self joins can be optimized in the 
current version.
# The fake ReduceSinkOperator (CorrelationLocalSimulativeReduceSinkOperator... 
I will change the name later) no longer does the serialization and 
deserialization that the previous version did.
# New test cases are added.
# I also refactored ReduceSinkDeDuplication, since CorrelationOptimizer 
can reuse some methods introduced by ReduceSinkDeDuplication. [~navis] can you 
take a look at it and see if my changes make sense?
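
The first change in the list above can be pictured with a small sketch. This is not Hive code: the `Operator` class and the `share_table_scans` helper are hypothetical names chosen for illustration, and the model ignores columns entirely. It only shows the shape of the idea, one scan operator per input table, with the bottom operator of each path attached as a child of that scan.

```python
from dataclasses import dataclass, field


@dataclass
class Operator:
    """Hypothetical stand-in for a node in a query plan tree."""
    name: str
    children: list = field(default_factory=list)


def share_table_scans(paths):
    """Build one scan operator per table.

    `paths` is a list of (table_name, bottom_operator) pairs, one per
    operation path. Paths that read the same table share a single scan.
    """
    scans = {}
    for table, bottom in paths:
        if table not in scans:
            scans[table] = Operator(f"TS[{table}]")
        # Attach the path's bottom operator under the common scan.
        scans[table].children.append(bottom)
    return scans


paths = [
    ("src", Operator("RS-join-left")),
    ("src", Operator("RS-join-right")),  # same table again, e.g. a self join
    ("src1", Operator("RS-groupby")),
]
scans = share_table_scans(paths)
```

Here `scans["src"]` ends up with two children, which is the structure the comment describes for paths that share an input table.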

I will run all unit tests soon and will also add more comments.

btw, there is an issue in correlationoptimizer2.q. For outer joins, optimized 
plans cannot generate rows in which both join keys (from the left table and 
the right table) are null. I am looking into it.
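
To make the affected rows concrete, here is a plain Python model of standard full outer join semantics. It has nothing to do with Hive's operator code; `full_outer_join` and the sample rows are made up for illustration. The rows of interest are the ones in which both join key columns come out as NULL (modeled as `None`).

```python
def full_outer_join(left, right):
    """Nested-loop full outer join on the first field of each row.

    None models SQL NULL. NULL never equals NULL in a join condition,
    so a row with a NULL key is never matched and is emitted padded.
    """
    out = []
    matched_right = set()
    for lkey, lval in left:
        hit = False
        for i, (rkey, rval) in enumerate(right):
            if lkey is not None and lkey == rkey:
                out.append((lkey, lval, rkey, rval))
                matched_right.add(i)
                hit = True
        if not hit:
            # Unmatched left row: right side padded with NULLs.
            out.append((lkey, lval, None, None))
    for i, (rkey, rval) in enumerate(right):
        if i not in matched_right:
            # Unmatched right row: left side padded with NULLs.
            out.append((None, None, rkey, rval))
    return out


rows = full_outer_join([(1, "a"), (None, "b")], [(1, "x"), (None, "y")])
null_key_rows = [r for r in rows if r[0] is None and r[2] is None]
```

With this input, `null_key_rows` holds one row from each side, and a correct plan has to produce both.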

> add a new optimizer for query correlation discovery and optimization
> 
>
> Key: HIVE-2206
> URL: https://issues.apache.org/jira/browse/HIVE-2206
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Affects Versions: 0.12.0
>Reporter: He Yongqiang
>Assignee: Yin Huai
> Attachments: HIVE-2206.10-r1384442.patch.txt, 
> HIVE-2206.11-r1385084.patch.txt, HIVE-2206.12-r1386996.patch.txt, 
> HIVE-2206.13-r1389072.patch.txt, HIVE-2206.14-r1389704.patch.txt, 
> HIVE-2206.15-r1392491.patch.txt, HIVE-2206.16-r1399936.patch.txt, 
> HIVE-2206.17-r1404933.patch.txt, HIVE-2206.18-r1407720.patch.txt, 
> HIVE-2206.19-r1410581.patch.txt, HIVE-2206.1.patch.txt, 
> HIVE-2206.20-r1434012.patch.txt, HIVE-2206.2.patch.txt, 
> HIVE-2206.3.patch.txt, HIVE-2206.4.patch.txt, HIVE-2206.5-1.patch.txt, 
> HIVE-2206.5.patch.txt, HIVE-2206.6.patch.txt, HIVE-2206.7.patch.txt, 
> HIVE-2206.8.r1224646.patch.txt, HIVE-2206.8-r1237253.patch.txt, 
> HIVE-2206.D11097.1.patch, testQueries.2.q, YSmartPatchForHive.patch
>
>
> This issue proposes a new logical optimizer called Correlation Optimizer, 
> which is used to merge correlated MapReduce jobs (MR jobs) into a single MR 
> job. The idea is based on YSmart (http://ysmart.cse.ohio-state.edu/). The 
> paper and slides of YSmart are linked at the bottom.
> Since Hive translates queries in a sentence by sentence fashion, for every 
> operation which may need to shuffle the data (e.g. join and aggregation 
> operations), Hive will generate a MapReduce job for that operation. However, 
> for those operations which may need to shuffle the data, they may involve 
> correlations explained below and thus can be executed in a single MR job.
> # Input Correlation: Multiple MR jobs have input correlation (IC) if their 
> input relation sets are not disjoint;
> # Transit Correlation: Multiple MR jobs have transit correlation (TC) if they 
> have not only input correlation, but also the same partition key;
> # Job Flow Correlation: An MR has job flow correlation (JFC) with one of its 
> child nodes if it has the same partition key as that child node.
> The current implementation of the correlation optimizer only detects 
> correlations among MR jobs for reduce-side join operators and reduce-side 
> aggregation operators (not map-only aggregation). A query will be optimized 
> if it satisfies the following conditions.
> # There exists an MR job for a reduce-side join operator or reduce-side 
> aggregation operator which has JFC with all of its parent MR jobs (TCs will 
> also be exploited if JFC exists);
> # All input tables of those correlated MR jobs are original input tables (not 
> intermediate tables generated by sub-queries); and 
> # No self join is involved in those correlated MR jobs.
> Correlation optimizer is implemented as a logical optimizer. The main reasons 
> are that it only needs to manipulate the query plan tree and it can leverage 
> the existing component on generating MR jobs.
> Current implementation can serve as a framework for correlation related 
> optimizations. I think that it is better than adding individual optimizers. 
> There are several work that can be done in future 

[jira] [Created] (HIVE-4659) while sql contains \t , 'desc formatted view_name' and 'show create table view_name' statements will generate Incomplete results

2013-06-04 Thread caofangkun (JIRA)
caofangkun created HIVE-4659:


 Summary: while sql contains \t , 'desc formatted view_name' and 
'show create table view_name' statements will generate Incomplete results
 Key: HIVE-4659
 URL: https://issues.apache.org/jira/browse/HIVE-4659
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.12.0
Reporter: caofangkun
Assignee: caofangkun
Priority: Minor


drop view if exists v_test;
CREATE VIEW v_test AS select 
key,-- start by \t\t 
value,  -- start by \t\t 
dt from -- start by \t\t
(
select key, value, dt from tmp_v_t1 where dt='20130122' 
union all 
select key,value, dt from tmp_v_t1 where dt='20130123'
) t;

$ hive -e "show create table  v_test"

UT-One: the three lines that start with \t are lost in the create statement!
Logging initialized using configuration in 
file:/home/zongren/hive-conf/hive-log4j.properties
Hive history 
file=/tmp/zongren/hive_job_log_zongren_24155@hd17-vm5_201306051125_94165790.txt
OK
CREATE VIEW v_test AS select



(
select `tmp_v_t1`.`key`, `tmp_v_t1`.`value`, `tmp_v_t1`.`dt` from 
`default`.`tmp_v_t1` where `tmp_v_t1`.`dt`='20130122' 
union all 
select `tmp_v_t1`.`key`,`tmp_v_t1`.`value`, `tmp_v_t1`.`dt` from 
`default`.`tmp_v_t1` where `tmp_v_t1`.`dt`='20130123'
) `t`
Time taken: 2.767 seconds, Fetched: 9 row(s)

UT-Two:




[jira] [Updated] (HIVE-2206) add a new optimizer for query correlation discovery and optimization

2013-06-04 Thread Yin Huai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yin Huai updated HIVE-2206:
---

Attachment: (was: HIVE-2206.21.patch.txt)

> add a new optimizer for query correlation discovery and optimization
> 
>
> Key: HIVE-2206
> URL: https://issues.apache.org/jira/browse/HIVE-2206
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Affects Versions: 0.12.0
>Reporter: He Yongqiang
>Assignee: Yin Huai
> Attachments: HIVE-2206.10-r1384442.patch.txt, 
> HIVE-2206.11-r1385084.patch.txt, HIVE-2206.12-r1386996.patch.txt, 
> HIVE-2206.13-r1389072.patch.txt, HIVE-2206.14-r1389704.patch.txt, 
> HIVE-2206.15-r1392491.patch.txt, HIVE-2206.16-r1399936.patch.txt, 
> HIVE-2206.17-r1404933.patch.txt, HIVE-2206.18-r1407720.patch.txt, 
> HIVE-2206.19-r1410581.patch.txt, HIVE-2206.1.patch.txt, 
> HIVE-2206.20-r1434012.patch.txt, HIVE-2206.2.patch.txt, 
> HIVE-2206.3.patch.txt, HIVE-2206.4.patch.txt, HIVE-2206.5-1.patch.txt, 
> HIVE-2206.5.patch.txt, HIVE-2206.6.patch.txt, HIVE-2206.7.patch.txt, 
> HIVE-2206.8.r1224646.patch.txt, HIVE-2206.8-r1237253.patch.txt, 
> HIVE-2206.D11097.1.patch, testQueries.2.q, YSmartPatchForHive.patch
>
>
> This issue proposes a new logical optimizer called Correlation Optimizer, 
> which is used to merge correlated MapReduce jobs (MR jobs) into a single MR 
> job. The idea is based on YSmart (http://ysmart.cse.ohio-state.edu/). The 
> paper and slides of YSmart are linked at the bottom.
> Since Hive translates queries in a sentence by sentence fashion, for every 
> operation which may need to shuffle the data (e.g. join and aggregation 
> operations), Hive will generate a MapReduce job for that operation. However, 
> for those operations which may need to shuffle the data, they may involve 
> correlations explained below and thus can be executed in a single MR job.
> # Input Correlation: Multiple MR jobs have input correlation (IC) if their 
> input relation sets are not disjoint;
> # Transit Correlation: Multiple MR jobs have transit correlation (TC) if they 
> have not only input correlation, but also the same partition key;
> # Job Flow Correlation: An MR has job flow correlation (JFC) with one of its 
> child nodes if it has the same partition key as that child node.
> The current implementation of the correlation optimizer only detects 
> correlations among MR jobs for reduce-side join operators and reduce-side 
> aggregation operators (not map-only aggregation). A query will be optimized 
> if it satisfies the following conditions.
> # There exists an MR job for a reduce-side join operator or reduce-side 
> aggregation operator which has JFC with all of its parent MR jobs (TCs will 
> also be exploited if JFC exists);
> # All input tables of those correlated MR jobs are original input tables (not 
> intermediate tables generated by sub-queries); and 
> # No self join is involved in those correlated MR jobs.
> Correlation optimizer is implemented as a logical optimizer. The main reasons 
> are that it only needs to manipulate the query plan tree and it can leverage 
> the existing component on generating MR jobs.
> The current implementation can serve as a framework for correlation-related 
> optimizations. I think that it is better than adding individual optimizers. 
> Several things can be done in the future to improve this optimizer. 
> Here are three examples.
> # Support queries that only involve TC;
> # Support queries in which the input tables of correlated MR jobs involve 
> intermediate tables; and 
> # Optimize queries involving self join. 
> References:
> Paper and presentation of YSmart.
> Paper: 
> http://www.cse.ohio-state.edu/hpcs/WWW/HTML/publications/papers/TR-11-7.pdf
> Slides: http://sdrv.ms/UpwJJc



[jira] [Updated] (HIVE-2206) add a new optimizer for query correlation discovery and optimization

2013-06-04 Thread Yin Huai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yin Huai updated HIVE-2206:
---

Attachment: HIVE-2206.21.patch.txt

> add a new optimizer for query correlation discovery and optimization
> 
>
> Key: HIVE-2206
> URL: https://issues.apache.org/jira/browse/HIVE-2206
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Affects Versions: 0.12.0
>Reporter: He Yongqiang
>Assignee: Yin Huai
> Attachments: HIVE-2206.10-r1384442.patch.txt, 
> HIVE-2206.11-r1385084.patch.txt, HIVE-2206.12-r1386996.patch.txt, 
> HIVE-2206.13-r1389072.patch.txt, HIVE-2206.14-r1389704.patch.txt, 
> HIVE-2206.15-r1392491.patch.txt, HIVE-2206.16-r1399936.patch.txt, 
> HIVE-2206.17-r1404933.patch.txt, HIVE-2206.18-r1407720.patch.txt, 
> HIVE-2206.19-r1410581.patch.txt, HIVE-2206.1.patch.txt, 
> HIVE-2206.20-r1434012.patch.txt, HIVE-2206.21.patch.txt, 
> HIVE-2206.2.patch.txt, HIVE-2206.3.patch.txt, HIVE-2206.4.patch.txt, 
> HIVE-2206.5-1.patch.txt, HIVE-2206.5.patch.txt, HIVE-2206.6.patch.txt, 
> HIVE-2206.7.patch.txt, HIVE-2206.8.r1224646.patch.txt, 
> HIVE-2206.8-r1237253.patch.txt, HIVE-2206.D11097.1.patch, testQueries.2.q, 
> YSmartPatchForHive.patch
>
>



[jira] [Commented] (HIVE-4418) TestNegativeCliDriver failure message if cmd succeeds is misleading

2013-06-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675548#comment-13675548
 ] 

Hudson commented on HIVE-4418:
--

Integrated in Hive-trunk-hadoop2 #225 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/225/])
HIVE-4418 : TestNegativeCliDriver failure message if cmd succeeds is 
misleading (Thejas Nair via Ashutosh Chauhan) (Revision 1489278)

 Result = ABORTED
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489278
Files : 
* /hive/trunk/ql/src/test/templates/TestNegativeCliDriver.vm


> TestNegativeCliDriver failure message if cmd succeeds is misleading
> ---
>
> Key: HIVE-4418
> URL: https://issues.apache.org/jira/browse/HIVE-4418
> Project: Hive
>  Issue Type: Bug
>  Components: Testing Infrastructure
>Affects Versions: 0.10.0
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Fix For: 0.12.0
>
> Attachments: HIVE-4418.1.patch
>
>
> If the .q test ends up succeeding (exit code == 0), then the test failure 
> message is misleading.
> From the error it seems as if the command actually failed -
> {code}
> [junit] junit.framework.AssertionFailedError: Client Execution failed 
> with error code = 0
> [junit] See build/ql/tmp/hive.log, or try "ant test ... 
> -Dtest.silent=false" to get more logs.
> [junit] at junit.framework.Assert.fail(Assert.java:47)
> [junit] at 
> org.apache.hadoop.hive.cli.TestNegativeCliDriver.runTest(TestNegativeCliDriver.java:121)
> [junit] at 
> org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_desc_tab(TestNegativeCliDriver.java:102)
> {code}
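
A minimal sketch of the kind of message that would avoid the confusion, assuming a hypothetical helper named `check_negative_test`. It is not the change made to TestNegativeCliDriver.vm; it only illustrates that, for a negative test, exit code 0 is the failure case and the message should say so.

```python
def check_negative_test(query_file, exit_code):
    """Return None if a negative test behaved as expected, else a message.

    A negative test expects the command to fail, so exit code 0 is the
    error case and the message states that directly.
    """
    if exit_code == 0:
        return (f"{query_file}: command succeeded (exit code 0), "
                "but this negative test expects it to fail")
    return None


message = check_negative_test("desc_tab.q", 0)
```

A non-zero exit code returns `None`, since that is the expected outcome for a negative test.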



[jira] [Commented] (HIVE-4585) Remove unused MR Temp file localization from Tasks

2013-06-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675550#comment-13675550
 ] 

Hudson commented on HIVE-4585:
--

Integrated in Hive-trunk-hadoop2 #225 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/225/])
HIVE-4585 : Remove unused MR Temp file localization from Tasks (Gunther 
Hagleitner via Ashutosh Chauhan) (Revision 1489279)

 Result = ABORTED
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489279
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ConditionalTask.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/CopyTask.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DependencyCollectionTask.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExplainTask.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FetchTask.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionTask.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapredLocalTask.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/index/IndexMetadataChangeTask.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/stats/PartialScanTask.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/truncate/ColumnTruncateTask.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java


> Remove unused MR Temp file localization from Tasks
> --
>
> Key: HIVE-4585
> URL: https://issues.apache.org/jira/browse/HIVE-4585
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Fix For: 0.12.0
>
> Attachments: HIVE-4585.1.patch
>
>
> HIVE-1408 introduced code that is currently commented out (i.e., dead code), 
> with a comment saying it needs further development (HIVE-1484). It's been 
> like this for close to 3 years. 
> I suggest removing the code until such time that someone picks up that work. 
> At that time they can decide if they want to use this code or pursue another 
> route (FS shim?).



[jira] [Commented] (HIVE-4620) MR temp directory conflicts in case of parallel execution mode

2013-06-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675547#comment-13675547
 ] 

Hudson commented on HIVE-4620:
--

Integrated in Hive-trunk-hadoop2 #225 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/225/])
HIVE-4620 MR temp directory conflicts in case of parallel execution mode 
(Prasad Mujumdar via Navis) (Revision 1489226)

 Result = ABORTED
navis : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489226
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/Context.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/TaskRunner.java


> MR temp directory conflicts in case of parallel execution mode
> --
>
> Key: HIVE-4620
> URL: https://issues.apache.org/jira/browse/HIVE-4620
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 0.11.0
>Reporter: Prasad Mujumdar
>Assignee: Prasad Mujumdar
> Fix For: 0.12.0
>
> Attachments: HIVE-4620-1.patch, HIVE-4620-2.patch, HIVE-4620-3.patch
>
>
> In parallel query execution mode, all the tasks running in parallel end up 
> sharing the same temp/scratch directory. This could lead to file conflicts 
> and to temp files being deleted before the job completes.
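
The general remedy for this kind of conflict is to give each parallel task its own scratch directory. The sketch below shows that idea in plain Python; it is not the committed patch to Context.java or TaskRunner.java, and `run_task` and the `part-0` file name are hypothetical.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def run_task(task_id, base_dir):
    """Run one task in its own scratch directory and return that path."""
    # mkdtemp creates a uniquely named directory, so tasks never collide.
    scratch = tempfile.mkdtemp(prefix=f"task-{task_id}-", dir=base_dir)
    with open(os.path.join(scratch, "part-0"), "w") as out:
        out.write(f"output of task {task_id}\n")
    return scratch


with tempfile.TemporaryDirectory() as base:
    with ThreadPoolExecutor(max_workers=4) as pool:
        dirs = list(pool.map(lambda i: run_task(i, base), range(4)))
    distinct_dirs = len(set(dirs))
    files_present = all(
        os.path.exists(os.path.join(d, "part-0")) for d in dirs
    )
```

Because every task writes the same file name, sharing one directory would make the tasks overwrite each other; separate directories keep each output intact until the whole run is cleaned up.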



[jira] [Commented] (HIVE-2670) A cluster test utility for Hive

2013-06-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675549#comment-13675549
 ] 

Hudson commented on HIVE-2670:
--

Integrated in Hive-trunk-hadoop2 #225 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/225/])
HIVE-2670 A cluster test utility for Hive (gates and Johnny Zhang via 
gates) (Revision 1489376)

 Result = ABORTED
gates : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489376
Files : 
* /hive/trunk/hcatalog/src/test/e2e/hcatalog/build.xml
* /hive/trunk/hcatalog/src/test/e2e/hcatalog/drivers/TestDriverHiveCmdLine.pm
* /hive/trunk/hcatalog/src/test/e2e/hcatalog/tests/hive_cmdline.conf
* /hive/trunk/hcatalog/src/test/e2e/hcatalog/tests/hive_nightly.conf
* /hive/trunk/hcatalog/src/test/e2e/hcatalog/tools/test/floatpostprocessor.pl


> A cluster test utility for Hive
> ---
>
> Key: HIVE-2670
> URL: https://issues.apache.org/jira/browse/HIVE-2670
> Project: Hive
>  Issue Type: New Feature
>  Components: Testing Infrastructure
>Reporter: Alan Gates
>Assignee: Johnny Zhang
> Fix For: 0.12.0
>
> Attachments: harness.tar, HIVE-2670_5.patch, HIVE-2670_6.patch, 
> hive_cluster_test_2.patch, hive_cluster_test_3.patch, 
> hive_cluster_test_4.patch, hive_cluster_test.patch
>
>
> Hive has an extensive set of unit tests, but it does not have an 
> infrastructure for testing in a cluster environment.  Pig and HCatalog have 
> been using a test harness for cluster testing for some time.  We have written 
> Hive drivers and tests to run in this harness.



[jira] [Updated] (HIVE-2206) add a new optimizer for query correlation discovery and optimization

2013-06-04 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-2206:
--

Attachment: HIVE-2206.D11097.1.patch

yhuai requested code review of "HIVE-2206 [jira] add a new optimizer for query 
correlation discovery and optimization".

Reviewers: JIRA

update test results

This issue proposes a new logical optimizer called Correlation Optimizer, which 
is used to merge correlated MapReduce jobs (MR jobs) into a single MR job. The 
idea is based on YSmart (http://ysmart.cse.ohio-state.edu/). The paper and 
slides of YSmart are linked at the bottom.

Since Hive translates queries in a sentence by sentence fashion, for every 
operation which may need to shuffle the data (e.g. join and aggregation 
operations), Hive will generate a MapReduce job for that operation. However, 
for those operations which may need to shuffle the data, they may involve 
correlations explained below and thus can be executed in a single MR job.

# Input Correlation: Multiple MR jobs have input correlation (IC) if 
their input relation sets are not disjoint;
# Transit Correlation: Multiple MR jobs have transit correlation (TC) if 
they have not only input correlation, but also the same partition key;
# Job Flow Correlation: An MR has job flow correlation (JFC) with one of 
its child nodes if it has the same partition key as that child node.
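
The three definitions above can be written down as simple predicates. This is only a sketch under assumptions: the `MRJob` model, its fields, and the sample jobs are hypothetical and are not taken from the patch. A job is reduced to a set of input relation names and a partition key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MRJob:
    """Hypothetical model of a MapReduce job, for illustration only."""
    name: str
    inputs: frozenset      # names of the input relations
    partition_key: tuple   # columns the shuffle partitions on


def input_correlation(a, b):
    """IC: the input relation sets are not disjoint."""
    return not a.inputs.isdisjoint(b.inputs)


def transit_correlation(a, b):
    """TC: input correlation plus the same partition key."""
    return input_correlation(a, b) and a.partition_key == b.partition_key


def job_flow_correlation(job, child):
    """JFC: same partition key as the child node.

    The caller is assumed to pass an actual child node of `job`;
    this sketch does not model the plan tree itself.
    """
    return job.partition_key == child.partition_key


agg1 = MRJob("agg1", frozenset({"t1"}), ("key",))
agg2 = MRJob("agg2", frozenset({"t1", "t2"}), ("key",))
agg3 = MRJob("agg3", frozenset({"t1"}), ("value",))
join = MRJob("join", frozenset({"agg1_out", "agg2_out"}), ("key",))
```

In this example `agg1` and `agg2` have both IC and TC, `agg1` and `agg3` have IC only because their partition keys differ, and `join` has JFC with `agg1`.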

The current implementation of the correlation optimizer only detects 
correlations among MR jobs for reduce-side join operators and reduce-side 
aggregation operators (not map-only aggregation). A query will be optimized if 
it satisfies the following conditions.

# There exists an MR job for a reduce-side join operator or reduce-side 
aggregation operator which has JFC with all of its parent MR jobs (TCs will 
also be exploited if JFC exists);
# All input tables of those correlated MR jobs are original input tables 
(not intermediate tables generated by sub-queries); and
# No self join is involved in those correlated MR jobs.

Correlation optimizer is implemented as a logical optimizer. The main reasons 
are that it only needs to manipulate the query plan tree and it can leverage 
the existing component on generating MR jobs.

The current implementation can serve as a framework for correlation-related 
optimizations. I think that it is better than adding individual optimizers.

Several things can be done in the future to improve this optimizer. 
Here are three examples.

# Support queries that only involve TC;
# Support queries in which the input tables of correlated MR jobs involve 
intermediate tables; and
# Optimize queries involving self join.

References:
Paper and presentation of YSmart.
Paper: 
http://www.cse.ohio-state.edu/hpcs/WWW/HTML/publications/papers/TR-11-7.pdf
Slides: http://sdrv.ms/UpwJJc

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D11097

AFFECTED FILES
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
  conf/hive-default.xml.template
  ql/if/queryplan.thrift
  ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/OperatorType.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/CorrelationLocalSimulativeReduceSinkOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/CorrelationReducerDispatchOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecReducer.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationOptimizer.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationUtilities.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/IntraQueryCorrelation.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/QueryPlanTreeTransformation.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/ReduceSinkDeDuplication.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinTaskDispatcher.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/CorrelationLocalSimulativeReduceSinkDesc.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/CorrelationReducerDispatchDesc.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceSinkDesc.java
  ql/src/test/queries/clientpositive/correlationoptimizer1.q
  ql/src/test/queries/clientpositive/correlationoptimizer2.q
  ql/src/test/queries/clientpositive/correlationoptimizer3.q
  ql/src/test/queries/clientpositive/correlationoptimizer4.q
  ql/src/test/queries/clientpositive

[jira] [Commented] (HIVE-4055) add Date data type

2013-06-04 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675533#comment-13675533
 ] 

Thejas M Nair commented on HIVE-4055:
-

[~sunrui] I think we should consider using JodaTime instead of java.sql.Date.
While working on the datetime implementation in Apache Pig (PIG-1314), we found 
that JodaTime is significantly faster than Java's built-in date type. See the 
numbers here: 
https://issues.apache.org/jira/secure/EditComment!default.jspa?id=12459893&commentId=13284047
 . The test code is attached to the JIRA. Note that the comparison was made 
after adding an optimization to avoid converting java.util.Date objects to 
Calendar objects. java.sql.Date is a thin wrapper around java.util.Date, so it 
is likely to have the same performance characteristics.




> add Date data type
> --
>
> Key: HIVE-4055
> URL: https://issues.apache.org/jira/browse/HIVE-4055
> Project: Hive
>  Issue Type: Sub-task
>  Components: JDBC, Query Processor, Serializers/Deserializers, UDF
>Reporter: Sun Rui
> Attachments: HIVE-4055.1.patch.txt
>
>
> Add Date data type, a new primitive data type which supports the standard SQL 
> date type.
> Basically, the implementation can take HIVE-2272 and HIVE-2957 as references.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4658) Make KW_OUTER optional in outer joins

2013-06-04 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-4658:
--

Attachment: HIVE-4658.D11091.1.patch

navis requested code review of "HIVE-4658 [jira] Make KW_OUTER optional in 
outer joins".

Reviewers: JIRA

HIVE-4658 Make KW_OUTER optional in outer joins

For really trivial migration issue.

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D11091

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/parse/FromClauseParser.g

To: JIRA, navis


> Make KW_OUTER optional in outer joins
> -
>
> Key: HIVE-4658
> URL: https://issues.apache.org/jira/browse/HIVE-4658
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-4658.D11091.1.patch
>
>
> For really trivial migration issue.



[jira] [Updated] (HIVE-4658) Make KW_OUTER optional in outer joins

2013-06-04 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-4658:


Status: Patch Available  (was: Open)

> Make KW_OUTER optional in outer joins
> -
>
> Key: HIVE-4658
> URL: https://issues.apache.org/jira/browse/HIVE-4658
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
>
> For really trivial migration issue.



[jira] [Created] (HIVE-4658) Make KW_OUTER optional in outer joins

2013-06-04 Thread Navis (JIRA)
Navis created HIVE-4658:
---

 Summary: Make KW_OUTER optional in outer joins
 Key: HIVE-4658
 URL: https://issues.apache.org/jira/browse/HIVE-4658
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial


For really trivial migration issue.



[jira] [Created] (HIVE-4657) HCatalog checkstyle violation after HIVE-2670

2013-06-04 Thread Shreepadma Venugopalan (JIRA)
Shreepadma Venugopalan created HIVE-4657:


 Summary: HCatalog checkstyle violation after HIVE-2670 
 Key: HIVE-4657
 URL: https://issues.apache.org/jira/browse/HIVE-4657
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.12.0
Reporter: Shreepadma Venugopalan


After HIVE-2670 was committed, I see the following error:

{noformat}
checkstyle:
 [echo] hcatalog
[checkstyle] Running Checkstyle 5.5 on 416 files
[checkstyle] 
/Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/drivers/TestDriverHiveCmdLine.pm:1:
 Line does not match expected header line of '\W*or more contributor license 
agreements.  See the NOTICE file$'.
[checkstyle] 
/Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/tests/hive_cmdline.conf:1:
 Line does not match expected header line of '\W*or more contributor license 
agreements.  See the NOTICE file$'.
[checkstyle] 
/Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/tests/hive_nightly.conf:1:
 Line does not match expected header line of '\W*or more contributor license 
agreements.  See the NOTICE file$'.
  [for] hcatalog: The following error occurred while executing this line:
  [for] /Users/vshree/work/repositories/hive15/build.xml:310: The following 
error occurred while executing this line:
  [for] /Users/vshree/work/repositories/hive15/hcatalog/build.xml:109: The 
following error occurred while executing this line:
  [for] 
/Users/vshree/work/repositories/hive15/hcatalog/build-support/ant/checkstyle.xml:32:
 Got 3 errors and 0 warnings.

BUILD FAILED
/Users/vshree/work/repositories/hive15/build.xml:308: Keepgoing execution: 2 of 
11 iterations failed.
{noformat}



[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions

2013-06-04 Thread Teddy Choi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675467#comment-13675467
 ] 

Teddy Choi commented on HIVE-4642:
--

I see. I was not sure about parallelization. I'll focus on the single-threaded 
case. Thank you for the feedback.

> Implement vectorized RLIKE and REGEXP filter expressions
> 
>
> Key: HIVE-4642
> URL: https://issues.apache.org/jira/browse/HIVE-4642
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Eric Hanson
>Assignee: Teddy Choi
>
> See title. I will add more details next week. The goal is to (a) make this 
> work correctly and (b) optimize it as well as possible, at least for the 
> common cases.



[jira] [Updated] (HIVE-4516) Fix concurrency bug in serde/src/java/org/apache/hadoop/hive/serde2/io/TimestampWritable.java

2013-06-04 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4516:
---

   Resolution: Fixed
Fix Version/s: 0.12.0
 Assignee: Jon Hartlaub
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Jon and Navis!

> Fix concurrency bug in 
> serde/src/java/org/apache/hadoop/hive/serde2/io/TimestampWritable.java
> -
>
> Key: HIVE-4516
> URL: https://issues.apache.org/jira/browse/HIVE-4516
> Project: Hive
>  Issue Type: Bug
>Reporter: Jon Hartlaub
>Assignee: Jon Hartlaub
> Fix For: 0.12.0
>
> Attachments: HIVE-4516.D10929.1.patch, TimestampWritable.java.patch
>
>
> A patch for concurrent use of TimestampWritable, which occurs in a 
> multithreaded scenario (as found in AmpLab Shark).  A static SimpleDateFormat 
> (not thread-safe) is used by TimestampWritable in CTAS DDL statements, where it 
> manifests as data corruption when used in a concurrent environment.
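
The notification does not say how the patch removes the shared formatter. One common remedy for this class of bug is to give each thread its own instance, sketched below in Python rather than Java. `ScratchFormatter` is a hypothetical stand-in for a formatter that, like SimpleDateFormat, keeps mutable internal state between calls; it is not a Hive class.

```python
import threading
from typing import List


class ScratchFormatter:
    """Hypothetical formatter that reuses an internal buffer between calls,
    which is what makes sharing one instance across threads unsafe."""

    def __init__(self) -> None:
        self._buffer: List[str] = []

    def format(self, year: int, month: int, day: int) -> str:
        self._buffer.clear()
        self._buffer.extend([f"{year:04d}", f"{month:02d}", f"{day:02d}"])
        return "-".join(self._buffer)


_per_thread = threading.local()


def formatter_for_current_thread() -> ScratchFormatter:
    # Each thread lazily creates and then reuses its own formatter,
    # so no two threads ever touch the same internal buffer.
    formatter = getattr(_per_thread, "formatter", None)
    if formatter is None:
        formatter = ScratchFormatter()
        _per_thread.formatter = formatter
    return formatter
```

The alternative of constructing a new formatter on every call is simpler but pays the construction cost each time; the per-thread approach keeps reuse without sharing.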



Re: Review Request: Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 , if all the column values larger than 0.0 (or if all column values smaller than 0.0)

2013-06-04 Thread Shreepadma Venugopalan


> On June 4, 2013, 11:09 p.m., Shreepadma Venugopalan wrote:
> > Ship It!

I think it's OK to have Long.MAX_VALUE/Long.MIN_VALUE as the min/max value if 
there are no rows in the table. 


- Shreepadma


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11172/#review21438
---


On June 4, 2013, 1:41 p.m., Zhuoluo Yang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11172/
> ---
> 
> (Updated June 4, 2013, 1:41 p.m.)
> 
> 
> Review request for hive, Carl Steinbach, Carl Steinbach, Ashutosh Chauhan, 
> Shreepadma Venugopalan, and fangkun cao.
> 
> 
> Description
> ---
> 
> An initialization error.
> Make double and long initialize correctly.
> Would you review that and assign the issue to me?
> 
> 
> This addresses bug HIVE-4561.
> https://issues.apache.org/jira/browse/HIVE-4561
> 
> 
> Diffs
> -
> 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFComputeStats.java 1489292
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/compute_stats_empty_table.q.out 1489292
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/compute_stats_long.q.out 1489292
> 
> Diff: https://reviews.apache.org/r/11172/diff/
> 
> 
> Testing
> ---
> 
> ant test -Dtestcase=TestCliDriver -Dqfile=compute_stats_long.q
> ant test -Dtestcase=TestCliDriver -Dqfile=compute_stats_double.q
> 
> done.
> 
> 
> Thanks,
> 
> Zhuoluo Yang
> 
>



[jira] [Commented] (HIVE-4554) Failed to create a table from existing file if file path has spaces

2013-06-04 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675428#comment-13675428
 ] 

Ashutosh Chauhan commented on HIVE-4554:


Thanks, Xuefu, for testing that. 
+1. Will commit if tests pass.

> Failed to create a table from existing file if file path has spaces
> ---
>
> Key: HIVE-4554
> URL: https://issues.apache.org/jira/browse/HIVE-4554
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.10.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-4554.patch, HIVE-4554.patch.1, HIVE-4554.patch.2, 
> HIVE-4554.patch.3, HIVE-4554.patch.4
>
>
> To reproduce the problem,
> 1. Create a table, say, person_age (name STRING, age INT).
> 2. Create a file whose name has a space in it, say, "data set.txt".
> 3. Try to load the data in the file to the table.
> The following error can be seen in the console:
> hive> LOAD DATA INPATH '/home/xzhang/temp/data set.txt' INTO TABLE person_age;
> Loading data to table default.person_age
> Failed with exception Wrong file format. Please check the file's format.
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.MoveTask
> Note: the error message is confusing.



[jira] [Commented] (HIVE-4561) Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 ,if all the column values larger than 0.0 (or if all column values smaller than 0.0)

2013-06-04 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675425#comment-13675425
 ] 

Ashutosh Chauhan commented on HIVE-4561:


If there are no rows in the table, low and high values should really be null 
(or NaN) and not 0.
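
The behaviour under discussion can be pictured with the sketch below. It is a Python illustration with hypothetical function names, not the code in GenericUDAFComputeStats.java. The first function seeds both bounds with 0 and reproduces the symptom reported in this issue; the second seeds them with "no value yet" and reports null bounds for an empty column, which is the behaviour suggested in this comment.

```python
from typing import Iterable, Optional, Tuple


def low_high_seeded_with_zero(values: Iterable[float]) -> Tuple[float, float]:
    # Both bounds start at 0.0, so an all-positive column reports low == 0.0
    # and an all-negative column reports high == 0.0.
    low, high = 0.0, 0.0
    for v in values:
        low = min(low, v)
        high = max(high, v)
    return low, high


def low_high(values: Iterable[float]) -> Tuple[Optional[float], Optional[float]]:
    # Bounds start unset; the first value seen seeds both of them.
    low: Optional[float] = None
    high: Optional[float] = None
    for v in values:
        low = v if low is None else min(low, v)
        high = v if high is None else max(high, v)
    return low, high  # (None, None) when the column has no rows
```

Seeding with the largest and smallest representable values instead, as suggested in the review thread, also fixes the non-empty case; the two proposals differ only in what an empty table reports.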

> Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000, if all the 
> column values larger than 0.0 (or if all column values smaller than 0.0)
> 
>
> Key: HIVE-4561
> URL: https://issues.apache.org/jira/browse/HIVE-4561
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 0.12.0
>Reporter: caofangkun
>Assignee: Zhuoluo (Clark) Yang
> Attachments: HIVE-4561.1.patch, HIVE-4561.2.patch, HIVE-4561.3.patch
>
>
> If all column values are larger than 0.0, DOUBLE_LOW_VALUE will always be 0.0; 
> or, if all column values are less than 0.0, DOUBLE_HIGH_VALUE will always be 0.0.
> hive (default)> create table src_test (price double);
> hive (default)> load data local inpath './test.txt' into table src_test;
> hive (default)> select * from src_test;
> OK
> 1.0
> 2.0
> 3.0
> Time taken: 0.313 seconds, Fetched: 3 row(s)
> hive (default)> analyze table src_test compute statistics for columns price;
> mysql> select * from TAB_COL_STATS \G;
>                  CS_ID: 16
>                DB_NAME: default
>             TABLE_NAME: src_test
>            COLUMN_NAME: price
>            COLUMN_TYPE: double
>                 TBL_ID: 2586
>         LONG_LOW_VALUE: 0
>        LONG_HIGH_VALUE: 0
>       DOUBLE_LOW_VALUE: 0.   # Wrong Result ! Expected is 1.
>      DOUBLE_HIGH_VALUE: 3.
>  BIG_DECIMAL_LOW_VALUE: NULL
> BIG_DECIMAL_HIGH_VALUE: NULL
>              NUM_NULLS: 0
>          NUM_DISTINCTS: 1
>            AVG_COL_LEN: 0.
>            MAX_COL_LEN: 0
>              NUM_TRUES: 0
>             NUM_FALSES: 0
>          LAST_ANALYZED: 1368596151
> 2 rows in set (0.00 sec)



[jira] [Updated] (HIVE-4566) NullPointerException if typeinfo and nativesql commands are executed at beeline before a DB connection is established

2013-06-04 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4566:
---

   Resolution: Fixed
Fix Version/s: 0.12.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Xuefu!

> NullPointerException if typeinfo and nativesql commands are executed at 
> beeline before a DB connection is established
> -
>
> Key: HIVE-4566
> URL: https://issues.apache.org/jira/browse/HIVE-4566
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.11.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Fix For: 0.12.0
>
> Attachments: HIVE-4566.patch, HIVE-4566.patch.1
>
>
> Before a DB connection is established, executing a command such as typeinfo 
> or nativesql results in an NPE shown at the console:
> beeline> !typeinfo
> java.lang.NullPointerException
> beeline> !nativesql
> java.lang.NullPointerException
> Instead, a message such as "No current connection" should be given, as in 
> the case of some other commands, such as dropall.
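
The requested behaviour amounts to checking for a connection before the command touches it. The sketch below illustrates that guard in Python; the `Shell` class and its method are hypothetical and do not mirror Beeline's Java classes.

```python
from typing import Optional


class Shell:
    """Hypothetical command shell; `connection` stays None until a connect succeeds."""

    def __init__(self) -> None:
        self.connection: Optional[object] = None

    def run_metadata_command(self, name: str) -> str:
        # Guard first, so a missing connection yields a message instead of
        # an error raised from dereferencing the absent connection.
        if self.connection is None:
            return "No current connection"
        return f"{name}: ok"
```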



Re: Review Request: Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 , if all the column values larger than 0.0 (or if all column values smaller than 0.0)

2013-06-04 Thread Shreepadma Venugopalan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11172/#review21438
---

Ship it!


Ship It!

- Shreepadma Venugopalan


On June 4, 2013, 1:41 p.m., Zhuoluo Yang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11172/
> ---
> 
> (Updated June 4, 2013, 1:41 p.m.)
> 
> 
> Review request for hive, Carl Steinbach, Carl Steinbach, Ashutosh Chauhan, 
> Shreepadma Venugopalan, and fangkun cao.
> 
> 
> Description
> ---
> 
> An initialization error.
> Make double and long initialize correctly.
> Would you review that and assign the issue to me?
> 
> 
> This addresses bug HIVE-4561.
> https://issues.apache.org/jira/browse/HIVE-4561
> 
> 
> Diffs
> -
> 
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFComputeStats.java 1489292
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/compute_stats_empty_table.q.out 1489292
>   http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/compute_stats_long.q.out 1489292
> 
> Diff: https://reviews.apache.org/r/11172/diff/
> 
> 
> Testing
> ---
> 
> ant test -Dtestcase=TestCliDriver -Dqfile=compute_stats_long.q
> ant test -Dtestcase=TestCliDriver -Dqfile=compute_stats_double.q
> 
> done.
> 
> 
> Thanks,
> 
> Zhuoluo Yang
> 
>



[jira] [Updated] (HIVE-4640) CommonOrcInputFormat should be the default input format for Orc tables.

2013-06-04 Thread Sarvesh Sakalanaga (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sarvesh Sakalanaga updated HIVE-4640:
-

Status: Patch Available  (was: Open)

> CommonOrcInputFormat should be the default input format for Orc tables.
> ---
>
> Key: HIVE-4640
> URL: https://issues.apache.org/jira/browse/HIVE-4640
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Jitendra Nath Pandey
>Assignee: Sarvesh Sakalanaga
> Attachments: Hive-4640.0.patch
>
>
> CommonOrcInputFormat should be the default input format for Orc files, so 
> that default orc format tables work with both vectorized and non-vectorized 
> path.



[jira] [Updated] (HIVE-4640) CommonOrcInputFormat should be the default input format for Orc tables.

2013-06-04 Thread Sarvesh Sakalanaga (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sarvesh Sakalanaga updated HIVE-4640:
-

Attachment: Hive-4640.0.patch

Patch available.

> CommonOrcInputFormat should be the default input format for Orc tables.
> ---
>
> Key: HIVE-4640
> URL: https://issues.apache.org/jira/browse/HIVE-4640
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Jitendra Nath Pandey
>Assignee: Sarvesh Sakalanaga
> Attachments: Hive-4640.0.patch
>
>
> CommonOrcInputFormat should be the default input format for Orc files, so 
> that default orc format tables work with both vectorized and non-vectorized 
> path.



[jira] [Commented] (HIVE-4639) Add has null flag to ORC internal index

2013-06-04 Thread Prasanth J (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675338#comment-13675338
 ] 

Prasanth J commented on HIVE-4639:
--

[~owen.omalley] are you working on this issue? If not, I can take it over.

> Add has null flag to ORC internal index
> ---
>
> Key: HIVE-4639
> URL: https://issues.apache.org/jira/browse/HIVE-4639
> Project: Hive
>  Issue Type: Improvement
>  Components: File Formats
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>
> It would enable more predicate pushdown if we added a flag to the index entry 
> recording if there were any null values in the column for the 10k rows.
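
To illustrate why such a flag helps predicate pushdown, here is a minimal Python sketch. The `IndexEntry` layout and function names are hypothetical and are not the ORC index format; the sketch only assumes that the index keeps one entry per fixed-size group of rows.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class IndexEntry:
    """Hypothetical per-row-group statistics, one entry per group of rows."""
    minimum: Optional[int]
    maximum: Optional[int]
    has_null: bool


def build_index(column: Sequence[Optional[int]], group_size: int) -> List[IndexEntry]:
    entries = []
    for start in range(0, len(column), group_size):
        group = column[start:start + group_size]
        present = [v for v in group if v is not None]
        entries.append(IndexEntry(
            minimum=min(present) if present else None,
            maximum=max(present) if present else None,
            has_null=len(present) < len(group),
        ))
    return entries


def groups_for_is_null(index: List[IndexEntry]) -> List[int]:
    # Only groups whose flag is set can contain rows matching `col IS NULL`;
    # every other group can be skipped without reading its data.
    return [i for i, entry in enumerate(index) if entry.has_null]
```

Without the flag, minimum and maximum alone cannot tell a reader whether a group contains nulls, so an `IS NULL` predicate would have to read every group.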



[jira] [Resolved] (HIVE-4634) Add hasNull flag to ORC index

2013-06-04 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J resolved HIVE-4634.
--

Resolution: Duplicate

Duplicate of HIVE-4639

> Add hasNull flag to ORC index
> -
>
> Key: HIVE-4634
> URL: https://issues.apache.org/jira/browse/HIVE-4634
> Project: Hive
>  Issue Type: New Feature
>  Components: File Formats
>Reporter: Owen O'Malley
>
> It would help the predicate pushdown, if the index recorded whether each 10k 
> rows had null values in that column.



[jira] [Assigned] (HIVE-4640) CommonOrcInputFormat should be the default input format for Orc tables.

2013-06-04 Thread Sarvesh Sakalanaga (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sarvesh Sakalanaga reassigned HIVE-4640:


Assignee: Sarvesh Sakalanaga  (was: Jitendra Nath Pandey)

> CommonOrcInputFormat should be the default input format for Orc tables.
> ---
>
> Key: HIVE-4640
> URL: https://issues.apache.org/jira/browse/HIVE-4640
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Jitendra Nath Pandey
>Assignee: Sarvesh Sakalanaga
>
> CommonOrcInputFormat should be the default input format for Orc files, so 
> that default orc format tables work with both vectorized and non-vectorized 
> path.



[jira] [Commented] (HIVE-4656) Implement vectorized text reader to read vectorized data from Text file

2013-06-04 Thread Sarvesh Sakalanaga (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675311#comment-13675311
 ] 

Sarvesh Sakalanaga commented on HIVE-4656:
--

Review at: https://reviews.apache.org/r/11636/

> Implement vectorized text reader to read vectorized data from Text file 
> 
>
> Key: HIVE-4656
> URL: https://issues.apache.org/jira/browse/HIVE-4656
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sarvesh Sakalanaga
>Assignee: Sarvesh Sakalanaga
> Attachments: Hive-4656.0.patch
>
>
> Input format and vectorized serde implementation for Text file.



[jira] [Updated] (HIVE-4656) Implement vectorized text reader to read vectorized data from Text file

2013-06-04 Thread Sarvesh Sakalanaga (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sarvesh Sakalanaga updated HIVE-4656:
-

Status: Patch Available  (was: Open)

> Implement vectorized text reader to read vectorized data from Text file 
> 
>
> Key: HIVE-4656
> URL: https://issues.apache.org/jira/browse/HIVE-4656
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sarvesh Sakalanaga
>Assignee: Sarvesh Sakalanaga
> Attachments: Hive-4656.0.patch
>
>
> Input format and vectorized serde implementation for Text file.



Review Request: Implement vectorized text reader to read vectorized data from Text file

2013-06-04 Thread Sarvesh Sakalanaga

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11636/
---

Review request for hive.


Description
---

Vectorized input format and vectorized serde implementation for Text file.


This addresses bug Hive-4656.
https://issues.apache.org/jira/browse/Hive-4656


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedBatchUtil.java 80bf671
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedColumnarSerDe.java 69553d9
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedLazySimpleSerDe.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatchCtx.java 5018ea1
  ql/src/java/org/apache/hadoop/hive/ql/io/CommonRCFileInputFormat.java 4bfeb20
  ql/src/java/org/apache/hadoop/hive/ql/io/CommonTextFileInputFormat.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/VectorizedLineRecordReader.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/VectorizedRCFileRecordReader.java 25b3aed
  ql/src/java/org/apache/hadoop/hive/ql/io/VectorizedTextInputFormat.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcInputFormat.java 2c20987
  ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorizedRowBatchCtx.java 78ebb17
  serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazySimpleSerDe.java d6b31a6

Diff: https://reviews.apache.org/r/11636/diff/


Testing
---


Thanks,

Sarvesh Sakalanaga



Re: Review Request: Review Request for HIVE-4554 Failed to create a table from existing file if file path has spaces

2013-06-04 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11335/
---

(Updated June 4, 2013, 9:51 p.m.)


Review request for hive and Ashutosh Chauhan.


Changes
---

Patch now includes testcase for HDFS files.


Description
---

Patch includes fix and new test case.


This addresses bug HIVE-4554.
https://issues.apache.org/jira/browse/HIVE-4554


Diffs (updated)
-

  build-common.xml 43d8e9c
  data/files/person PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java bd8d252
  ql/src/test/queries/clientpositive/load_file_with_space_in_the_name.q PRE-CREATION
  ql/src/test/queries/clientpositive/load_hdfs_file_with_space_in_the_name.q PRE-CREATION
  ql/src/test/results/clientpositive/load_file_with_space_in_the_name.q.out PRE-CREATION
  ql/src/test/results/clientpositive/load_hdfs_file_with_space_in_the_name.q.out PRE-CREATION

Diff: https://reviews.apache.org/r/11335/diff/


Testing
---


Thanks,

Xuefu Zhang



[jira] [Updated] (HIVE-4554) Failed to create a table from existing file if file path has spaces

2013-06-04 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-4554:
--

Attachment: HIVE-4554.patch.4

Patch is updated with new test case for loading HDFS file with special 
character (space) in the file name to a table.

> Failed to create a table from existing file if file path has spaces
> ---
>
> Key: HIVE-4554
> URL: https://issues.apache.org/jira/browse/HIVE-4554
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.10.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-4554.patch, HIVE-4554.patch.1, HIVE-4554.patch.2, 
> HIVE-4554.patch.3, HIVE-4554.patch.4
>
>
> To reproduce the problem,
> 1. Create a table, say, person_age (name STRING, age INT).
> 2. Create a file whose name has a space in it, say, "data set.txt".
> 3. Try to load the data in the file to the table.
> The following error can be seen in the console:
> hive> LOAD DATA INPATH '/home/xzhang/temp/data set.txt' INTO TABLE person_age;
> Loading data to table default.person_age
> Failed with exception Wrong file format. Please check the file's format.
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.MoveTask
> Note: the error message is confusing.



[jira] [Updated] (HIVE-4656) Implement vectorized text reader to read vectorized data from Text file

2013-06-04 Thread Sarvesh Sakalanaga (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sarvesh Sakalanaga updated HIVE-4656:
-

Attachment: Hive-4656.0.patch

Patch available. 

> Implement vectorized text reader to read vectorized data from Text file 
> 
>
> Key: HIVE-4656
> URL: https://issues.apache.org/jira/browse/HIVE-4656
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sarvesh Sakalanaga
>Assignee: Sarvesh Sakalanaga
> Attachments: Hive-4656.0.patch
>
>
> Input format and vectorized serde implementation for Text file.



[jira] [Created] (HIVE-4656) Implement vectorized text reader to read vectorized data from Text file

2013-06-04 Thread Sarvesh Sakalanaga (JIRA)
Sarvesh Sakalanaga created HIVE-4656:


 Summary: Implement vectorized text reader to read vectorized data 
from Text file 
 Key: HIVE-4656
 URL: https://issues.apache.org/jira/browse/HIVE-4656
 Project: Hive
  Issue Type: Sub-task
Reporter: Sarvesh Sakalanaga
Assignee: Sarvesh Sakalanaga


Input format and vectorized serde implementation for Text file.



Re: Review Request: Review Request for HIVE-4554 Failed to create a table from existing file if file path has spaces

2013-06-04 Thread Ashutosh Chauhan


> On June 3, 2013, 11:15 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java, line 
> > 273
> > 
> >
> > Apart from this change, all other changes are contained within the
> > if(isLocal) block. Because of this, it seems possible that it might be
> > triggered for non-local paths as well. Can you test it with an hdfs:// path
> > that has spaces? If it's easy, it would be good to add it to the tests;
> > otherwise a manual test is fine as well.
> 
> Xuefu Zhang wrote:
> I tried to add a test case that loads a file from HDFS into a table, without
> success. Doing this requires an HDFS accessible from the test machine. Please
> let me know if you think there is a mechanism for that. However, I did
> manually test the case, and it works fine for me. (It fails without the patch.)

Glad that it's working. You can add this test case to MinimrCliDriver. Just
write a regular .q test file and then include it in the minimr.query.files
parameter in build-common.xml. Those test cases run against a minicluster, so
you can access hdfs:// there.


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11335/#review21366
---


On June 3, 2013, 10:18 p.m., Xuefu Zhang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11335/
> ---
> 
> (Updated June 3, 2013, 10:18 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Description
> ---
> 
> Patch includes fix and new test case.
> 
> 
> This addresses bug HIVE-4554.
> https://issues.apache.org/jira/browse/HIVE-4554
> 
> 
> Diffs
> -
> 
>   data/files/person PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
> bd8d252 
>   ql/src/test/queries/clientpositive/load_file_with_space_in_the_name.q 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/load_file_with_space_in_the_name.q.out 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/11335/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Xuefu Zhang
> 
>



[jira] [Commented] (HIVE-4655) Vectorization not working with negative constants, hive doesn't fold constants.

2013-06-04 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675103#comment-13675103
 ] 

Jitendra Nath Pandey commented on HIVE-4655:


Review board entry.
https://reviews.apache.org/r/11634/

> Vectorization not working with negative constants, hive doesn't fold 
> constants.
> ---
>
> Key: HIVE-4655
> URL: https://issues.apache.org/jira/browse/HIVE-4655
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Fix For: vectorization-branch
>
> Attachments: HIVE-4655.1.patch
>
>
>   The Hive optimizer doesn't fold constants; however, the vectorized code
> path assumes that constants have been folded. This should be fixed in the
> Hive optimizer.
>   In this JIRA we just fix the vectorization path to handle folding for
> negative constants. This is needed because the Hive plan treats negative
> constants as unary-minus expressions on constants, so these expressions also
> need constant folding.
> This fix will become redundant once constant folding is appropriately
> implemented in the Hive optimizer. (HIVE-746)



[jira] [Resolved] (HIVE-4655) Vectorization not working with negative constants, hive doesn't fold constants.

2013-06-04 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-4655.


   Resolution: Fixed
Fix Version/s: vectorization-branch

Committed to branch. Thanks, Jitendra!

> Vectorization not working with negative constants, hive doesn't fold 
> constants.
> ---
>
> Key: HIVE-4655
> URL: https://issues.apache.org/jira/browse/HIVE-4655
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Fix For: vectorization-branch
>
> Attachments: HIVE-4655.1.patch
>
>
>   The Hive optimizer doesn't fold constants; however, the vectorized code
> path assumes that constants have been folded. This should be fixed in the
> Hive optimizer.
>   In this JIRA we just fix the vectorization path to handle folding for
> negative constants. This is needed because the Hive plan treats negative
> constants as unary-minus expressions on constants, so these expressions also
> need constant folding.
> This fix will become redundant once constant folding is appropriately
> implemented in the Hive optimizer. (HIVE-746)



[jira] [Commented] (HIVE-4655) Vectorization not working with negative constants, hive doesn't fold constants.

2013-06-04 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675100#comment-13675100
 ] 

Eric Hanson commented on HIVE-4655:
---

Can you put this on review board please? 

> Vectorization not working with negative constants, hive doesn't fold 
> constants.
> ---
>
> Key: HIVE-4655
> URL: https://issues.apache.org/jira/browse/HIVE-4655
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Attachments: HIVE-4655.1.patch
>
>
>   The Hive optimizer doesn't fold constants; however, the vectorized code
> path assumes that constants have been folded. This should be fixed in the
> Hive optimizer.
>   In this JIRA we just fix the vectorization path to handle folding for
> negative constants. This is needed because the Hive plan treats negative
> constants as unary-minus expressions on constants, so these expressions also
> need constant folding.
> This fix will become redundant once constant folding is appropriately
> implemented in the Hive optimizer. (HIVE-746)



[jira] [Updated] (HIVE-4651) TestVectorGroupByOperator causes asserts in StandardStructObjectInspector.init

2013-06-04 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4651:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to branch. Thanks, Remus!

> TestVectorGroupByOperator causes asserts in StandardStructObjectInspector.init
> --
>
> Key: HIVE-4651
> URL: https://issues.apache.org/jira/browse/HIVE-4651
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Affects Versions: vectorization-branch
>Reporter: Remus Rusanu
>Assignee: Remus Rusanu
>Priority: Minor
> Fix For: vectorization-branch
>
> Attachments: hive-4651.0.patch.txt
>
>
> The number of output columns passed to StandardStructObjectInspector.init
> must be correct. VGByOp tests that have a GROUP BY key do not set this
> properly. The assert manifests only when JUnit starts the VM with -ea.



[jira] [Updated] (HIVE-4637) Fix VectorUDAFSum.txt to honor the expected vector column type

2013-06-04 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4637:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to branch. Thanks, Remus!

> Fix VectorUDAFSum.txt to honor the expected vector column type
> --
>
> Key: HIVE-4637
> URL: https://issues.apache.org/jira/browse/HIVE-4637
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Affects Versions: vectorization-branch
>Reporter: Remus Rusanu
>Assignee: Remus Rusanu
>Priority: Minor
> Fix For: vectorization-branch
>
> Attachments: HIVE-4637.0.patch.txt, HIVE-4637.1.patch.txt, 
> HIVE-4637.2.patch.txt
>
>
> "I think it's a bug in code generation for VectorUDAFSumDouble.
> The template VectorUDAFSum.txt assumes LongColumnVector for input rather
> than having it replaced by code generation."



[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions

2013-06-04 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674690#comment-13674690
 ] 

Eric Hanson commented on HIVE-4642:
---

I think this sounds good, except that using multi-threaded parallelism is not
a good idea here. We should rely on getting parallelism for large data sets by
having multiple splits processed in parallel in different processes. Using
fine-grained multi-threaded parallelism within a process only for purposes of
speeding up RLIKE/REGEXP does not seem appropriate. I'd recommend focusing on
the fastest operation you can get within a single thread, at least for common
patterns, or maybe even all possible patterns.
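
To make the single-threaded suggestion concrete, here is a minimal sketch of a batch-at-a-time regex filter. It assumes partial-match (find) semantics and uses a plain String array plus a "selected" index array as stand-ins for a vectorized batch; the class name and signature are hypothetical and are not Hive's vectorization API.

```java
import java.util.regex.Pattern;

// Hypothetical sketch; the String[] and int[] parameters stand in for a
// column vector and a selection vector.
final class RegexFilterSketch {

    // Scans the first `size` values on the calling thread, writes the indices
    // of matching rows to the front of `selected`, and returns how many matched.
    // The Pattern is compiled once by the caller and reused for every row.
    static int filter(String[] values, int size, Pattern pattern, int[] selected) {
        int newSize = 0;
        for (int i = 0; i < size; i++) {
            String v = values[i];
            // NULL inputs never match.
            if (v != null && pattern.matcher(v).find()) {
                selected[newSize++] = i;
            }
        }
        return newSize;
    }
}
```

The point of the sketch is that the per-row cost is a single matcher call on a precompiled pattern; any parallelism comes from processing separate splits in separate processes, as described above.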

> Implement vectorized RLIKE and REGEXP filter expressions
> 
>
> Key: HIVE-4642
> URL: https://issues.apache.org/jira/browse/HIVE-4642
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Eric Hanson
>Assignee: Teddy Choi
>
> See title. I will add more details next week. The goal is (a) make this work 
> correctly and (b) optimize it as well as possible, at least for the common 
> cases.



Re: Review Request: Review Request for HIVE-4554 Failed to create a table from existing file if file path has spaces

2013-06-04 Thread Xuefu Zhang


> On June 3, 2013, 11:15 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java, line 
> > 273
> > 
> >
> > Apart from this change, all other changes are contained within the
> > if(isLocal) block. Because of this, it seems possible that it might be
> > triggered for non-local paths as well. Can you test it with an hdfs:// path
> > that has spaces? If it's easy, it would be good to add it to the tests;
> > otherwise a manual test is fine as well.

I tried to add a test case that loads a file from HDFS into a table, without
success. Doing this requires an HDFS accessible from the test machine. Please
let me know if you think there is a mechanism for that. However, I did manually
test the case, and it works fine for me. (It fails without the patch.)


- Xuefu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11335/#review21366
---


On June 3, 2013, 10:18 p.m., Xuefu Zhang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11335/
> ---
> 
> (Updated June 3, 2013, 10:18 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Description
> ---
> 
> Patch includes fix and new test case.
> 
> 
> This addresses bug HIVE-4554.
> https://issues.apache.org/jira/browse/HIVE-4554
> 
> 
> Diffs
> -
> 
>   data/files/person PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
> bd8d252 
>   ql/src/test/queries/clientpositive/load_file_with_space_in_the_name.q 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/load_file_with_space_in_the_name.q.out 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/11335/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Xuefu Zhang
> 
>



[jira] [Updated] (HIVE-4655) Vectorization not working with negative constants, hive doesn't fold constants.

2013-06-04 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4655:
---

Attachment: HIVE-4655.1.patch

> Vectorization not working with negative constants, hive doesn't fold 
> constants.
> ---
>
> Key: HIVE-4655
> URL: https://issues.apache.org/jira/browse/HIVE-4655
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Attachments: HIVE-4655.1.patch
>
>
>   The Hive optimizer doesn't fold constants; however, the vectorized code
> path assumes that constants have been folded. This should be fixed in the
> Hive optimizer.
>   In this JIRA we just fix the vectorization path to handle folding for
> negative constants. This is needed because the Hive plan treats negative
> constants as unary-minus expressions on constants, so these expressions also
> need constant folding.
> This fix will become redundant once constant folding is appropriately
> implemented in the Hive optimizer. (HIVE-746)



[jira] [Commented] (HIVE-3953) Reading of partitioned Avro data fails because of missing properties

2013-06-04 Thread Mark Wagner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674627#comment-13674627
 ] 

Mark Wagner commented on HIVE-3953:
---

Sure thing. I've created https://reviews.apache.org/r/11632/

> Reading of partitioned Avro data fails because of missing properties
> 
>
> Key: HIVE-3953
> URL: https://issues.apache.org/jira/browse/HIVE-3953
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.11.0, 0.11.1, 0.12.0
>Reporter: Mark Wagner
>Assignee: Mark Wagner
>Priority: Blocker
> Fix For: 0.11.1, 0.12.0
>
> Attachments: avro_partition_test.q, HIVE-3953.1.patch
>
>
> After HIVE-3833, reading partitioned Avro data fails due to missing 
> properties. The "avro.schema.(url|literal)" properties are not making it all 
> the way to the SerDe. Non-partitioned data can still be read.



Review Request: Initialize object inspectors with union of table properties and partition properties

2013-06-04 Thread Mark Wagner

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11632/
---

Review request for hive and Ashutosh Chauhan.


Description
---

Change the initialization of object inspectors and deserializers to use the 
union of partition properties and table properties for partitioned tables. 
There is no change for unpartitioned tables.
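
A minimal sketch of what "union of partition properties and table properties" could look like, for readers who have not opened the diff. The precedence on conflicting keys (partition wins) is an assumption made here for illustration; the class and method names are hypothetical and this is not the code from the patch.

```java
import java.util.Properties;

// Hypothetical sketch; not the MapOperator/PartitionDesc code from the diff.
final class MergedPropertiesSketch {

    // Returns a new Properties containing every table-level entry plus every
    // partition-level entry. ASSUMPTION: on a key conflict the partition value
    // overrides the table value. Only explicit entries are copied, not any
    // "defaults" chained behind either argument.
    static Properties union(Properties table, Properties partition) {
        Properties merged = new Properties();
        merged.putAll(table);
        merged.putAll(partition);
        return merged;
    }
}
```

With this shape, a table-level key such as avro.schema.literal stays visible when a deserializer is initialized for a partition that does not repeat it.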


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/MapOperator.java 9422bf7 
  ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java f0b16e4 
  ql/src/test/queries/clientpositive/avro_partitioned.q PRE-CREATION 
  ql/src/test/results/clientpositive/avro_partitioned.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/11632/diff/


Testing
---

I've done manual end-to-end testing with various queries/tables and have 
created a .q test for reading partitioned Avro tables.


Thanks,

Mark Wagner



[jira] [Commented] (HIVE-2985) Create a new test framework

2013-06-04 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674607#comment-13674607
 ] 

Edward Capriolo commented on HIVE-2985:
---

There is a small cult following around
https://github.com/edwardcapriolo/hive_test. It is a great way to test UDFs
and other input formats and SerDes (anything that does not make changes to
the Hive source code).

> Create a new test framework
> ---
>
> Key: HIVE-2985
> URL: https://issues.apache.org/jira/browse/HIVE-2985
> Project: Hive
>  Issue Type: New Feature
>  Components: Testing Infrastructure
>Reporter: Namit Jain
>Assignee: Namit Jain
>
> The high-level idea is to replicate the deployment framework from Facebook.
> This will help us get the changes tested thoroughly in our environment
> before they are committed.
> Also, it makes it easier for contributors outside Facebook to test/debug
> their changes in this environment
> and make sure they are not breaking anything.



[jira] [Created] (HIVE-4655) Vectorization not working with negative constants, hive doesn't fold constants.

2013-06-04 Thread Jitendra Nath Pandey (JIRA)
Jitendra Nath Pandey created HIVE-4655:
--

 Summary: Vectorization not working with negative constants, hive 
doesn't fold constants.
 Key: HIVE-4655
 URL: https://issues.apache.org/jira/browse/HIVE-4655
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey


  The Hive optimizer doesn't fold constants; however, the vectorized code path
assumes that constants have been folded. This should be fixed in the Hive
optimizer.
  In this JIRA we just fix the vectorization path to handle folding for
negative constants. This is needed because the Hive plan treats negative
constants as unary-minus expressions on constants, so these expressions also
need constant folding.
This fix will become redundant once constant folding is appropriately
implemented in the Hive optimizer. (HIVE-746)
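
As a rough illustration of the folding described above, the sketch below collapses a unary-minus node over a constant into a single negative constant. The tiny expression classes are hypothetical stand-ins invented for this example; they are not Hive's plan or vectorization classes, and the attached patch may be structured differently.

```java
// Hypothetical, simplified expression tree; NOT Hive's expression descriptors.
final class FoldNegativeConstantSketch {

    interface Expr {}

    static final class Constant implements Expr {
        final long value;
        Constant(long value) { this.value = value; }
    }

    static final class Column implements Expr {
        final String name;
        Column(String name) { this.name = name; }
    }

    static final class UnaryMinus implements Expr {
        final Expr child;
        UnaryMinus(Expr child) { this.child = child; }
    }

    // Folds unary-minus applied to a constant into one constant node.
    // Any other expression is returned unchanged.
    static Expr fold(Expr e) {
        if (e instanceof UnaryMinus) {
            Expr child = fold(((UnaryMinus) e).child);
            if (child instanceof Constant) {
                return new Constant(-((Constant) child).value);
            }
            return new UnaryMinus(child);
        }
        return e;
    }
}
```

A plan fragment shaped like UnaryMinus(Constant(5)) becomes Constant(-5), which is the form a code path that expects pre-folded constants can consume, while UnaryMinus(Column("x")) is left as it is.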



[jira] [Commented] (HIVE-2985) Create a new test framework

2013-06-04 Thread Ramana Inukonda Nagaraj (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674596#comment-13674596
 ] 

Ramana Inukonda Nagaraj commented on HIVE-2985:
---

[~namit] Did we get anywhere with this? I am trying to set up an end-to-end
Hive automation suite and would love to reuse some work already done.


> Create a new test framework
> ---
>
> Key: HIVE-2985
> URL: https://issues.apache.org/jira/browse/HIVE-2985
> Project: Hive
>  Issue Type: New Feature
>  Components: Testing Infrastructure
>Reporter: Namit Jain
>Assignee: Namit Jain
>
> The high-level idea is to replicate the deployment framework from Facebook.
> This will help us get the changes tested thoroughly in our environment
> before they are committed.
> Also, it makes it easier for contributors outside Facebook to test/debug
> their changes in this environment
> and make sure they are not breaking anything.



[jira] [Commented] (HIVE-4585) Remove unused MR Temp file localization from Tasks

2013-06-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674551#comment-13674551
 ] 

Hudson commented on HIVE-4585:
--

Integrated in Hive-trunk-h0.21 #2127 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2127/])
HIVE-4585 : Remove unused MR Temp file localization from Tasks (Gunther 
Hagleitner via Ashutosh Chauhan) (Revision 1489279)

 Result = FAILURE
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489279
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ConditionalTask.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/CopyTask.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DependencyCollectionTask.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExplainTask.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FetchTask.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionTask.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapredLocalTask.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/index/IndexMetadataChangeTask.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/stats/PartialScanTask.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/truncate/ColumnTruncateTask.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java


> Remove unused MR Temp file localization from Tasks
> --
>
> Key: HIVE-4585
> URL: https://issues.apache.org/jira/browse/HIVE-4585
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Fix For: 0.12.0
>
> Attachments: HIVE-4585.1.patch
>
>
> HIVE-1408 introduced code that is currently commented out (i.e.: dead code), 
> with a comment saying needs further development (HIVE-1484). It's been like 
> this for close to 3 years. 
> I suggest removing the code until such time that someone picks up that work. 
> At that time they can decide if they want to use this code or pursue another 
> route (FS shim?).



Hive-trunk-h0.21 - Build # 2127 - Still Failing

2013-06-04 Thread Apache Jenkins Server
Changes for Build #2102

Changes for Build #2103
[daijy] PIG-2955: Fix bunch of Pig e2e tests on Windows


Changes for Build #2104
[daijy] PIG-3069: Native Windows Compatibility for Pig E2E Tests and Harness


Changes for Build #2105
[omalley] HIVE-4550 local_mapred_error_cache fails on some hadoop versions 
(Gunther 
Hagleitner via omalley)

[omalley] HIVE-4440 SMB Operator spills to disk like it's 1999 (Gunther 
Hagleitner via
omalley)


Changes for Build #2106

Changes for Build #2107
[omalley] HIVE-4486 FetchOperator slows down SMB map joins by 50% when there 
are many 
partitions (Gopal V via omalley)


Changes for Build #2108

Changes for Build #2109

Changes for Build #2110

Changes for Build #2111
[omalley] HIVE-4475 Switch RCFile default to LazyBinaryColumnarSerDe. (Gunther 
Hagleitner
via omalley)

[omalley] HIVE-4521 Auto join conversion fails in certain cases (Gunther 
Hagleitner via
omalley)


Changes for Build #2112

Changes for Build #2113
[gates] HIVE-4578 Changes to Pig's test harness broke HCat e2e tests (gates)


Changes for Build #2114
[gates] HIVE-4581 HCat e2e tests broken by changes to Hive's describe table 
formatting (gates)


Changes for Build #2115

Changes for Build #2116
[navis] JDBC2: HiveDriver should not throw RuntimeException when passed an 
invalid URL (Richard Ding via Navis)


Changes for Build #2117

Changes for Build #2118

Changes for Build #2119

Changes for Build #2120

Changes for Build #2121
[navis] HIVE-4572 ColumnPruner cannot preserve RS key columns corresponding to 
un-selected join keys in columnExprMap (Yin Huai via Navis)

[navis] HIVE-4540 JOIN-GRP BY-DISTINCT fails with NPE when 
mapjoin.mapreduce=true (Gunther Hagleitner via Navis)


Changes for Build #2122

Changes for Build #2123

Changes for Build #2124
[gates] HIVE-4543 Broken link in HCat doc (Reader and Writer Interfaces) (Lefty 
Leverenz via gates)


Changes for Build #2125
[daijy] PIG-3337: Fix remaining Window e2e tests


Changes for Build #2126
[hashutosh] HIVE-4615 : Invalid column names allowed when created dynamically 
by a SerDe (Gabriel Reid via Ashutosh Chauhan)

[hashutosh] HIVE-3846 : alter view rename NPEs with authorization on. (Teddy 
Choi via Ashutosh Chauhan)

[hashutosh] HIVE-4403 : Running Hive queries on Yarn (MR2) gives warnings 
related to overriding final parameters (Chu Tong via Ashutosh Chauhan)

[hashutosh] HIVE-4610 : HCatalog checkstyle violation after HIVE4578 (Brock 
Noland via Ashutosh Chauhan)

[hashutosh] HIVE-4636 : Failing on TestSemanticAnalysis.testAddReplaceCols in 
trunk (Navis via Ashutosh Chauhan)

[hashutosh] HIVE-4626 : join_vc.q is not deterministic (Navis via Ashutosh 
Chauhan)

[hashutosh] HIVE-4562 : HIVE3393 brought in Jackson library,and these four jars 
should be packed into hive-exec.jar (caofangkun via Ashutosh Chauhan)

[hashutosh] HIVE-4489 : beeline always return the same error message twice 
(Chaoyu Tang via Ashutosh Chauhan)

[hashutosh] HIVE-4510 : HS2 doesn't nest exceptions properly (fun debug times) 
(Thejas Nair via Ashutosh Chauhan)

[hashutosh] HIVE-4535 : hive build fails with hadoop 0.20 (Thejas Nair via 
Ashutosh Chauhan)


Changes for Build #2127
[hashutosh] HIVE-4585 : Remove unused MR Temp file localization from Tasks 
(Gunther Hagleitner via Ashutosh Chauhan)

[hashutosh] HIVE-4418 : TestNegativeCliDriver failure message if cmd succeeds 
is misleading (Thejas Nair via Ashutosh Chauhan)

[navis] HIVE-4620 MR temp directory conflicts in case of parallel execution 
mode (Prasad Mujumdar via Navis)




All tests passed

The Apache Jenkins build system has built Hive-trunk-h0.21 (build #2127)

Status: Still Failing

Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/2127/ to 
view the results.

[jira] [Commented] (HIVE-4418) TestNegativeCliDriver failure message if cmd succeeds is misleading

2013-06-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674550#comment-13674550
 ] 

Hudson commented on HIVE-4418:
--

Integrated in Hive-trunk-h0.21 #2127 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2127/])
HIVE-4418 : TestNegativeCliDriver failure message if cmd succeeds is 
misleading (Thejas Nair via Ashutosh Chauhan) (Revision 1489278)

 Result = FAILURE
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489278
Files : 
* /hive/trunk/ql/src/test/templates/TestNegativeCliDriver.vm


> TestNegativeCliDriver failure message if cmd succeeds is misleading
> ---
>
> Key: HIVE-4418
> URL: https://issues.apache.org/jira/browse/HIVE-4418
> Project: Hive
>  Issue Type: Bug
>  Components: Testing Infrastructure
>Affects Versions: 0.10.0
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Fix For: 0.12.0
>
> Attachments: HIVE-4418.1.patch
>
>
> If the .q test ends up succeeding (exit code == 0), then the test failure 
> message is misleading.
> From the error it seems as if the command actually failed -
> {code}
> [junit] junit.framework.AssertionFailedError: Client Execution failed 
> with error code = 0
> [junit] See build/ql/tmp/hive.log, or try "ant test ... 
> -Dtest.silent=false" to get more logs.
> [junit] at junit.framework.Assert.fail(Assert.java:47)
> [junit] at 
> org.apache.hadoop.hive.cli.TestNegativeCliDriver.runTest(TestNegativeCliDriver.java:121)
> [junit] at 
> org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_desc_tab(TestNegativeCliDriver.java:102)
> {code}
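
One way to picture a less misleading message for this case is sketched below. The wording and the helper are hypothetical; they are not the text or code of the committed change to TestNegativeCliDriver.vm.

```java
// Hypothetical helper; the message wording is an assumption, not the patch.
final class NegativeTestMessageSketch {

    // A negative test passes only if the command fails (non-zero exit code).
    // Returns null when the outcome is as expected, otherwise a message that
    // states the real problem: the command unexpectedly succeeded.
    static String failureMessage(String testName, int exitCode) {
        if (exitCode == 0) {
            return "Negative test " + testName
                + " was expected to fail, but the command succeeded (exit code 0)";
        }
        return null;
    }
}
```

Compared with "Client Execution failed with error code = 0", a message of this kind tells the reader that the test, not the command, is what failed.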



[jira] [Commented] (HIVE-4620) MR temp directory conflicts in case of parallel execution mode

2013-06-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674549#comment-13674549
 ] 

Hudson commented on HIVE-4620:
--

Integrated in Hive-trunk-h0.21 #2127 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2127/])
HIVE-4620 MR temp directory conflicts in case of parallel execution mode 
(Prasad Mujumdar via Navis) (Revision 1489226)

 Result = FAILURE
navis : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489226
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/Context.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/TaskRunner.java


> MR temp directory conflicts in case of parallel execution mode
> --
>
> Key: HIVE-4620
> URL: https://issues.apache.org/jira/browse/HIVE-4620
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 0.11.0
>Reporter: Prasad Mujumdar
>Assignee: Prasad Mujumdar
> Fix For: 0.12.0
>
> Attachments: HIVE-4620-1.patch, HIVE-4620-2.patch, HIVE-4620-3.patch
>
>
> In parallel query execution mode, all the tasks running in parallel end up
> sharing the same temp/scratch directory. This could lead to file conflicts
> and temp files getting deleted before the job completes.
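
The general remedy for this kind of conflict is to give each task its own directory under the shared scratch root. The sketch below shows that idea with java.nio on a local file system; the class name, method name, and the use of a local path are assumptions for illustration, and the committed change to Context.java and TaskRunner.java may work differently.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch on the local file system; not the HIVE-4620 patch.
final class PerTaskScratchDirSketch {

    // Creates a fresh, uniquely named directory for one task under the shared
    // scratch root. Two calls never return the same directory, even when the
    // task id is the same, so one task cannot delete another task's files.
    static Path newTaskScratchDir(Path scratchRoot, String taskId) {
        try {
            Files.createDirectories(scratchRoot);
            return Files.createTempDirectory(scratchRoot, taskId + "-");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Each parallel task would then write and clean up only inside the directory it was handed.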



[jira] [Commented] (HIVE-2304) Support PreparedStatement.setObject

2013-06-04 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674543#comment-13674543
 ] 

Ashutosh Chauhan commented on HIVE-2304:


+1 will commit if tests pass.

> Support PreparedStatement.setObject
> ---
>
> Key: HIVE-2304
> URL: https://issues.apache.org/jira/browse/HIVE-2304
> Project: Hive
>  Issue Type: Sub-task
>  Components: JDBC
>Affects Versions: 0.7.1
>Reporter: Ido Hadanny
>Assignee: Ido Hadanny
>Priority: Minor
> Attachments: HIVE-0.8-SetObject.2.patch.txt
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> PreparedStatement.setObject is important for spring's jdbcTemplate support



[jira] [Updated] (HIVE-4652) VectorHashKeyWrapperBatch.java should be in vector package (instead of exec)

2013-06-04 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4652:
---

   Resolution: Fixed
Fix Version/s: vectorization-branch
   Status: Resolved  (was: Patch Available)

Committed to branch. Thanks, Remus!

>  VectorHashKeyWrapperBatch.java should be in vector package (instead of exec)
> -
>
> Key: HIVE-4652
> URL: https://issues.apache.org/jira/browse/HIVE-4652
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Affects Versions: vectorization-branch
>Reporter: Remus Rusanu
>Assignee: Remus Rusanu
>Priority: Minor
> Fix For: vectorization-branch
>
> Attachments: HIVE-4652.0.patch.txt
>
>
> As the title says



[jira] [Commented] (HIVE-4516) Fix concurrency bug in serde/src/java/org/apache/hadoop/hive/serde2/io/TimestampWritable.java

2013-06-04 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674504#comment-13674504
 ] 

Brock Noland commented on HIVE-4516:


I hit this bug and this patch resolved the issue.

> Fix concurrency bug in 
> serde/src/java/org/apache/hadoop/hive/serde2/io/TimestampWritable.java
> -
>
> Key: HIVE-4516
> URL: https://issues.apache.org/jira/browse/HIVE-4516
> Project: Hive
>  Issue Type: Bug
>Reporter: Jon Hartlaub
> Attachments: HIVE-4516.D10929.1.patch, TimestampWritable.java.patch
>
>
> A patch for concurrent use of TimestampWritable which occurs in a 
> multithreaded scenario (as found in AmpLab Shark).  A static SimpleDateFormat 
> (not ThreadSafe) is used by TimestampWritable in CTAS DDL statements where it 
> manifests as data corruption when used in a concurrent environment.
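
The hazard described above comes from sharing one SimpleDateFormat across threads. A minimal sketch of one common remedy, a per-thread formatter, is shown below. This is not the attached patch: the class name ThreadSafeTimestampFormat and the pattern string are assumptions made for illustration only.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical helper, not Hive code: gives every thread its own formatter.
public class ThreadSafeTimestampFormat {
    // SimpleDateFormat keeps mutable internal state, so a single static
    // instance must not be used concurrently. ThreadLocal hands each thread
    // a private instance, created lazily on first use.
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
        new ThreadLocal<SimpleDateFormat>() {
            @Override
            protected SimpleDateFormat initialValue() {
                // Assumed pattern, chosen only for the example.
                return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
            }
        };

    public static String format(Date date) {
        return FORMAT.get().format(date);
    }

    public static Date parse(String text) {
        try {
            return FORMAT.get().parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a timestamp: " + text, e);
        }
    }
}
```

Creating a new SimpleDateFormat per call would also be safe. The per-thread instance simply avoids repeated construction on a hot path.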



[jira] [Commented] (HIVE-4526) auto_sortmerge_join_9.q throws NPE but test is succeeded

2013-06-04 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674451#comment-13674451
 ] 

Phabricator commented on HIVE-4526:
---

ashutoshc has accepted the revision "HIVE-4526 [jira] auto_sortmerge_join_9.q 
throws NPE but test is succeeded".

  +1

REVISION DETAIL
  https://reviews.facebook.net/D10725

BRANCH
  HIVE-4526

ARCANIST PROJECT
  hive

To: JIRA, ashutoshc, navis


> auto_sortmerge_join_9.q throws NPE but test is succeeded
> 
>
> Key: HIVE-4526
> URL: https://issues.apache.org/jira/browse/HIVE-4526
> Project: Hive
>  Issue Type: Test
>  Components: Tests
>Reporter: Navis
>Assignee: Navis
> Attachments: HIVE-4526.D10725.1.patch
>
>
> auto_sortmerge_join_9.q
> {noformat}
> [junit] Running org.apache.hadoop.hive.cli.TestCliDriver
> [junit] Begin query: auto_sortmerge_join_9.q
> [junit] Deleted 
> file:/home/navis/apache/oss-hive/build/ql/test/data/warehouse/tbl1
> [junit] Deleted 
> file:/home/navis/apache/oss-hive/build/ql/test/data/warehouse/tbl2
> [junit] org.apache.hadoop.hive.ql.metadata.HiveException: Failed with 
> exception nulljava.lang.NullPointerException
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getRowInspectorFromPartitionedTable(FetchOperator.java:252)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:605)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676)
> [junit]   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> [junit]   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> [junit]   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> [junit]   at java.lang.reflect.Method.invoke(Method.java:597)
> [junit]   at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> [junit] 
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:631)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676)
> [junit]   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> [junit]   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> [junit]   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> [junit]   at java.lang.reflect.Method.invoke(Method.java:597)
> [junit]   at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> [junit] org.apache.hadoop.hive.ql.metadata.HiveException: Failed with 
> exception nulljava.lang.NullPointerException
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getRowInspectorFromPartitionedTable(FetchOperator.java:252)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:605)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676)
> [junit]   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> [junit]   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> [junit]   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> [junit]   at java.lang.reflect.Method.invoke(Method.java:597)
> [junit]   at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> [junit] 
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:631)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676)
> [junit]   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> [junit]   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:3

[jira] [Updated] (HIVE-4172) JDBC2 does not support VOID type

2013-06-04 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4172:
---

Affects Version/s: 0.11.0
   Status: Open  (was: Patch Available)

Patch looks good, but in JDBC parlance [1] there is a type called {{null}} and 
there is no type called {{void}}. Shall we also name this type null instead 
of void, especially given that we are mapping it to java.sql.Types.NULL 
anyway in our implementation?

[1] http://docs.oracle.com/javase/6/docs/api/java/sql/Types.html#NULL
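
To make the naming question concrete, here is a small sketch of a type-name lookup in which both spellings resolve to java.sql.Types.NULL. The class HiveTypeToJdbcSketch and its short list of type names are hypothetical and are not taken from the patch. Only the java.sql.Types constants are real.

```java
import java.sql.Types;
import java.util.Locale;

// Hypothetical lookup, for illustration only.
public class HiveTypeToJdbcSketch {
    public static int toSqlType(String hiveTypeName) {
        switch (hiveTypeName.toLowerCase(Locale.ROOT)) {
            case "void":   // current Hive-side name for an untyped NULL column
            case "null":   // the name JDBC itself uses
                return Types.NULL;
            case "int":
                return Types.INTEGER;
            case "bigint":
                return Types.BIGINT;
            case "double":
                return Types.DOUBLE;
            case "string":
                return Types.VARCHAR;
            default:
                throw new IllegalArgumentException("Unmapped type: " + hiveTypeName);
        }
    }
}
```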

> JDBC2 does not support VOID type
> 
>
> Key: HIVE-4172
> URL: https://issues.apache.org/jira/browse/HIVE-4172
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, JDBC
>Affects Versions: 0.11.0
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
>  Labels: HiveServer2
> Attachments: HIVE-4172.D9555.1.patch, HIVE-4172.D9555.2.patch, 
> HIVE-4172.D9555.3.patch, HIVE-4172.D9555.4.patch
>
>
> In beeline, "select key, null from src" fails with exception,
> {noformat}
> org.apache.hive.service.cli.HiveSQLException: Error running query: 
> java.lang.NullPointerException
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.run(SQLOperation.java:112)
>   at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:166)
>   at 
> org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:148)
>   at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:183)
>   at 
> org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1133)
>   at 
> org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1118)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>   at 
> org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:39)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> {noformat}



[jira] [Commented] (HIVE-3953) Reading of partitioned Avro data fails because of missing properties

2013-06-04 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674427#comment-13674427
 ] 

Ashutosh Chauhan commented on HIVE-3953:


Thanks [~mwagner] for taking a look. Can you create a Phabricator or RB entry for 
this? It's easier to provide feedback via those.

> Reading of partitioned Avro data fails because of missing properties
> 
>
> Key: HIVE-3953
> URL: https://issues.apache.org/jira/browse/HIVE-3953
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.11.0, 0.11.1, 0.12.0
>Reporter: Mark Wagner
>Assignee: Mark Wagner
>Priority: Blocker
> Fix For: 0.11.1, 0.12.0
>
> Attachments: avro_partition_test.q, HIVE-3953.1.patch
>
>
> After HIVE-3833, reading partitioned Avro data fails due to missing 
> properties. The "avro.schema.(url|literal)" properties are not making it all 
> the way to the SerDe. Non-partitioned data can still be read.



[jira] [Commented] (HIVE-4516) Fix concurrency bug in serde/src/java/org/apache/hadoop/hive/serde2/io/TimestampWritable.java

2013-06-04 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674424#comment-13674424
 ] 

Phabricator commented on HIVE-4516:
---

ashutoshc has accepted the revision "HIVE-4516 [jira] Fix concurrency bug in 
serde/src/java/org/apache/hadoop/hive/serde2/io/TimestampWritable.java".

  +1

REVISION DETAIL
  https://reviews.facebook.net/D10929

BRANCH
  HIVE-4516

ARCANIST PROJECT
  hive

To: JIRA, ashutoshc, navis


> Fix concurrency bug in 
> serde/src/java/org/apache/hadoop/hive/serde2/io/TimestampWritable.java
> -
>
> Key: HIVE-4516
> URL: https://issues.apache.org/jira/browse/HIVE-4516
> Project: Hive
>  Issue Type: Bug
>Reporter: Jon Hartlaub
> Attachments: HIVE-4516.D10929.1.patch, TimestampWritable.java.patch
>
>
> A patch for concurrent use of TimestampWritable which occurs in a 
> multithreaded scenario (as found in AmpLab Shark).  A static SimpleDateFormat 
> (not ThreadSafe) is used by TimestampWritable in CTAS DDL statements where it 
> manifests as data corruption when used in a concurrent environment.



[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions

2013-06-04 Thread Teddy Choi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674426#comment-13674426
 ] 

Teddy Choi commented on HIVE-4642:
--

Here is my draft spec. Please leave a comment.

The base version can be implemented easily with the basic template and the 
UDFRegExp class. It will be expensive and will need further optimization.

Problem: A regular expression matcher is 10 or more times slower than a 
prefix/suffix matcher (as shown in HIVE-4548). Because the Pattern is already 
compiled, it is hard to optimize the Pattern itself any further. Matchers don't 
depend on each other, so they can be distributed over threads. Also, the base 
version will create new objects per call. This can be implemented more 
efficiently.

Goal: Reduce object creation per call, and distribute the matching load over 
multiple threads.

Cache and reuse a compiled pattern, a byte buffer, a char buffer, and a UTF-8 
decoder, as in HIVE-4548.

Divide the matching tasks into groups and run each group on a different thread, 
or apply the producer-consumer pattern. If there are enough idle CPU cores, 
total execution time will be reduced significantly.

If feasible, implement prefix/suffix matchers for further optimization. 
People may prefer the LIKE filter for simpler filtering, so these matchers may 
not be used frequently, but they will run faster.
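
The caching part of the spec above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class CachedRegexFilter, its filter(byte[][], int[]) signature, and the initial buffer size are made up here and are not Hive's vectorization API. It uses find() semantics and leaves the multi-threaded part out.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: one instance per thread, since Matcher and
// CharsetDecoder are not safe for concurrent use.
public class CachedRegexFilter {
    private final Matcher matcher;          // compiled once, reset for every row
    private final CharsetDecoder decoder;   // reused UTF-8 decoder
    private CharBuffer chars = CharBuffer.allocate(256); // reused, grown on demand

    public CachedRegexFilter(String regex) {
        this.matcher = Pattern.compile(regex).matcher("");
        this.decoder = StandardCharsets.UTF_8.newDecoder()
            .onMalformedInput(CodingErrorAction.REPLACE)
            .onUnmappableCharacter(CodingErrorAction.REPLACE);
    }

    // Decodes UTF-8 bytes into the shared char buffer. UTF-8 never yields
    // more UTF-16 chars than it has bytes, so bytes.length is a safe capacity.
    private CharBuffer decode(byte[] bytes) {
        if (chars.capacity() < bytes.length) {
            chars = CharBuffer.allocate(bytes.length);
        }
        chars.clear();
        decoder.reset();
        // ByteBuffer.wrap still allocates a small wrapper object per row.
        ByteBuffer in = ByteBuffer.wrap(bytes);
        decoder.decode(in, chars, true);
        decoder.flush(chars);
        chars.flip();
        return chars;
    }

    // Writes the indices of matching rows into 'selected' and returns how many matched.
    public int filter(byte[][] rows, int[] selected) {
        int count = 0;
        for (int i = 0; i < rows.length; i++) {
            if (matcher.reset(decode(rows[i])).find()) {
                selected[count++] = i;
            }
        }
        return count;
    }
}
```

A caller would create one filter per worker thread and pass it each batch of rows together with a selection array at least as long as the batch.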

> Implement vectorized RLIKE and REGEXP filter expressions
> 
>
> Key: HIVE-4642
> URL: https://issues.apache.org/jira/browse/HIVE-4642
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Eric Hanson
>Assignee: Teddy Choi
>
> See title. I will add more details next week. The goal is (a) make this work 
> correctly and (b) optimize it as well as possible, at least for the common 
> cases.



[jira] [Updated] (HIVE-2615) CTAS with literal NULL creates VOID type

2013-06-04 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-2615:
---

Fix Version/s: (was: 0.12.0)
Affects Version/s: 0.7.0
   0.8.0
   0.9.0
   0.10.0
   0.11.0
   Status: Open  (was: Patch Available)

Instead of SemanticAnalyzer, a better place to do this check is 
TypeCheckProcFactory.java.

> CTAS with literal NULL creates VOID type
> 
>
> Key: HIVE-2615
> URL: https://issues.apache.org/jira/browse/HIVE-2615
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.11.0, 0.10.0, 0.9.0, 0.8.0, 0.7.0, 0.6.0
>Reporter: David Phillips
>Assignee: Zhuoluo (Clark) Yang
> Attachments: HIVE-2615.1.patch
>
>
> Create the table with a column that always contains NULL:
> {quote}
> hive> create table bad as select 1 x, null z from dual; 
> {quote}
> Because there's no type, Hive gives it the VOID type:
> {quote}
> hive> describe bad;
> OK
> x int 
> z void
> {quote}
> This seems weird, because AFAIK, there is no normal way to create a column of 
> type VOID.  The problem is that the table can't be queried:
> {quote}
> hive> select * from bad;
> OK
> Failed with exception java.io.IOException:java.lang.RuntimeException: 
> Internal error: no LazyObject for VOID
> {quote}
> Worse, even if you don't select that field, the query fails at runtime:
> {quote}
> hive> select x from bad;
> ...
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.MapRedTask
> {quote}



[jira] [Updated] (HIVE-4561) Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 ,if all the column values larger than 0.0 (or if all column values smaller than 0.0)

2013-06-04 Thread Zhuoluo (Clark) Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhuoluo (Clark) Yang updated HIVE-4561:
---

Status: Open  (was: Patch Available)

[~ashutoshc] The values sound quite strange. I will try to make a new patch.

> Column stats :  LOW_VALUE (or HIGH_VALUE) will always be 0. ,if all the 
> column values larger than 0.0 (or if all column values smaller than 0.0)
> 
>
> Key: HIVE-4561
> URL: https://issues.apache.org/jira/browse/HIVE-4561
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 0.12.0
>Reporter: caofangkun
>Assignee: Zhuoluo (Clark) Yang
> Attachments: HIVE-4561.1.patch, HIVE-4561.2.patch, HIVE-4561.3.patch
>
>
> If all column values are larger than 0.0, DOUBLE_LOW_VALUE will always be 0.0, 
> or if all column values are less than 0.0, DOUBLE_HIGH_VALUE will always be 
> hive (default)> create table src_test (price double);
> hive (default)> load data local inpath './test.txt' into table src_test;
> hive (default)> select * from src_test;
> OK
> 1.0
> 2.0
> 3.0
> Time taken: 0.313 seconds, Fetched: 3 row(s)
> hive (default)> analyze table src_test compute statistics for columns price;
> mysql> select * from TAB_COL_STATS \G;
>  CS_ID: 16
>DB_NAME: default
> TABLE_NAME: src_test
>COLUMN_NAME: price
>COLUMN_TYPE: double
> TBL_ID: 2586
> LONG_LOW_VALUE: 0
>LONG_HIGH_VALUE: 0
>   DOUBLE_LOW_VALUE: 0.   # Wrong Result ! Expected is 1.
>  DOUBLE_HIGH_VALUE: 3.
>  BIG_DECIMAL_LOW_VALUE: NULL
> BIG_DECIMAL_HIGH_VALUE: NULL
>  NUM_NULLS: 0
>  NUM_DISTINCTS: 1
>AVG_COL_LEN: 0.
>MAX_COL_LEN: 0
>  NUM_TRUES: 0
> NUM_FALSES: 0
>  LAST_ANALYZED: 1368596151
> 2 rows in set (0.00 sec)



[jira] [Commented] (HIVE-4561) Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 ,if all the column values larger than 0.0 (or if all column values smaller than 0.0)

2013-06-04 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674383#comment-13674383
 ] 

Ashutosh Chauhan commented on HIVE-4561:


Is that correct? For an empty table, 
"min":9223372036854775807,"max":-9223372036854775808 doesn't sound right.

> Column stats :  LOW_VALUE (or HIGH_VALUE) will always be 0. ,if all the 
> column values larger than 0.0 (or if all column values smaller than 0.0)
> 
>
> Key: HIVE-4561
> URL: https://issues.apache.org/jira/browse/HIVE-4561
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 0.12.0
>Reporter: caofangkun
>Assignee: Zhuoluo (Clark) Yang
> Attachments: HIVE-4561.1.patch, HIVE-4561.2.patch, HIVE-4561.3.patch
>
>



[jira] [Updated] (HIVE-4561) Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 ,if all the column values larger than 0.0 (or if all column values smaller than 0.0)

2013-06-04 Thread Zhuoluo (Clark) Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhuoluo (Clark) Yang updated HIVE-4561:
---

Status: Patch Available  (was: Open)

> Column stats :  LOW_VALUE (or HIGH_VALUE) will always be 0. ,if all the 
> column values larger than 0.0 (or if all column values smaller than 0.0)
> 
>
> Key: HIVE-4561
> URL: https://issues.apache.org/jira/browse/HIVE-4561
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 0.12.0
>Reporter: caofangkun
>Assignee: Zhuoluo (Clark) Yang
> Attachments: HIVE-4561.1.patch, HIVE-4561.2.patch, HIVE-4561.3.patch
>
>



Re: Review Request: Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 , if all the column values larger than 0.0 (or if all column values smaller than 0.0)

2013-06-04 Thread Zhuoluo Yang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11172/
---

(Updated June 4, 2013, 1:41 p.m.)


Review request for hive, Carl Steinbach, Ashutosh Chauhan, 
Shreepadma Venugopalan, and fangkun cao.


Changes
---

Sorry for overlooking the empty table case.
I think a UT fix is quite a simple way. Shall we make the min/max values zero if 
the table is empty?


Description
---

An initialization error.
Make double and long initialize correctly.
Would you review that and assign the issue to me?
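
For readers following the thread, the initialization problem can be illustrated with a small sketch. It is not the code in GenericUDAFComputeStats. The class DoubleMinMaxSketch is hypothetical, and reporting zero for an empty input is just the option raised in the "Changes" note above, shown here as one possible choice.

```java
// Hypothetical min/max tracker, for illustration only.
public class DoubleMinMaxSketch {
    // Starting both at 0.0 is the bug pattern: an all-positive column can
    // never push min below 0.0, and an all-negative one never lifts max above it.
    // Note that Double.MIN_VALUE is the smallest positive double, so it is
    // not a usable starting point for max either.
    private double min = Double.POSITIVE_INFINITY;
    private double max = Double.NEGATIVE_INFINITY;
    private long count = 0;

    public void add(double value) {
        if (value < min) {
            min = value;
        }
        if (value > max) {
            max = value;
        }
        count++;
    }

    public boolean isEmpty() {
        return count == 0;
    }

    // Assumed policy: report 0.0 for an empty input instead of the sentinels.
    public double getMin() {
        return isEmpty() ? 0.0 : min;
    }

    public double getMax() {
        return isEmpty() ? 0.0 : max;
    }
}
```

With the sample data from the bug report (1.0, 2.0, 3.0) this tracker yields a minimum of 1.0 and a maximum of 3.0, and an empty input reports 0.0 for both rather than leaking the sentinel values.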


This addresses bug HIVE-4561.
https://issues.apache.org/jira/browse/HIVE-4561


Diffs (updated)
-

  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFComputeStats.java
 1489292 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/compute_stats_empty_table.q.out
 1489292 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/compute_stats_long.q.out
 1489292 

Diff: https://reviews.apache.org/r/11172/diff/


Testing
---

ant test -Dtestcase=TestCliDriver -Dqfile=compute_stats_long.q
ant test -Dtestcase=TestCliDriver -Dqfile=compute_stats_double.q

done.


Thanks,

Zhuoluo Yang



[jira] [Resolved] (HIVE-4608) Vectorized UDFs for Timestamp in nanoseconds

2013-06-04 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-4608.


   Resolution: Fixed
Fix Version/s: vectorization-branch

Committed to branch. Thanks, Gopal!

> Vectorized UDFs for Timestamp in nanoseconds
> 
>
> Key: HIVE-4608
> URL: https://issues.apache.org/jira/browse/HIVE-4608
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: vectorization-branch
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Minor
>  Labels: vectorization
> Fix For: vectorization-branch
>
> Attachments: 
> 0001-Vectorized-UDFs-for-timestamp-functions-which-accept.patch, 
> 0002-Update-patch-to-the-review-comments-in-https-reviews.patch, 
> 0003-rebased-to-apache-hive.patch
>
>
> Vectorized UDFs for timestamp functions which accept long vectors
> VectorUDFYearLong   
> VectorUDFMonthLong
> VectorUDFWeekOfYearLong   
> VectorUDFDayOfMonthLong
> VectorUDFHourLong   
> VectorUDFMinuteLong
> VectorUDFSecondLong   
> VectorUDFUnixTimeStampLong 
> and tests for them against their non-vectorized implementation.



[jira] [Updated] (HIVE-4561) Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 ,if all the column values larger than 0.0 (or if all column values smaller than 0.0)

2013-06-04 Thread Zhuoluo (Clark) Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhuoluo (Clark) Yang updated HIVE-4561:
---

Attachment: HIVE-4561.3.patch

Fix compute_stats_empty_table.q test results.

> Column stats :  LOW_VALUE (or HIGH_VALUE) will always be 0. ,if all the 
> column values larger than 0.0 (or if all column values smaller than 0.0)
> 
>
> Key: HIVE-4561
> URL: https://issues.apache.org/jira/browse/HIVE-4561
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 0.12.0
>Reporter: caofangkun
>Assignee: Zhuoluo (Clark) Yang
> Attachments: HIVE-4561.1.patch, HIVE-4561.2.patch, HIVE-4561.3.patch
>
>



[jira] [Resolved] (HIVE-4646) skewjoin.q is failing in hadoop2

2013-06-04 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-4646.


   Resolution: Fixed
Fix Version/s: 0.12.0

Committed to trunk. Thanks, Navis!

> skewjoin.q is failing in hadoop2
> 
>
> Key: HIVE-4646
> URL: https://issues.apache.org/jira/browse/HIVE-4646
> Project: Hive
>  Issue Type: Test
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
> Fix For: 0.12.0
>
> Attachments: HIVE-4646.D11043.1.patch
>
>
> https://issues.apache.org/jira/browse/HDFS-538 changed the behavior to throw an 
> exception instead of returning null for a non-existent path, but the skew 
> resolver depends on the old behavior.
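
The kind of adjustment a caller needs after such a contract change can be sketched with plain JDK classes. This is an analogy only: it uses java.nio.file rather than the Hadoop FileSystem API, and the class ListOrEmptySketch is hypothetical, not the committed fix.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, JDK-only analogy for a listing call that throws
// on a missing path instead of returning null.
public class ListOrEmptySketch {
    public static List<Path> listOrEmpty(Path dir) {
        List<Path> entries = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                entries.add(p);
            }
        } catch (NoSuchFileException e) {
            // Missing path: map the exception back to "nothing to process",
            // which is what a null return used to signal to the caller.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return entries;
    }
}
```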



[jira] [Commented] (HIVE-4566) NullPointerException if typeinfo and nativesql commands are executed at beeline before a DB connection is established

2013-06-04 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674372#comment-13674372
 ] 

Ashutosh Chauhan commented on HIVE-4566:


+1

> NullPointerException if typeinfo and nativesql commands are executed at 
> beeline before a DB connection is established
> -
>
> Key: HIVE-4566
> URL: https://issues.apache.org/jira/browse/HIVE-4566
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.11.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-4566.patch, HIVE-4566.patch.1
>
>
> Before a DB connection is established, executing a command such as typeinfo 
> or nativesql results in an NPE shown at the console:
> beeline> !typeinfo
> java.lang.NullPointerException
> beeline> !nativesql
> java.lang.NullPointerException
> Instead, a message such as "No current connection" should be given, as in 
> the case of some other commands, such as dropall.
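
The requested behavior amounts to checking for a connection before dereferencing it. A minimal sketch follows. The class ConnectionGuardSketch and its methods are hypothetical stand-ins and do not reflect Beeline's actual command classes.

```java
// Hypothetical command holder, for illustration only.
public class ConnectionGuardSketch {
    // Stand-in for the current DB connection; stays null until connect() is called.
    private Object connection;

    public void connect(Object conn) {
        this.connection = conn;
    }

    public String typeinfo() {
        // Guard first, so the user sees a message instead of a NullPointerException.
        if (connection == null) {
            return "No current connection";
        }
        return "type info for " + connection;
    }
}
```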



[jira] [Resolved] (HIVE-4377) Add more comment to https://reviews.facebook.net/D1209 (HIVE-2340)

2013-06-04 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-4377.


   Resolution: Fixed
Fix Version/s: 0.12.0

Committed to trunk. Thanks, Navis!

> Add more comment to https://reviews.facebook.net/D1209 (HIVE-2340)
> --
>
> Key: HIVE-4377
> URL: https://issues.apache.org/jira/browse/HIVE-4377
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Gang Tim Liu
>Assignee: Navis
> Fix For: 0.12.0
>
> Attachments: HIVE-4377.D10377.1.patch, HIVE-4377.D10377.2.patch, 
> HIVE-4377.D10377.3.patch
>
>
> Thanks a lot for addressing optimization in HIVE-2340. Awesome!
> Since we are developing at a very fast pace, it would be really useful to
> think about maintainability and testing of the large codebase. Highlights 
> which are applicable for D1209:
>   1.  Javadoc for all public/private functions, except for
> setters/getters. For any complex function, clear examples (input/output)
> would really help.
>   2.  Especially for query optimizations, it might be a good idea to have
> a simple working query at the top, and the expected changes: for example,
> the operator tree for that query at each step, or a detailed explanation
> at the top.
>   3.  If possible, the test name (.q file) where the function is being
> invoked, or the query which would potentially test that scenario, if it
> is a query processor change.
>   4.  Comments in each test (.q file) that should include the JIRA
> number, what it is trying to test, and assumptions about each query.
>   5.  Reduce the output for each test: whenever a query is outputting more
> than 10 results, it should have a reason. Otherwise, each query result
> should be bounded by 10 rows.
> Thanks a lot



[jira] [Updated] (HIVE-4561) Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 ,if all the column values larger than 0.0 (or if all column values smaller than 0.0)

2013-06-04 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4561:
---

Status: Open  (was: Patch Available)

Test {{compute_stats_empty_table.q}} failed.

> Column stats :  LOW_VALUE (or HIGH_VALUE) will always be 0. ,if all the 
> column values larger than 0.0 (or if all column values smaller than 0.0)
> 
>
> Key: HIVE-4561
> URL: https://issues.apache.org/jira/browse/HIVE-4561
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 0.12.0
>Reporter: caofangkun
>Assignee: Zhuoluo (Clark) Yang
> Attachments: HIVE-4561.1.patch, HIVE-4561.2.patch
>
>
> If all column values are larger than 0.0, DOUBLE_LOW_VALUE will always be 0.0;
> or, if all column values are less than 0.0, DOUBLE_HIGH_VALUE will always be 0.0.
> hive (default)> create table src_test (price double);
> hive (default)> load data local inpath './test.txt' into table src_test;
> hive (default)> select * from src_test;
> OK
> 1.0
> 2.0
> 3.0
> Time taken: 0.313 seconds, Fetched: 3 row(s)
> hive (default)> analyze table src_test compute statistics for columns price;
> mysql> select * from TAB_COL_STATS \G;
>  CS_ID: 16
>DB_NAME: default
> TABLE_NAME: src_test
>COLUMN_NAME: price
>COLUMN_TYPE: double
> TBL_ID: 2586
> LONG_LOW_VALUE: 0
>LONG_HIGH_VALUE: 0
>   DOUBLE_LOW_VALUE: 0.   # Wrong Result ! Expected is 1.
>  DOUBLE_HIGH_VALUE: 3.
>  BIG_DECIMAL_LOW_VALUE: NULL
> BIG_DECIMAL_HIGH_VALUE: NULL
>  NUM_NULLS: 0
>  NUM_DISTINCTS: 1
>AVG_COL_LEN: 0.
>MAX_COL_LEN: 0
>  NUM_TRUES: 0
> NUM_FALSES: 0
>  LAST_ANALYZED: 1368596151
> 2 rows in set (0.00 sec)
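
One common way this kind of symptom arises is seeding the running minimum and maximum with 0.0 before scanning the column. Whether that is the actual cause in Hive is not stated in this thread; the sketch below is only an illustration, and the class and method names are hypothetical.

```java
// Illustrative sketch only; this is not the Hive statistics code.
final class DoubleRangeSketch {
    // Seeds both bounds with 0.0, so 0.0 wins whenever every value
    // lies on one side of zero.
    static double[] rangeSeededWithZero(double[] values) {
        double low = 0.0;
        double high = 0.0;
        for (double v : values) {
            low = Math.min(low, v);
            high = Math.max(high, v);
        }
        return new double[] {low, high};
    }

    // Seeds the bounds with infinities, so the first value seen
    // always replaces them.
    static double[] rangeSeededWithInfinity(double[] values) {
        double low = Double.POSITIVE_INFINITY;
        double high = Double.NEGATIVE_INFINITY;
        for (double v : values) {
            low = Math.min(low, v);
            high = Math.max(high, v);
        }
        return new double[] {low, high};
    }
}
```

With the values 1.0, 2.0, 3.0 from the report, the zero-seeded variant yields a low of 0.0 while the infinity-seeded variant yields 1.0. Note that an empty input leaves the infinity seeds untouched, so a real implementation would need to handle empty tables separately.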



[jira] [Updated] (HIVE-4546) Hive CLI leaves behind the per session resource directory on non-interactive invocation

2013-06-04 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4546:
---

   Resolution: Fixed
Fix Version/s: 0.12.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Prasad!

> Hive CLI leaves behind the per session resource directory on non-interactive 
> invocation
> ---
>
> Key: HIVE-4546
> URL: https://issues.apache.org/jira/browse/HIVE-4546
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.11.0
>Reporter: Prasad Mujumdar
>Assignee: Prasad Mujumdar
> Fix For: 0.12.0
>
> Attachments: HIVE-4546-1.patch, HIVE-4546-2.patch
>
>
> As part of HIVE-4505, the resource directory is set to 
> /tmp/${hive.session.id}_resources and is supposed to be removed at the end. The 
> CLI fails to remove it when invoked using -f or -e (non-interactive mode).



[jira] [Commented] (HIVE-4566) NullPointerException if typeinfo and nativesql commands are executed at beeline before a DB connection is established

2013-06-04 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674328#comment-13674328
 ] 

Xuefu Zhang commented on HIVE-4566:
---

Patch updated with the following assertions in the test case:

+   Assert.assertTrue( output.contains("No current connection") );

Command-line console output with the fix:

beeline> !typeinfo   
No current connection
beeline> !nativesql
No current connection



> NullPointerException if typeinfo and nativesql commands are executed at 
> beeline before a DB connection is established
> -
>
> Key: HIVE-4566
> URL: https://issues.apache.org/jira/browse/HIVE-4566
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.11.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-4566.patch, HIVE-4566.patch.1
>
>
> Before a DB connection is established, executing a command such as typeinfo 
> or nativesql results in an NPE shown at the console:
> beeline> !typeinfo
> java.lang.NullPointerException
> beeline> !nativesql
> java.lang.NullPointerException
> Instead, a message, such as "No current connection" should be given, as in 
> case of some other commands, such as dropall.
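
The behavior requested above amounts to a null check before the command touches the connection. A minimal sketch follows, assuming a hypothetical `Connection` stand-in and a hypothetical `typeinfo` helper; it is not the Beeline source and does not reflect the attached patch.

```java
// Illustrative sketch only; names here are hypothetical, not Beeline's API.
final class ConnectionGuardSketch {
    /** Stand-in for a live database connection. */
    interface Connection {
        String describeTypes();
    }

    // Returns a friendly message instead of dereferencing a null connection.
    static String typeinfo(Connection current) {
        if (current == null) {
            return "No current connection";
        }
        return current.describeTypes();
    }
}
```

Calling `ConnectionGuardSketch.typeinfo(null)` returns the message rather than throwing, which matches the console output quoted in the comment.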



[jira] [Updated] (HIVE-4566) NullPointerException if typeinfo and nativesql commands are executed at beeline before a DB connection is established

2013-06-04 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-4566:
--

Attachment: HIVE-4566.patch.1

> NullPointerException if typeinfo and nativesql commands are executed at 
> beeline before a DB connection is established
> -
>
> Key: HIVE-4566
> URL: https://issues.apache.org/jira/browse/HIVE-4566
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.11.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-4566.patch, HIVE-4566.patch.1
>
>
> Before a DB connection is established, executing a command such as typeinfo 
> or nativesql results in an NPE shown at the console:
> beeline> !typeinfo
> java.lang.NullPointerException
> beeline> !nativesql
> java.lang.NullPointerException
> Instead, a message, such as "No current connection" should be given, as in 
> case of some other commands, such as dropall.



[jira] [Updated] (HIVE-2670) A cluster test utility for Hive

2013-06-04 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-2670:
-

   Resolution: Fixed
Fix Version/s: 0.12.0
   Status: Resolved  (was: Patch Available)

Patch checked in.  Thanks Johnny for working on this, and Daniel for the review.

> A cluster test utility for Hive
> ---
>
> Key: HIVE-2670
> URL: https://issues.apache.org/jira/browse/HIVE-2670
> Project: Hive
>  Issue Type: New Feature
>  Components: Testing Infrastructure
>Reporter: Alan Gates
>Assignee: Johnny Zhang
> Fix For: 0.12.0
>
> Attachments: harness.tar, HIVE-2670_5.patch, HIVE-2670_6.patch, 
> hive_cluster_test_2.patch, hive_cluster_test_3.patch, 
> hive_cluster_test_4.patch, hive_cluster_test.patch
>
>
> Hive has an extensive set of unit tests, but it does not have an 
> infrastructure for testing in a cluster environment.  Pig and HCatalog have 
> been using a test harness for cluster testing for some time.  We have written 
> Hive drivers and tests to run in this harness.



[jira] [Commented] (HIVE-4615) Invalid column names allowed when created dynamically by a SerDe

2013-06-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674229#comment-13674229
 ] 

Hudson commented on HIVE-4615:
--

Integrated in Hive-trunk-hadoop2 #224 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/224/])
HIVE-4615 : Invalid column names allowed when created dynamically by a 
SerDe (Gabriel Reid via Ashutosh Chauhan) (Revision 1489013)

 Result = FAILURE
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489013
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java
* /hive/trunk/ql/src/test/queries/clientnegative/invalid_columns.q
* /hive/trunk/ql/src/test/results/clientnegative/invalid_columns.q.out


> Invalid column names allowed when created dynamically by a SerDe
> 
>
> Key: HIVE-4615
> URL: https://issues.apache.org/jira/browse/HIVE-4615
> Project: Hive
>  Issue Type: Bug
>Reporter: Gabriel Reid
>Assignee: Gabriel Reid
> Fix For: 0.12.0
>
> Attachments: HIVE-4615.1.patch.txt
>
>
> When a SerDe creates columns dynamically during table creation, there is no 
> checking done on the validity of the created column names. This means that 
> it's possible to create a table that contains columns that can't be queried, 
> and will lead to issues when trying to query the created table.
> The same column name validation should be performed for dynamically-created 
> columns as for other column names.
> This behavior can be easily tested using the TestSerDe, and including a 
> column name that includes an invalid identifier character (e.g. a period) in 
> the list of columns to create.
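
The idea of applying one validation rule to every column name, however the column was produced, can be sketched as follows. The rule used here (letters, digits and underscores only) is an assumption made for illustration; the actual rule Hive applies is not given in this thread, and the class name is hypothetical.

```java
// Illustrative sketch only; the validity rule is an assumption.
final class ColumnNameCheckSketch {
    // Assumed rule: one or more ASCII letters, digits or underscores.
    static boolean isValid(String name) {
        return name != null && name.matches("[A-Za-z0-9_]+");
    }

    // Returns the first invalid name, or null when every name passes.
    static String firstInvalid(String[] names) {
        for (String name : names) {
            if (!isValid(name)) {
                return name;
            }
        }
        return null;
    }
}
```

Under this assumed rule a name containing a period, such as `user.id`, is rejected, which is the case the description suggests testing with TestSerDe.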



[jira] [Commented] (HIVE-4403) Running Hive queries on Yarn (MR2) gives warnings related to overriding final parameters

2013-06-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674228#comment-13674228
 ] 

Hudson commented on HIVE-4403:
--

Integrated in Hive-trunk-hadoop2 #224 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/224/])
HIVE-4403 : Running Hive queries on Yarn (MR2) gives warnings related to 
overriding final parameters (Chu Tong via Ashutosh Chauhan) (Revision 1489008)

 Result = FAILURE
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489008
Files : 
* /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


> Running Hive queries on Yarn (MR2) gives warnings related to overriding final 
> parameters
> 
>
> Key: HIVE-4403
> URL: https://issues.apache.org/jira/browse/HIVE-4403
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.10.0, 0.11.0
>Reporter: Mark Grover
>Assignee: Chu Tong
> Fix For: 0.12.0
>
> Attachments: HIVE-4403.patch, HIVE-4403.patch
>
>
> While working on BIGTOP-885, I saw that Hive was giving a bunch of warnings 
> related to overriding final parameters in job.conf. This was on a pseudo 
> distributed cluster. FWIW, I didn't see this happen on a fully-distributed 
> cluster. Perhaps, Hive's job.conf is overriding some final parameters it 
> shouldn't.
> Here is what the warnings looked like:
> {code}
> 2013-04-19 14:20:32,304 WARN  [main] conf.Configuration (Configuration.java:loadProperty(2032)) - file:/tmp/root/hive_2013-04-19_14-20-30_159_5701876916688815815/-local-10002/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
> 2013-04-19 14:20:32,367 WARN  [main] conf.Configuration (Configuration.java:loadProperty(2032)) - file:/tmp/root/hive_2013-04-19_14-20-30_159_5701876916688815815/-local-10002/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
> {code}
> To reproduce, run a query like:
> {code}
> CREATE TABLE u_data (
>   userid INT,
>   movieid INT,
>   rating INT,
>   unixtime STRING)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '\t'
> STORED AS TEXTFILE;
> {code}
> Load some data into u_data, here is some sample data:
> https://github.com/apache/bigtop/blob/master/bigtop-tests/test-artifacts/hive/src/main/resources/seed_data_files/ml-data/u.data
> Run a simple query on that data (on YARN/MR2)
> {code}
> INSERT OVERWRITE DIRECTORY '/tmp/count'
> SELECT COUNT(1) FROM u_data
> {code}



[jira] [Commented] (HIVE-3846) alter view rename NPEs with authorization on.

2013-06-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674230#comment-13674230
 ] 

Hudson commented on HIVE-3846:
--

Integrated in Hive-trunk-hadoop2 #224 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/224/])
HIVE-3846 : alter view rename NPEs with authorization on. (Teddy Choi via 
Ashutosh Chauhan) (Revision 1489009)

 Result = FAILURE
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489009
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/HiveOperation.java
* /hive/trunk/ql/src/test/queries/clientpositive/authorization_8.q
* /hive/trunk/ql/src/test/results/clientnegative/recursive_view.q.out
* /hive/trunk/ql/src/test/results/clientpositive/alter_view_rename.q.out
* /hive/trunk/ql/src/test/results/clientpositive/authorization_8.q.out


> alter view rename NPEs with authorization on.
> -
>
> Key: HIVE-3846
> URL: https://issues.apache.org/jira/browse/HIVE-3846
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 0.10.0, 0.11.0
>Reporter: Ashutosh Chauhan
>Assignee: Teddy Choi
> Fix For: 0.12.0
>
> Attachments: HIVE-3846.1.patch.txt, HIVE-3846.2.patch.txt
>
>




[jira] [Created] (HIVE-4654) Remove unused org.apache.hadoop.hive.ql.exec Writables

2013-06-04 Thread Remus Rusanu (JIRA)
Remus Rusanu created HIVE-4654:
--

 Summary: Remove unused org.apache.hadoop.hive.ql.exec Writables
 Key: HIVE-4654
 URL: https://issues.apache.org/jira/browse/HIVE-4654
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Reporter: Remus Rusanu
Priority: Minor


The Writables are originally from org.apache.hadoop.io. I tend to assume that 
they have been re-defined in hive if the original implementation was not 
considered good enough.
However, I don't understand why some are defined twice in hive itself. I 
noticed that ByteWritable in o.a.h.hive.ql.exec is not being used anywhere. The 
ByteWritable in serde2.io is being referred to in a bunch of places. Therefore, I 
would suggest just using the one in serde2.io.




[jira] [Created] (HIVE-4653) Favor serde2.io Writable classes over hadoop.io ones

2013-06-04 Thread Remus Rusanu (JIRA)
Remus Rusanu created HIVE-4653:
--

 Summary: Favor serde2.io Writable classes over hadoop.io ones
 Key: HIVE-4653
 URL: https://issues.apache.org/jira/browse/HIVE-4653
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Remus Rusanu
Priority: Minor


"The Writables are originally from org.apache.hadoop.io. I tend to assume that 
they have been re-defined in hive if the original implementation was not 
considered good enough.
However, I don't understand why some are defined twice in hive itself. I 
noticed that ByteWritable in o.a.h.hive.ql.exec is not being used anywhere. The 
ByteWritable in serde2.io is being referred to in a bunch of places. Therefore, I 
would suggest just using the one in serde2.io."



[jira] [Updated] (HIVE-4653) Favor serde2.io Writable classes over hadoop.io ones

2013-06-04 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated HIVE-4653:
---

Assignee: Remus Rusanu

> Favor serde2.io Writable classes over hadoop.io ones
> 
>
> Key: HIVE-4653
> URL: https://issues.apache.org/jira/browse/HIVE-4653
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Affects Versions: vectorization-branch
>Reporter: Remus Rusanu
>Assignee: Remus Rusanu
>Priority: Minor
>
> "The Writables are originally from org.apache.hadoop.io. I tend to assume 
> that they have been re-defined in hive if the original implementation was not 
> considered good enough.
> However, I don't understand why some are defined twice in hive itself. I 
> noticed that ByteWritable in o.a.h.hive.ql.exec is not being used anywhere. 
> The ByteWritable in serde2.io is being referred to in a bunch of places. 
> Therefore, I would suggest just using the one in serde2.io."



[jira] [Updated] (HIVE-4637) Fix VectorUDAFSum.txt to honor the expected vector column type

2013-06-04 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated HIVE-4637:
---

Attachment: HIVE-4637.2.patch.txt

> Fix VectorUDAFSum.txt to honor the expected vector column type
> --
>
> Key: HIVE-4637
> URL: https://issues.apache.org/jira/browse/HIVE-4637
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Affects Versions: vectorization-branch
>Reporter: Remus Rusanu
>Assignee: Remus Rusanu
>Priority: Minor
> Fix For: vectorization-branch
>
> Attachments: HIVE-4637.0.patch.txt, HIVE-4637.1.patch.txt, 
> HIVE-4637.2.patch.txt
>
>
> "I think it's a bug in code generation for VectorUDAFSumDouble.
> The template VectorUDAFSum.txt assumes LongColumnVector for input rather 
> than having it replaced by code generation."



Re: Review Request: HIVE-4637 Fix VectorUDAFSum.txt to honor the expected vector column type

2013-06-04 Thread Remus Rusanu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11579/
---

(Updated June 4, 2013, 9:49 a.m.)


Review request for hive.


Changes
---

Fixed AVG and added UT to cover each template with Double type.


Description
---

See HIVE-4637


This addresses bug HIVE-4637.
https://issues.apache.org/jira/browse/HIVE-4637


Diffs (updated)
-

  
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFAvgDouble.java 38b14f1 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFAvgLong.java 115444d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFMaxDouble.java bc7f852 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFMaxLong.java 6ba416e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFMinDouble.java d982fc2 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFMinLong.java a8f5531 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFSumDouble.java a5dac79 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFSumLong.java 4d1db3d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/CodeGen.java 888f9ca 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/VectorUDAFAvg.txt 7887ceb 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/VectorUDAFMinMax.txt d00d9ae 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/VectorUDAFSum.txt 6ad 
  ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorGroupByOperator.java 42cdcf4 

Diff: https://reviews.apache.org/r/11579/diff/


Testing
---

Added new UT to cover this case


Thanks,

Remus Rusanu



[jira] [Updated] (HIVE-4637) Fix VectorUDAFSum.txt to honor the expected vector column type

2013-06-04 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated HIVE-4637:
---

Attachment: HIVE-4637.1.patch.txt

> Fix VectorUDAFSum.txt to honor the expected vector column type
> --
>
> Key: HIVE-4637
> URL: https://issues.apache.org/jira/browse/HIVE-4637
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Affects Versions: vectorization-branch
>Reporter: Remus Rusanu
>Assignee: Remus Rusanu
>Priority: Minor
> Fix For: vectorization-branch
>
> Attachments: HIVE-4637.0.patch.txt, HIVE-4637.1.patch.txt
>
>
> "I think it's a bug in code generation for VectorUDAFSumDouble.
> The template VectorUDAFSum.txt assumes LongColumnVector for input rather 
> than having it replaced by code generation."



[jira] [Commented] (HIVE-4651) TestVectorGroupByOperator causes asserts in StandardStructObjectInspector.init

2013-06-04 Thread Remus Rusanu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674185#comment-13674185
 ] 

Remus Rusanu commented on HIVE-4651:


https://reviews.apache.org/r/11624/

> TestVectorGroupByOperator causes asserts in StandardStructObjectInspector.init
> --
>
> Key: HIVE-4651
> URL: https://issues.apache.org/jira/browse/HIVE-4651
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Affects Versions: vectorization-branch
>Reporter: Remus Rusanu
>Assignee: Remus Rusanu
>Priority: Minor
> Fix For: vectorization-branch
>
> Attachments: hive-4651.0.patch.txt
>
>
> The number of output columns passed to StandardStructObjectInspector.init 
> must be correct. VGByOp tests that have a GROUP BY key do not set this 
> properly. The assert manifests only when JUnit starts the VM with -ea.



[jira] [Commented] (HIVE-4652) VectorHashKeyWrapperBatch.java should be in vector package (instead of exec)

2013-06-04 Thread Remus Rusanu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674184#comment-13674184
 ] 

Remus Rusanu commented on HIVE-4652:


https://reviews.apache.org/r/11625/

>  VectorHashKeyWrapperBatch.java should be in vector package (instead of exec)
> -
>
> Key: HIVE-4652
> URL: https://issues.apache.org/jira/browse/HIVE-4652
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Affects Versions: vectorization-branch
>Reporter: Remus Rusanu
>Assignee: Remus Rusanu
>Priority: Minor
> Attachments: HIVE-4652.0.patch.txt
>
>
> As the title says



Review Request: HIVE-4652 VectorHashKeyWrapperBatch.java should be in vector package (instead of exec)

2013-06-04 Thread Remus Rusanu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11625/
---

Review request for hive.


Description
---

Changed package, had to mark parent class KeyWrapper members as public


This addresses bug HIVE-4652.
https://issues.apache.org/jira/browse/HIVE-4652


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/KeyWrapper.java c303b30 
  ql/src/java/org/apache/hadoop/hive/ql/exec/VectorHashKeyWrapper.java 7437f7d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/VectorHashKeyWrapperBatch.java 59bede4 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorGroupByOperator.java 07eccea 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapper.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapperBatch.java PRE-CREATION 

Diff: https://reviews.apache.org/r/11625/diff/


Testing
---


Thanks,

Remus Rusanu



Review Request: HIVE-4651 TestVectorGroupByOperator causes asserts in StandardStructObjectInspector.init

2013-06-04 Thread Remus Rusanu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11624/
---

Review request for hive.


Description
---

The GroupByDesc was not properly prepared by unit tests. The assert is hit only if 
JUnit has -ea in VM args.


This addresses bug HIVE-4651.
https://issues.apache.org/jira/browse/HIVE-4651


Diffs
-

  
  ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorGroupByOperator.java 42cdcf4 

Diff: https://reviews.apache.org/r/11624/diff/


Testing
---


Thanks,

Remus Rusanu



[jira] [Updated] (HIVE-4652) VectorHashKeyWrapperBatch.java should be in vector package (instead of exec)

2013-06-04 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated HIVE-4652:
---

Attachment: HIVE-4652.0.patch.txt

>  VectorHashKeyWrapperBatch.java should be in vector package (instead of exec)
> -
>
> Key: HIVE-4652
> URL: https://issues.apache.org/jira/browse/HIVE-4652
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Affects Versions: vectorization-branch
>Reporter: Remus Rusanu
>Assignee: Remus Rusanu
>Priority: Minor
> Attachments: HIVE-4652.0.patch.txt
>
>
> As the title says



[jira] [Updated] (HIVE-4652) VectorHashKeyWrapperBatch.java should be in vector package (instead of exec)

2013-06-04 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated HIVE-4652:
---

Status: Patch Available  (was: Open)

Had to mark methods in parent KeyWrapper class as public because of package 
change of derived class. Methinks KeyWrapper should be an interface not a class.

>  VectorHashKeyWrapperBatch.java should be in vector package (instead of exec)
> -
>
> Key: HIVE-4652
> URL: https://issues.apache.org/jira/browse/HIVE-4652
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Affects Versions: vectorization-branch
>Reporter: Remus Rusanu
>Assignee: Remus Rusanu
>Priority: Minor
> Attachments: HIVE-4652.0.patch.txt
>
>
> As the title says



[jira] [Commented] (HIVE-4612) Vectorized aggregates do not emit proper rows in presence of GROUP BY

2013-06-04 Thread Remus Rusanu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674141#comment-13674141
 ] 

Remus Rusanu commented on HIVE-4612:


Added HIVE-4652 for "VectorHashKeyWrapperBatch.java should be in vector package 
(instead of exec)". Thanks!

> Vectorized aggregates do not emit proper rows in presence of GROUP BY
> -
>
> Key: HIVE-4612
> URL: https://issues.apache.org/jira/browse/HIVE-4612
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Affects Versions: vectorization-branch
>Reporter: Remus Rusanu
>Assignee: Remus Rusanu
> Fix For: vectorization-branch
>
> Attachments: HIVE-4612.0.patch.txt, HIVE-4612.1.patch.txt
>
>
> I discovered this while testing the fix for HIVE-4451 and HIVE-4452. The VGBy 
> is emitting the appropriate number of rows, but the row-mode ReduceSinkOperator 
> only logs one row and the final result is incomplete. Investigating. Related 
> to HIVE-4599.



[jira] [Updated] (HIVE-2615) CTAS with literal NULL creates VOID type

2013-06-04 Thread Zhuoluo (Clark) Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhuoluo (Clark) Yang updated HIVE-2615:
---

Fix Version/s: 0.12.0
   Status: Patch Available  (was: In Progress)

Would any committer review this issue?

> CTAS with literal NULL creates VOID type
> 
>
> Key: HIVE-2615
> URL: https://issues.apache.org/jira/browse/HIVE-2615
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.6.0
>Reporter: David Phillips
>Assignee: Zhuoluo (Clark) Yang
> Fix For: 0.12.0
>
> Attachments: HIVE-2615.1.patch
>
>
> Create the table with a column that always contains NULL:
> {quote}
> hive> create table bad as select 1 x, null z from dual; 
> {quote}
> Because there's no type, Hive gives it the VOID type:
> {quote}
> hive> describe bad;
> OK
> x int 
> z void
> {quote}
> This seems weird, because AFAIK, there is no normal way to create a column of 
> type VOID.  The problem is that the table can't be queried:
> {quote}
> hive> select * from bad;
> OK
> Failed with exception java.io.IOException:java.lang.RuntimeException: 
> Internal error: no LazyObject for VOID
> {quote}
> Worse, even if you don't select that field, the query fails at runtime:
> {quote}
> hive> select x from bad;
> ...
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.MapRedTask
> {quote}



[jira] [Created] (HIVE-4652) VectorHashKeyWrapperBatch.java should be in vector package (instead of exec)

2013-06-04 Thread Remus Rusanu (JIRA)
Remus Rusanu created HIVE-4652:
--

 Summary:  VectorHashKeyWrapperBatch.java should be in vector 
package (instead of exec)
 Key: HIVE-4652
 URL: https://issues.apache.org/jira/browse/HIVE-4652
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Priority: Minor


As the title says



[jira] [Commented] (HIVE-2615) CTAS with literal NULL creates VOID type

2013-06-04 Thread Zhuoluo (Clark) Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674136#comment-13674136
 ] 

Zhuoluo (Clark) Yang commented on HIVE-2615:


https://reviews.apache.org/r/11622/

> CTAS with literal NULL creates VOID type
> 
>
> Key: HIVE-2615
> URL: https://issues.apache.org/jira/browse/HIVE-2615
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.6.0
>Reporter: David Phillips
>Assignee: Zhuoluo (Clark) Yang
> Attachments: HIVE-2615.1.patch
>
>
> Create the table with a column that always contains NULL:
> {quote}
> hive> create table bad as select 1 x, null z from dual; 
> {quote}
> Because there's no type, Hive gives it the VOID type:
> {quote}
> hive> describe bad;
> OK
> x int 
> z void
> {quote}
> This seems weird, because AFAIK, there is no normal way to create a column of 
> type VOID.  The problem is that the table can't be queried:
> {quote}
> hive> select * from bad;
> OK
> Failed with exception java.io.IOException:java.lang.RuntimeException: 
> Internal error: no LazyObject for VOID
> {quote}
> Worse, even if you don't select that field, the query fails at runtime:
> {quote}
> hive> select x from bad;
> ...
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.MapRedTask
> {quote}



Review Request: CTAS with literal NULL creates VOID type

2013-06-04 Thread Zhuoluo Yang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11622/
---

Review request for hive.


Description
---

The check runs after the result schema is generated.
If the statement is a CTAS and the result schema contains a VOID column, it raises an exception and asks the user to cast the type.
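
The idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the attached patch: the class `CtasVoidCheck`, its methods, and the representation of the result schema as a name-to-type map are all hypothetical and do not correspond to anything in SemanticAnalyzer.java or ErrorMsg.java.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: the result schema is modelled as column name -> type name.
final class CtasVoidCheck {

    // Collects the names of all columns whose inferred type is "void".
    static List<String> findVoidColumns(Map<String, String> schema) {
        List<String> voidColumns = new ArrayList<>();
        for (Map.Entry<String, String> column : schema.entrySet()) {
            if ("void".equalsIgnoreCase(column.getValue())) {
                voidColumns.add(column.getKey());
            }
        }
        return voidColumns;
    }

    // Rejects a CTAS result schema that would create a VOID column.
    static void validateCtasSchema(Map<String, String> schema) {
        List<String> voidColumns = findVoidColumns(schema);
        if (!voidColumns.isEmpty()) {
            throw new IllegalArgumentException(
                "CTAS would create VOID column(s) " + voidColumns
                + "; cast NULL to an explicit type, e.g. CAST(NULL AS STRING)");
        }
    }

    public static void main(String[] args) {
        // Mirrors "create table bad as select 1 x, null z from dual".
        Map<String, String> schema = new LinkedHashMap<>();
        schema.put("x", "int");
        schema.put("z", "void");
        try {
            validateCtasSchema(schema);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing at analysis time, before the table exists, avoids ending up with a table that cannot be queried afterwards.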


This addresses bug HIVE-2615.
https://issues.apache.org/jira/browse/HIVE-2615


Diffs
-

  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java
 1489292 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
 1489292 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/ctas_creates_void.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/ctas_creates_void.q.out
 PRE-CREATION 

Diff: https://reviews.apache.org/r/11622/diff/


Testing
---

ant test -Dtestcase=TestNegativeCliDriver -Dqfile=ctas_creates_void.q


Thanks,

Zhuoluo Yang



[jira] [Updated] (HIVE-4651) TestVectorGroupByOperator causes asserts in StandardStructObjectInspector.init

2013-06-04 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated HIVE-4651:
---

Attachment: hive-4651.0.patch.txt

> TestVectorGroupByOperator causes asserts in StandardStructObjectInspector.init
> --
>
> Key: HIVE-4651
> URL: https://issues.apache.org/jira/browse/HIVE-4651
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Affects Versions: vectorization-branch
>Reporter: Remus Rusanu
>Assignee: Remus Rusanu
>Priority: Minor
> Fix For: vectorization-branch
>
> Attachments: hive-4651.0.patch.txt
>
>
> The number of output columns passed to StandardStructObjectInspector.init 
> must be correct. VGByOp tests that have a GROUP BY key do not set this 
> properly. The assert manifests only when JUnit starts the VM with -ea.



[jira] [Updated] (HIVE-4651) TestVectorGroupByOperator causes asserts in StandardStructObjectInspector.init

2013-06-04 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated HIVE-4651:
---

Fix Version/s: vectorization-branch
   Status: Patch Available  (was: Open)

Added dummy _col1 when required

> TestVectorGroupByOperator causes asserts in StandardStructObjectInspector.init
> --
>
> Key: HIVE-4651
> URL: https://issues.apache.org/jira/browse/HIVE-4651
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Affects Versions: vectorization-branch
>Reporter: Remus Rusanu
>Assignee: Remus Rusanu
>Priority: Minor
> Fix For: vectorization-branch
>
> Attachments: hive-4651.0.patch.txt
>
>
> The number of output columns passed to StandardStructObjectInspector.init 
> must be correct. VGByOp tests that have a GROUP BY key do not set this 
> properly. The assert manifests only when JUnit starts the VM with -ea.



[jira] [Updated] (HIVE-2615) CTAS with literal NULL creates VOID type

2013-06-04 Thread Zhuoluo (Clark) Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhuoluo (Clark) Yang updated HIVE-2615:
---

Attachment: HIVE-2615.1.patch

Attached a patch.
The check runs after the result schema is generated.
If the statement is a CTAS and the result schema contains a VOID column, it raises an exception and asks the user to cast the type.

> CTAS with literal NULL creates VOID type
> 
>
> Key: HIVE-2615
> URL: https://issues.apache.org/jira/browse/HIVE-2615
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.6.0
>Reporter: David Phillips
>Assignee: Zhuoluo (Clark) Yang
> Attachments: HIVE-2615.1.patch
>
>
> Create the table with a column that always contains NULL:
> {quote}
> hive> create table bad as select 1 x, null z from dual; 
> {quote}
> Because there's no type, Hive gives it the VOID type:
> {quote}
> hive> describe bad;
> OK
> x int 
> z void
> {quote}
> This seems weird, because AFAIK, there is no normal way to create a column of 
> type VOID.  The problem is that the table can't be queried:
> {quote}
> hive> select * from bad;
> OK
> Failed with exception java.io.IOException:java.lang.RuntimeException: 
> Internal error: no LazyObject for VOID
> {quote}
> Worse, even if you don't select that field, the query fails at runtime:
> {quote}
> hive> select x from bad;
> ...
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.MapRedTask
> {quote}



[jira] [Created] (HIVE-4651) TestVectorGroupByOperator causes asserts in StandardStructObjectInspector.init

2013-06-04 Thread Remus Rusanu (JIRA)
Remus Rusanu created HIVE-4651:
--

 Summary: TestVectorGroupByOperator causes asserts in 
StandardStructObjectInspector.init
 Key: HIVE-4651
 URL: https://issues.apache.org/jira/browse/HIVE-4651
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Priority: Minor


The number of output columns passed to StandardStructObjectInspector.init must 
be correct. VGByOp tests that have a GROUP BY key do not set this properly. 
The assert manifests only when JUnit starts the VM with -ea.
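
A minimal sketch of why such a mismatch goes unnoticed without -ea. It assumes the check is a Java `assert` comparing the number of field names with the number of field inspectors; `StructInitSketch` and its methods are hypothetical stand-ins, not Hive's StandardStructObjectInspector.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an init method guarded by an assert.
final class StructInitSketch {

    // Returns the number of fields; the size check only runs when assertions are enabled.
    static int init(List<String> fieldNames, List<String> fieldTypes) {
        assert fieldNames.size() == fieldTypes.size()
            : "got " + fieldNames.size() + " names but " + fieldTypes.size() + " types";
        return fieldNames.size();
    }

    // Reports whether the JVM was started with assertions enabled (-ea).
    static boolean assertionsEnabled() {
        boolean enabled = false;
        assert enabled = true; // the assignment is evaluated only under -ea
        return enabled;
    }

    public static void main(String[] args) {
        System.out.println("assertions enabled: " + assertionsEnabled());
        // Two output column names but only one type: this throws AssertionError
        // under -ea and passes silently otherwise.
        init(Arrays.asList("_col0", "_col1"), Arrays.asList("bigint"));
        System.out.println("mismatch was not detected");
    }
}
```

Running the class once with `java -ea StructInitSketch` and once without the flag shows the difference.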


