[jira] [Commented] (HIVE-5901) Query cancel should stop running MR tasks
[ https://issues.apache.org/jira/browse/HIVE-5901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861369#comment-13861369 ] Hive QA commented on HIVE-5901: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12621020/HIVE-5901.4.patch.txt {color:green}SUCCESS:{color} +1 4873 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/788/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/788/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12621020 Query cancel should stop running MR tasks - Key: HIVE-5901 URL: https://issues.apache.org/jira/browse/HIVE-5901 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-5901.1.patch.txt, HIVE-5901.2.patch.txt, HIVE-5901.3.patch.txt, HIVE-5901.4.patch.txt Currently, query canceling does not stop the running MR jobs immediately. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
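The direction HIVE-5901 describes can be sketched as follows. This is a hypothetical illustration, not the actual patch: the driver keeps handles to the MR jobs it launched, and a cancel request kills each still-running job instead of only setting a flag that is checked between jobs. The `JobHandle` interface below is a stand-in for a real handle such as `org.apache.hadoop.mapred.RunningJob`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of query cancellation killing running MR jobs.
// JobHandle stands in for org.apache.hadoop.mapred.RunningJob; the
// class name and methods are illustrative, not Hive's actual code.
public class QueryCancelSketch {
    interface JobHandle {
        boolean isComplete();
        void kill();
    }

    private final List<JobHandle> running = new ArrayList<>();

    // called by each MR task as it submits its job
    public void register(JobHandle job) {
        running.add(job);
    }

    // called on query cancel; returns how many jobs were killed
    public int cancel() {
        int killed = 0;
        for (JobHandle job : running) {
            if (!job.isComplete()) {
                job.kill();
                killed++;
            }
        }
        return killed;
    }
}
```

The key design point is that cancellation becomes an active signal sent to the cluster, rather than a passive flag that only takes effect once the current job finishes on its own.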
[jira] [Commented] (HIVE-5032) Enable hive creating external table at the root directory of DFS
[ https://issues.apache.org/jira/browse/HIVE-5032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861385#comment-13861385 ] Lefty Leverenz commented on HIVE-5032: -- Is this going to need any documentation? (I'm guessing not.) Enable hive creating external table at the root directory of DFS Key: HIVE-5032 URL: https://issues.apache.org/jira/browse/HIVE-5032 Project: Hive Issue Type: Bug Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Attachments: HIVE-5032.1.patch, HIVE-5032.2.patch, HIVE-5032.3.patch Creating an external table in Hive with a location pointing to the root directory of DFS will fail because the function HiveFileFormatUtils#doGetPartitionDescFromPath treats the authority of the path the same as a folder and cannot find a match in the pathToPartitionInfo table when doing the prefix match. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
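The failure mode in the issue description can be illustrated with a small sketch. The class and method names here are hypothetical, not Hive's actual `HiveFileFormatUtils` code: if a prefix match folds the URI authority into the directory walk, a table located at the DFS root never matches, whereas comparing the authority separately and then prefix-matching only the path component handles the root location correctly.

```java
import java.net.URI;

// Illustrative sketch of the prefix-match pitfall (hypothetical names,
// not Hive's actual implementation). Matching a file against a table
// location must compare the authority ("host:port") separately and
// prefix-match only the path, or a location at the DFS root ("/")
// can never be matched against pathToPartitionInfo entries.
public class PrefixMatch {
    public static boolean matchesLocation(String filePath, String tableLocation) {
        URI file = URI.create(filePath);
        URI loc = URI.create(tableLocation);
        // authority compared on its own, never treated as a folder
        return String.valueOf(file.getAuthority()).equals(String.valueOf(loc.getAuthority()))
            && file.getPath().startsWith(loc.getPath());
    }
}
```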
[jira] [Commented] (HIVE-5941) SQL std auth - support 'show all roles'
[ https://issues.apache.org/jira/browse/HIVE-5941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861412#comment-13861412 ] Hive QA commented on HIVE-5941: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12621029/HIVE-5941.2.patch.txt {color:green}SUCCESS:{color} +1 4874 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/789/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/789/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12621029 SQL std auth - support 'show all roles' --- Key: HIVE-5941 URL: https://issues.apache.org/jira/browse/HIVE-5941 Project: Hive Issue Type: Sub-task Components: Authorization Reporter: Thejas M Nair Assignee: Navis Attachments: HIVE-5941.1.patch.txt, HIVE-5941.2.patch.txt Original Estimate: 24h Remaining Estimate: 24h SHOW ALL ROLES - This will list all currently existing roles. This will be available only to the superuser. This task includes parser changes. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-5945) ql.plan.ConditionalResolverCommonJoin.resolveMapJoinTask also sums those tables which are not used in the child of this conditional task.
[ https://issues.apache.org/jira/browse/HIVE-5945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861465#comment-13861465 ] Hive QA commented on HIVE-5945: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12621023/HIVE-5945.6.patch.txt {color:green}SUCCESS:{color} +1 4873 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/790/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/790/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12621023 ql.plan.ConditionalResolverCommonJoin.resolveMapJoinTask also sums those tables which are not used in the child of this conditional task. 
- Key: HIVE-5945 URL: https://issues.apache.org/jira/browse/HIVE-5945 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.8.0, 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0 Reporter: Yin Huai Assignee: Navis Priority: Critical Attachments: HIVE-5945.1.patch.txt, HIVE-5945.2.patch.txt, HIVE-5945.3.patch.txt, HIVE-5945.4.patch.txt, HIVE-5945.5.patch.txt, HIVE-5945.6.patch.txt Here is an example: {code} select i_item_id, s_state, avg(ss_quantity) agg1, avg(ss_list_price) agg2, avg(ss_coupon_amt) agg3, avg(ss_sales_price) agg4 FROM store_sales JOIN date_dim on (store_sales.ss_sold_date_sk = date_dim.d_date_sk) JOIN item on (store_sales.ss_item_sk = item.i_item_sk) JOIN customer_demographics on (store_sales.ss_cdemo_sk = customer_demographics.cd_demo_sk) JOIN store on (store_sales.ss_store_sk = store.s_store_sk) where cd_gender = 'F' and cd_marital_status = 'U' and cd_education_status = 'Primary' and d_year = 2002 and s_state in ('GA','PA', 'LA', 'SC', 'MI', 'AL') group by i_item_id, s_state order by i_item_id, s_state limit 100; {code} I turned off noconditionaltask. So, I expected that there would be 4 Map-only jobs for this query. However, I got 1 Map-only job (joining store_sales and date_dim) and 3 MR jobs (for reduce joins). So, I checked the conditional task determining the plan of the join involving item. In ql.plan.ConditionalResolverCommonJoin.resolveMapJoinTask, aliasToFileSizeMap contains all input tables used in this query and the intermediate table generated by joining store_sales and date_dim. So, when we sum the size of all small tables, the size of store_sales (which is around 45GB in my test) will also be counted. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
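The fix direction implied by the report can be sketched as follows. The names below are illustrative, not Hive's actual `ql.plan.ConditionalResolverCommonJoin` API: before summing candidate small-table sizes, restrict the alias-to-size map to the aliases that actually participate in this conditional task's child join, so an unrelated large input (store_sales in the report) cannot inflate the total and push the plan away from a map join.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch (not Hive's actual resolveMapJoinTask code):
// the size check for choosing a map join should only sum aliases
// that participate in this conditional task's child join.
public class MapJoinSizeCheck {
    public static long sumParticipatingAliases(Map<String, Long> aliasToFileSize,
                                               Set<String> participants) {
        long total = 0;
        for (Map.Entry<String, Long> e : aliasToFileSize.entrySet()) {
            if (participants.contains(e.getKey())) { // skip unrelated tables
                total += e.getValue();
            }
        }
        return total;
    }
}
```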
[jira] [Commented] (HIVE-5757) Implement vectorized support for CASE
[ https://issues.apache.org/jira/browse/HIVE-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861522#comment-13861522 ] Hive QA commented on HIVE-5757: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12621119/HIVE-5757.4.patch {color:green}SUCCESS:{color} +1 4874 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/791/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/791/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12621119 Implement vectorized support for CASE - Key: HIVE-5757 URL: https://issues.apache.org/jira/browse/HIVE-5757 Project: Hive Issue Type: Sub-task Affects Versions: 0.13.0 Reporter: Eric Hanson Assignee: Eric Hanson Attachments: HIVE-5757.1.patch, HIVE-5757.2.patch, HIVE-5757.3.patch, HIVE-5757.4.patch Implement support for CASE in vectorized mode. The approach is to use the vectorized UDF adaptor internally. A higher-performance version that used VectorExpression subclasses was considered but not done due to complexity. Such a version potentially could be done in the future if it's important enough. This is high priority because CASE is a fairly popular expression. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6051) Create DecimalColumnVector and a representative VectorExpression for decimal
[ https://issues.apache.org/jira/browse/HIVE-6051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861584#comment-13861584 ] Hive QA commented on HIVE-6051: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12619194/HIVE-6051.01.patch Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/794/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/794/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n '' ]] + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-Build-794/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ -d apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . 
Reverted 'shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java' Reverted 'hbase-handler/src/test/results/negative/cascade_dbdrop.q.out' Reverted 'hbase-handler/src/test/results/positive/hbase_bulk.m.out' Reverted 'hbase-handler/src/test/queries/positive/hbase_bulk.m' ++ awk '{print $2}' ++ egrep -v '^X|^Performing status on external' ++ svn status --no-ignore + rm -rf target datanucleus.log ant/target shims/target shims/0.20/target shims/0.20S/target shims/0.23/target shims/aggregator/target shims/common/target shims/common-secure/target packaging/target hbase-handler/target testutils/target jdbc/target metastore/target itests/target itests/hcatalog-unit/target itests/test-serde/target itests/qtest/target itests/hive-unit/target itests/custom-serde/target itests/util/target hcatalog/target hcatalog/storage-handlers/hbase/target hcatalog/server-extensions/target hcatalog/core/target hcatalog/webhcat/svr/target hcatalog/webhcat/java-client/target hcatalog/hcatalog-pig-adapter/target hwi/target common/target common/src/gen contrib/target service/target serde/target beeline/target odbc/target cli/target ql/dependency-reduced-pom.xml ql/target + svn update Fetching external item into 'hcatalog/src/test/e2e/harness' External at revision 1555121. At revision 1555121. + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. 
ATTACHMENT ID: 12619194 Create DecimalColumnVector and a representative VectorExpression for decimal Key: HIVE-6051 URL: https://issues.apache.org/jira/browse/HIVE-6051 Project: Hive Issue Type: Sub-task Affects Versions: 0.13.0 Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-6051.01.patch Create a DecimalColumnVector to use as a basis for vectorized decimal operations. Include a representative VectorExpression on decimal (e.g. column-column addition) to demonstrate its use. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-4216) TestHBaseMinimrCliDriver throws weird error with HBase 0.94.5 and Hadoop 23 and test is stuck infinitely
[ https://issues.apache.org/jira/browse/HIVE-4216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861583#comment-13861583 ] Hive QA commented on HIVE-4216: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12621157/HIVE-4216.2.patch {color:green}SUCCESS:{color} +1 4873 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/792/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/792/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12621157 TestHBaseMinimrCliDriver throws weird error with HBase 0.94.5 and Hadoop 23 and test is stuck infinitely Key: HIVE-4216 URL: https://issues.apache.org/jira/browse/HIVE-4216 Project: Hive Issue Type: Bug Components: StorageHandler Affects Versions: 0.9.0, 0.11.0, 0.12.0 Environment: Hadoop 23.X Reporter: Viraj Bhat Fix For: 0.13.0 Attachments: HIVE-4216.1.patch, HIVE-4216.2.patch After upgrading to Hadoop 23 and HBase 0.94.5 compiled for Hadoop 23, TestHBaseMinimrCliDriver fails after performing the following steps. Update hbase_bulk.m with the following properties: set mapreduce.totalorderpartitioner.naturalorder=false; set mapreduce.totalorderpartitioner.path=/tmp/hbpartition.lst; Otherwise I keep seeing a _partition.lst not found exception in the mappers, even though set total.order.partitioner.path=/tmp/hbpartition.lst is set. When the test runs, the 3-reducer phase of the second query fails with the following error, but the MiniMRCluster keeps spinning up new reducers and the test is stuck infinitely. 
{code} insert overwrite table hbsort select distinct value, case when key=103 then cast(null as string) else key end, case when key=103 then '' else cast(key+1 as string) end from src cluster by value; {code} The stack trace I see in the syslog for the Node Manager is the following: == 13-03-20 16:26:48,942 FATAL [IPC Server handler 17 on 55996] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1363821864968_0003_r_02_0 - exited : java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {key:{reducesinkkey0:val_200},value:{_col0:val_200,_col1:200,_col2:201.0},alias:0} at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:268) at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:448) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:399) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {key:{reducesinkkey0:val_200},value:{_col0:val_200,_col1:200,_col2:201.0},alias:0} at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:256) ... 
7 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:237) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:477) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:525) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762) at org.apache.hadoop.hive.ql.exec.ExtractOperator.processOp(ExtractOperator.java:45) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471) at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:247) ... 7 more Caused by: java.lang.NullPointerException at org.apache.hadoop.mapreduce.TaskID$CharTaskTypeMaps.getRepresentingCharacter(TaskID.java:265) at org.apache.hadoop.mapreduce.TaskID.appendTo(TaskID.java:153) at
[jira] [Commented] (HIVE-3553) Support binary qualifiers for Hive/HBase integration
[ https://issues.apache.org/jira/browse/HIVE-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861605#comment-13861605 ] Brock Noland commented on HIVE-3553: Ahh ok, that makes sense. So the change to Bytes.toBytesBinary() will break some users and we'll need to create our own utility method to do the conversion. Support binary qualifiers for Hive/HBase integration Key: HIVE-3553 URL: https://issues.apache.org/jira/browse/HIVE-3553 Project: Hive Issue Type: Improvement Components: HBase Handler Affects Versions: 0.9.0 Reporter: Swarnim Kulkarni Assignee: Swarnim Kulkarni Attachments: HIVE-3553.1.patch.txt Along with regular qualifiers, we should support binary HBase qualifiers as well. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
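A project-owned conversion utility of the kind the comment proposes could look like the hedged sketch below. The behavior is modeled on HBase's `\xNN` escape convention for binary qualifiers; the class name and exact semantics are assumptions for illustration, not Hive's actual implementation.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch of a Hive-owned replacement for
// Bytes.toBytesBinary(): decodes "\xNN" hex escapes into raw bytes
// and copies everything else through as-is. Illustrative only.
public class BinaryQualifier {
    public static byte[] toBytesBinary(String in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < in.length(); i++) {
            char ch = in.charAt(i);
            boolean escape = ch == '\\'
                && i + 3 < in.length()
                && (in.charAt(i + 1) == 'x' || in.charAt(i + 1) == 'X');
            if (escape) {
                // consume "\xNN" and emit the decoded byte
                out.write(Integer.parseInt(in.substring(i + 2, i + 4), 16));
                i += 3;
            } else {
                out.write(ch); // plain character, low 8 bits
            }
        }
        return out.toByteArray();
    }
}
```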
[jira] [Commented] (HIVE-3553) Support binary qualifiers for Hive/HBase integration
[ https://issues.apache.org/jira/browse/HIVE-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861612#comment-13861612 ] Swarnim Kulkarni commented on HIVE-3553: Sounds good to me as well. I'll make the change and try to have an updated patch soon. Support binary qualifiers for Hive/HBase integration Key: HIVE-3553 URL: https://issues.apache.org/jira/browse/HIVE-3553 Project: Hive Issue Type: Improvement Components: HBase Handler Affects Versions: 0.9.0 Reporter: Swarnim Kulkarni Assignee: Swarnim Kulkarni Attachments: HIVE-3553.1.patch.txt Along with regular qualifiers, we should support binary HBase qualifiers as well. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-3553) Support binary qualifiers for Hive/HBase integration
[ https://issues.apache.org/jira/browse/HIVE-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861633#comment-13861633 ] Brock Noland commented on HIVE-3553: Sounds good! And thank you for the test :) Support binary qualifiers for Hive/HBase integration Key: HIVE-3553 URL: https://issues.apache.org/jira/browse/HIVE-3553 Project: Hive Issue Type: Improvement Components: HBase Handler Affects Versions: 0.9.0 Reporter: Swarnim Kulkarni Assignee: Swarnim Kulkarni Attachments: HIVE-3553.1.patch.txt Along with regular qualifiers, we should support binary HBase qualifiers as well. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Hive-trunk-hadoop2 - Build # 645 - Still Failing
Changes for Build #640 Changes for Build #641 [navis] HIVE-5414 : The result of show grant is not visible via JDBC (Navis reviewed by Thejas M Nair) [navis] HIVE-4257 : java.sql.SQLNonTransientConnectionException on JDBCStatsAggregator (Teddy Choi via Navis, reviewed by Ashutosh) Changes for Build #642 Changes for Build #643 [ehans] HIVE-6017: Contribute Decimal128 high-performance decimal(p, s) package from Microsoft to Hive (Hideaki Kumura via Eric Hanson) Changes for Build #644 [cws] HIVE-5911: Recent change to schema upgrade scripts breaks file naming conventions (Sergey Shelukhin via cws) [cws] HIVE-3746: Fix HS2 ResultSet Serialization Performance Regression II (Navis via cws) [cws] HIVE-3746: Fix HS2 ResultSet Serialization Performance Regression (Navis via cws) [jitendra] HIVE-6010: TestCompareCliDriver enables tests that would ensure vectorization produces same results as non-vectorized execution (Sergey Shelukhin via Jitendra Pandey) Changes for Build #645 No tests ran. The Apache Jenkins build system has built Hive-trunk-hadoop2 (build #645) Status: Still Failing Check console output at https://builds.apache.org/job/Hive-trunk-hadoop2/645/ to view the results.
[jira] [Commented] (HIVE-5923) SQL std auth - parser changes
[ https://issues.apache.org/jira/browse/HIVE-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861655#comment-13861655 ] Brock Noland commented on HIVE-5923: +1 SQL std auth - parser changes - Key: HIVE-5923 URL: https://issues.apache.org/jira/browse/HIVE-5923 Project: Hive Issue Type: Sub-task Components: Authorization Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-5923.1.patch, HIVE-5923.2.patch, HIVE-5923.3.patch, HIVE-5923.4.patch Original Estimate: 96h Time Spent: 72h Remaining Estimate: 12h There are new access control statements proposed in the functional spec in HIVE-5837 . It also proposes some small changes to the existing query syntax (mostly extensions and some optional keywords). The syntax supported should depend on the current authorization mode. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-5446) Hive can CREATE an external table but not SELECT from it when file path have spaces
[ https://issues.apache.org/jira/browse/HIVE-5446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861663#comment-13861663 ] Hive QA commented on HIVE-5446: --- {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12621210/HIVE-5446.2.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 4873 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_single_sourced_multi_insert org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_reduce_deduplicate {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/795/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/795/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12621210 Hive can CREATE an external table but not SELECT from it when file path have spaces --- Key: HIVE-5446 URL: https://issues.apache.org/jira/browse/HIVE-5446 Project: Hive Issue Type: Bug Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Attachments: HIVE-5446.1.patch, HIVE-5446.2.patch Create external table table1 (age int, gender string, totBil float, dirBill float, alkphos int, sgpt int, sgot int, totProt float, aLB float, aG float, sel int) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE LOCATION 'hdfs://namenodehost:9000/hive newtable'; select * from table1; returns nothing even though there are files in the target folder -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Hive-trunk-h0.21 - Build # 2545 - Still Failing
Changes for Build #2539 Changes for Build #2540 [navis] HIVE-5414 : The result of show grant is not visible via JDBC (Navis reviewed by Thejas M Nair) Changes for Build #2541 Changes for Build #2542 [ehans] HIVE-6017: Contribute Decimal128 high-performance decimal(p, s) package from Microsoft to Hive (Hideaki Kumura via Eric Hanson) Changes for Build #2543 [cws] HIVE-3746: Fix HS2 ResultSet Serialization Performance Regression II (Navis via cws) [cws] HIVE-3746: Fix HS2 ResultSet Serialization Performance Regression (Navis via cws) [jitendra] HIVE-6010: TestCompareCliDriver enables tests that would ensure vectorization produces same results as non-vectorized execution (Sergey Shelukhin via Jitendra Pandey) Changes for Build #2544 [cws] HIVE-5911: Recent change to schema upgrade scripts breaks file naming conventions (Sergey Shelukhin via cws) Changes for Build #2545 No tests ran. The Apache Jenkins build system has built Hive-trunk-h0.21 (build #2545) Status: Still Failing Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/2545/ to view the results.
[jira] [Commented] (HIVE-6017) Contribute Decimal128 high-performance decimal(p, s) package from Microsoft to Hive
[ https://issues.apache.org/jira/browse/HIVE-6017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861701#comment-13861701 ] Eric Hanson commented on HIVE-6017: --- I don't think this needs end-user documentation. This is an internal performance enhancement. The user-visible type system won't change. Contribute Decimal128 high-performance decimal(p, s) package from Microsoft to Hive --- Key: HIVE-6017 URL: https://issues.apache.org/jira/browse/HIVE-6017 Project: Hive Issue Type: Sub-task Affects Versions: 0.13.0 Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-6017.01.patch, HIVE-6017.02.patch, HIVE-6017.03.patch, HIVE-6017.04.patch Contribute the Decimal128 high-performance decimal package developed by Microsoft to Hive. This was originally written for Microsoft PolyBase by Hideaki Kimura. This code is about 8X more efficient than Java BigDecimal for typical operations. It uses a finite (128 bit) precision and can handle up to decimal(38, X). It is also mutable so you can change the contents of an existing object. This helps reduce the cost of new() and garbage collection. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-5446) Hive can CREATE an external table but not SELECT from it when file path have spaces
[ https://issues.apache.org/jira/browse/HIVE-5446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861700#comment-13861700 ] Xuefu Zhang commented on HIVE-5446: --- With a series of path-standardization changes in Hive, such as HIVE-6048 and HIVE-6121, URI decoding might become unnecessary. At a minimum, it's worth reproducing the problem with the latest trunk and fixing it accordingly if it remains. Hive can CREATE an external table but not SELECT from it when file path have spaces --- Key: HIVE-5446 URL: https://issues.apache.org/jira/browse/HIVE-5446 Project: Hive Issue Type: Bug Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Attachments: HIVE-5446.1.patch, HIVE-5446.2.patch Create external table table1 (age int, gender string, totBil float, dirBill float, alkphos int, sgpt int, sgot int, totProt float, aLB float, aG float, sel int) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE LOCATION 'hdfs://namenodehost:9000/hive newtable'; select * from table1; returns nothing even though there are files in the target folder -- This message was sent by Atlassian JIRA (v6.1.5#6160)
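One plausible failure mode, consistent with the URI-decoding discussion above (an assumption for illustration, not a confirmed trace of Hive's code): a location containing a space round-trips through URI encoding as `%20`, and code that compares the encoded string against the decoded filesystem path fails to match, so the SELECT silently returns no rows. The tiny helper below shows the decoding step that reconciles the two forms.

```java
import java.net.URI;

// Illustrative helper (not Hive's actual code): java.net.URI decodes
// percent-escapes, so the encoded location and the on-disk path with a
// literal space compare equal once both are reduced to the decoded path.
public class PathSpaces {
    public static String decodedPath(String location) {
        return URI.create(location).getPath(); // "%20" becomes " "
    }
}
```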
[jira] [Commented] (HIVE-2599) Support Composit/Compound Keys with HBaseStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861705#comment-13861705 ] Brock Noland commented on HIVE-2599: Hi Swarnim, This looks pretty good! Am I correct that the patch takes care of both selects and inserts? Hi Nick, Do you have a simple example? Brock Support Composit/Compound Keys with HBaseStorageHandler --- Key: HIVE-2599 URL: https://issues.apache.org/jira/browse/HIVE-2599 Project: Hive Issue Type: Improvement Components: HBase Handler Affects Versions: 0.8.0 Reporter: Hans Uhlig Assignee: Swarnim Kulkarni Attachments: HIVE-2599.1.patch.txt, HIVE-2599.2.patch.txt, HIVE-2599.2.patch.txt It would be really nice for hive to be able to understand composite keys from an underlying HBase schema. Currently we have to store key fields twice to be able to both key the rows and make the data available. I noticed John Sichi mentioned in HIVE-1228 that this would be a separate issue but I can't find any follow-up. How feasible is this in the HBaseStorageHandler? -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HIVE-6134) Merging small files based on file size only works for CTAS queries
Eric Chu created HIVE-6134: -- Summary: Merging small files based on file size only works for CTAS queries Key: HIVE-6134 URL: https://issues.apache.org/jira/browse/HIVE-6134 Project: Hive Issue Type: Bug Affects Versions: 0.12.0, 0.11.0, 0.10.0, 0.8.0 Reporter: Eric Chu According to the documentation, if we set hive.merge.mapfiles to true, Hive will launch an additional MR job to merge the small output files at the end of a map-only job when the average output file size is smaller than hive.merge.smallfiles.avgsize. Similarly, by setting hive.merge.mapredfiles to true, Hive will merge the output files of a map-reduce job. My expectation is that this is true for all MR queries. However, my observation is that this is only true for CTAS queries. In GenMRFileSink1.java, HIVEMERGEMAPFILES and HIVEMERGEMAPREDFILES are only used if ((ctx.getMvTask() != null) && (!ctx.getMvTask().isEmpty())). So, for a regular SELECT query that doesn't have move tasks, these properties are not used. Is my understanding correct and if so, what's the reasoning behind not supporting this for regular SELECT queries? It seems to me that this should be supported for regular SELECT queries as well. One scenario where this hits us hard is when users try to download the result in HUE, and HUE times out because there are thousands of output files. The workaround is to re-run the query as CTAS, but it's a significant time sink. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
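The documented merge behavior reduces to a simple predicate, sketched below with illustrative names (this is not the GenMRFileSink1 code): launch the extra merge job when the average output file size is below hive.merge.smallfiles.avgsize. The report's complaint is that this predicate is only ever evaluated when the query has move tasks.

```java
// Hedged sketch of the merge decision (names are illustrative, not
// Hive's actual implementation): merge when the average output file
// size falls below the hive.merge.smallfiles.avgsize threshold.
public class MergeDecision {
    public static boolean shouldMerge(long totalOutputBytes, int numFiles, long avgSizeThreshold) {
        if (numFiles == 0) {
            return false; // nothing to merge
        }
        return (totalOutputBytes / numFiles) < avgSizeThreshold;
    }
}
```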
[jira] [Commented] (HIVE-2599) Support Composit/Compound Keys with HBaseStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861727#comment-13861727 ] Swarnim Kulkarni commented on HIVE-2599: {quote} Am I correct that the patch takes care of both selects and inserts? {quote} Unfortunately, no. This one allows querying of custom composite keys but currently doesn't support writing them back to HBase. Do you want me to include that support as part of this patch itself or open up a separate issue for that? Support Composit/Compound Keys with HBaseStorageHandler --- Key: HIVE-2599 URL: https://issues.apache.org/jira/browse/HIVE-2599 Project: Hive Issue Type: Improvement Components: HBase Handler Affects Versions: 0.8.0 Reporter: Hans Uhlig Assignee: Swarnim Kulkarni Attachments: HIVE-2599.1.patch.txt, HIVE-2599.2.patch.txt, HIVE-2599.2.patch.txt It would be really nice for hive to be able to understand composite keys from an underlying HBase schema. Currently we have to store key fields twice to be able to both key the rows and make the data available. I noticed John Sichi mentioned in HIVE-1228 that this would be a separate issue but I can't find any follow-up. How feasible is this in the HBaseStorageHandler? -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-2599) Support Composit/Compound Keys with HBaseStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861736#comment-13861736 ] Brock Noland commented on HIVE-2599: What happens if an insert is tried? We can address that in a follow-on JIRA as long as the results of an insert aren't data corruption or a terrible error message. Support Composit/Compound Keys with HBaseStorageHandler --- Key: HIVE-2599 URL: https://issues.apache.org/jira/browse/HIVE-2599 Project: Hive Issue Type: Improvement Components: HBase Handler Affects Versions: 0.8.0 Reporter: Hans Uhlig Assignee: Swarnim Kulkarni Attachments: HIVE-2599.1.patch.txt, HIVE-2599.2.patch.txt, HIVE-2599.2.patch.txt It would be really nice for hive to be able to understand composite keys from an underlying HBase schema. Currently we have to store key fields twice to be able to both key the rows and make the data available. I noticed John Sichi mentioned in HIVE-1228 that this would be a separate issue but I can't find any follow-up. How feasible is this in the HBaseStorageHandler? -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-5446) Hive can CREATE an external table but not SELECT from it when file path have spaces
[ https://issues.apache.org/jira/browse/HIVE-5446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861744#comment-13861744 ] Shuaishuai Nie commented on HIVE-5446: -- Hi [~xuefuz], thanks for the advice. I ran the unit test against the latest trunk and the problem still exists, so I think we still need the fix. I have also verified that the failed tests in Hive QA are not related to the patch. Hive can CREATE an external table but not SELECT from it when file path have spaces --- Key: HIVE-5446 URL: https://issues.apache.org/jira/browse/HIVE-5446 Project: Hive Issue Type: Bug Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Attachments: HIVE-5446.1.patch, HIVE-5446.2.patch Create external table table1 (age int, gender string, totBil float, dirBill float, alkphos int, sgpt int, sgot int, totProt float, aLB float, aG float, sel int) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE LOCATION 'hdfs://namenodehost:9000/hive newtable'; select * from table1; returns nothing even though there are files in the target folder -- This message was sent by Atlassian JIRA (v6.1.5#6160)
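The failure mode reported above comes down to URI handling: a literal space is not legal inside a URI, so a location like the one in the CREATE TABLE has to be percent-encoded somewhere along the way. A minimal illustration using plain java.net.URI (this is not Hive's actual path-handling code, and the class name is made up for the sketch):

```java
import java.net.URI;

public class SpacePathDemo {
    public static void main(String[] args) {
        // A raw space is not legal inside a URI, so a location such as
        // 'hdfs://namenodehost:9000/hive newtable' cannot pass through
        // URI-based code unescaped.
        boolean rawParses;
        try {
            URI.create("hdfs://namenodehost:9000/hive newtable");
            rawParses = true;
        } catch (IllegalArgumentException e) {
            rawParses = false;
        }
        // Percent-encoding the space yields a valid URI whose decoded
        // path still contains the literal space.
        URI escaped = URI.create("hdfs://namenodehost:9000/hive%20newtable");
        System.out.println(rawParses + " -> " + escaped.getPath());
        // prints: false -> /hive newtable
    }
}
```

Any code path that round-trips the location through unescaped URI strings will therefore drop or reject files under such a directory, which matches the "CREATE works but SELECT returns nothing" symptom.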
[jira] [Commented] (HIVE-5032) Enable hive creating external table at the root directory of DFS
[ https://issues.apache.org/jira/browse/HIVE-5032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861745#comment-13861745 ] Shuaishuai Nie commented on HIVE-5032: -- Yes [~leftylev], I don't think documentation is necessary Enable hive creating external table at the root directory of DFS Key: HIVE-5032 URL: https://issues.apache.org/jira/browse/HIVE-5032 Project: Hive Issue Type: Bug Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Attachments: HIVE-5032.1.patch, HIVE-5032.2.patch, HIVE-5032.3.patch Creating an external table in Hive with a location pointing to the root directory of DFS will fail because HiveFileFormatUtils#doGetPartitionDescFromPath treats the authority of the path as if it were a folder and cannot find a match in the pathToPartitionInfo table when doing a prefix match. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
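The authority-vs-folder confusion described above can be pictured with plain java.net.URI: splitting the raw location string on "/" makes the authority look like a directory component, whereas the URI path component of a root location is simply "/", which is a valid prefix of every absolute file path. A simplified sketch, not Hive's actual matching code:

```java
import java.net.URI;
import java.util.Arrays;

public class RootLocationDemo {
    public static void main(String[] args) {
        String location = "hdfs://namenodehost:9000/";
        // Splitting the raw location string on "/" makes the authority
        // ("namenodehost:9000") look like just another folder component...
        String[] rawParts = location.split("/");
        // ...whereas the URI path component of a root location is simply
        // "/", a valid prefix of every absolute file path under it.
        String rootPath = URI.create(location).getPath();
        boolean matches = "/user/warehouse/part-00000".startsWith(rootPath);
        System.out.println(Arrays.toString(rawParts)
            + " | path=" + rootPath + " | " + matches);
        // prints: [hdfs:, , namenodehost:9000] | path=/ | true
    }
}
```

Matching on the path component alone, rather than the full URI string, is the kind of change that makes a root-directory location work like any other prefix.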
[jira] [Commented] (HIVE-5224) When creating table with AVRO serde, the avro.schema.url should be about to load serde schema from file system beside HDFS
[ https://issues.apache.org/jira/browse/HIVE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861746#comment-13861746 ] Hive QA commented on HIVE-5224: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12621212/HIVE-5224.4.patch {color:green}SUCCESS:{color} +1 4873 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/796/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/796/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12621212 When creating table with AVRO serde, the avro.schema.url should be about to load serde schema from file system beside HDFS Key: HIVE-5224 URL: https://issues.apache.org/jira/browse/HIVE-5224 Project: Hive Issue Type: Bug Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Attachments: HIVE-5224.1.patch, HIVE-5224.2.patch, HIVE-5224.4.patch, Hive-5224.3.patch Currently, when loading the schema for a table with the Avro serde, the file system is hard-coded to HDFS in AvroSerdeUtils.java. This should enable loading the schema from file systems besides HDFS. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
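The fix boils down to letting the schema URL name its own file system instead of assuming HDFS. A hypothetical sketch of the idea using only java.net.URI (the helper, class name, and .avsc paths are made up; in Hadoop terms this corresponds to resolving a Path via Path#getFileSystem(conf) rather than a hard-coded FileSystem.get(conf)):

```java
import java.net.URI;

public class SchemaUrlDemo {
    // Hypothetical helper: read the scheme off the schema URL itself
    // instead of assuming HDFS, so any configured file system can serve
    // the Avro schema.
    static String schemeOf(String schemaUrl) {
        String scheme = URI.create(schemaUrl).getScheme();
        return scheme != null ? scheme : "(default fs)";
    }

    public static void main(String[] args) {
        System.out.println(schemeOf("hdfs://namenodehost:9000/schemas/events.avsc"));
        System.out.println(schemeOf("file:///tmp/events.avsc"));
        // prints: hdfs
        //         file
    }
}
```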
[jira] [Updated] (HIVE-5757) Implement vectorized support for CASE
[ https://issues.apache.org/jira/browse/HIVE-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-5757: -- Resolution: Implemented Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Committed to trunk. Implement vectorized support for CASE - Key: HIVE-5757 URL: https://issues.apache.org/jira/browse/HIVE-5757 Project: Hive Issue Type: Sub-task Affects Versions: 0.13.0 Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-5757.1.patch, HIVE-5757.2.patch, HIVE-5757.3.patch, HIVE-5757.4.patch Implement support for CASE in vectorized mode. The approach is to use the vectorized UDF adaptor internally. A higher-performance version that used VectorExpression subclasses was considered but not done due to complexity. Such a version potentially could be done in the future if it's important enough. This is high priority because CASE is a fairly popular expression. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6125) Tez: Refactoring changes
[ https://issues.apache.org/jira/browse/HIVE-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861751#comment-13861751 ] Gunther Hagleitner commented on HIVE-6125: -- failure seems unrelated. passes locally. Tez: Refactoring changes Key: HIVE-6125 URL: https://issues.apache.org/jira/browse/HIVE-6125 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-6125.1.patch In order to facilitate merge back I've separated out all the changes that don't require Tez. These changes introduce new interfaces, move code etc. In preparation of the Tez specific classes. This should help show what changes have been made that affect the MR codepath as well. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6125) Tez: Refactoring changes
[ https://issues.apache.org/jira/browse/HIVE-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-6125: - Attachment: HIVE-6125.2.patch No change in .2 other than rebasing it. Tez: Refactoring changes Key: HIVE-6125 URL: https://issues.apache.org/jira/browse/HIVE-6125 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-6125.1.patch, HIVE-6125.2.patch In order to facilitate merge back I've separated out all the changes that don't require Tez. These changes introduce new interfaces, move code etc. In preparation of the Tez specific classes. This should help show what changes have been made that affect the MR codepath as well. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6125) Tez: Refactoring changes
[ https://issues.apache.org/jira/browse/HIVE-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861754#comment-13861754 ] Gunther Hagleitner commented on HIVE-6125: -- This one is without golden files (easier to navigate): https://reviews.apache.org/r/16611/ Tez: Refactoring changes Key: HIVE-6125 URL: https://issues.apache.org/jira/browse/HIVE-6125 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-6125.1.patch, HIVE-6125.2.patch In order to facilitate merge back I've separated out all the changes that don't require Tez. These changes introduce new interfaces, move code etc. In preparation of the Tez specific classes. This should help show what changes have been made that affect the MR codepath as well. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Hive-trunk-hadoop2 - Build # 646 - Still Failing
Changes for Build #640 Changes for Build #641 [navis] HIVE-5414 : The result of show grant is not visible via JDBC (Navis reviewed by Thejas M Nair) [navis] HIVE-4257 : java.sql.SQLNonTransientConnectionException on JDBCStatsAggregator (Teddy Choi via Navis, reviewed by Ashutosh) Changes for Build #642 Changes for Build #643 [ehans] HIVE-6017: Contribute Decimal128 high-performance decimal(p, s) package from Microsoft to Hive (Hideaki Kumura via Eric Hanson) Changes for Build #644 [cws] HIVE-5911: Recent change to schema upgrade scripts breaks file naming conventions (Sergey Shelukhin via cws) [cws] HIVE-3746: Fix HS2 ResultSet Serialization Performance Regression II (Navis via cws) [cws] HIVE-3746: Fix HS2 ResultSet Serialization Performance Regression (Navis via cws) [jitendra] HIVE-6010: TestCompareCliDriver enables tests that would ensure vectorization produces same results as non-vectorized execution (Sergey Shelukhin via Jitendra Pandey) Changes for Build #645 Changes for Build #646 [ehans] HIVE-5757: Implement vectorized support for CASE (Eric Hanson) No tests ran. The Apache Jenkins build system has built Hive-trunk-hadoop2 (build #646) Status: Still Failing Check console output at https://builds.apache.org/job/Hive-trunk-hadoop2/646/ to view the results.
[jira] [Updated] (HIVE-6051) Create DecimalColumnVector and a representative VectorExpression for decimal
[ https://issues.apache.org/jira/browse/HIVE-6051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6051: -- Attachment: HIVE-6051.02.patch Create DecimalColumnVector and a representative VectorExpression for decimal Key: HIVE-6051 URL: https://issues.apache.org/jira/browse/HIVE-6051 Project: Hive Issue Type: Sub-task Affects Versions: 0.13.0 Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-6051.01.patch, HIVE-6051.02.patch Create a DecimalColumnVector to use as a basis for vectorized decimal operations. Include a representative VectorExpression on decimal (e.g. column-column addition) to demonstrate its use. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6051) Create DecimalColumnVector and a representative VectorExpression for decimal
[ https://issues.apache.org/jira/browse/HIVE-6051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861768#comment-13861768 ] Eric Hanson commented on HIVE-6051: --- Fixed minor formatting issue to allow patch to apply to trunk. Create DecimalColumnVector and a representative VectorExpression for decimal Key: HIVE-6051 URL: https://issues.apache.org/jira/browse/HIVE-6051 Project: Hive Issue Type: Sub-task Affects Versions: 0.13.0 Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-6051.01.patch, HIVE-6051.02.patch Create a DecimalColumnVector to use as a basis for vectorized decimal operations. Include a representative VectorExpression on decimal (e.g. column-column addition) to demonstrate its use. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Hive-trunk-h0.21 - Build # 2546 - Still Failing
Changes for Build #2539 Changes for Build #2540 [navis] HIVE-5414 : The result of show grant is not visible via JDBC (Navis reviewed by Thejas M Nair) Changes for Build #2541 Changes for Build #2542 [ehans] HIVE-6017: Contribute Decimal128 high-performance decimal(p, s) package from Microsoft to Hive (Hideaki Kumura via Eric Hanson) Changes for Build #2543 [cws] HIVE-3746: Fix HS2 ResultSet Serialization Performance Regression II (Navis via cws) [cws] HIVE-3746: Fix HS2 ResultSet Serialization Performance Regression (Navis via cws) [jitendra] HIVE-6010: TestCompareCliDriver enables tests that would ensure vectorization produces same results as non-vectorized execution (Sergey Shelukhin via Jitendra Pandey) Changes for Build #2544 [cws] HIVE-5911: Recent change to schema upgrade scripts breaks file naming conventions (Sergey Shelukhin via cws) Changes for Build #2545 Changes for Build #2546 [ehans] HIVE-5757: Implement vectorized support for CASE (Eric Hanson) No tests ran. The Apache Jenkins build system has built Hive-trunk-h0.21 (build #2546) Status: Still Failing Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/2546/ to view the results.
[jira] [Commented] (HIVE-5946) DDL authorization task factory should be better tested
[ https://issues.apache.org/jira/browse/HIVE-5946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861793#comment-13861793 ] Brock Noland commented on HIVE-5946: Sounds good, I will rebase this patch after committing the other change. DDL authorization task factory should be better tested -- Key: HIVE-5946 URL: https://issues.apache.org/jira/browse/HIVE-5946 Project: Hive Issue Type: Improvement Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5946.patch Thejas is working on various authorization issues and one element that might be useful in that effort and increase test coverage and testability would be perform authorization task creation in a factory. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-2599) Support Composit/Compound Keys with HBaseStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861802#comment-13861802 ] Nick Dimiduk commented on HIVE-2599: bq. Do you have a simple example? The best documentation available is still in the [unit tests|https://github.com/apache/hbase/blob/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestStruct.java]. I will do a proper writeup of using this feature, it's just not a priority for me as of late. Support Composit/Compound Keys with HBaseStorageHandler --- Key: HIVE-2599 URL: https://issues.apache.org/jira/browse/HIVE-2599 Project: Hive Issue Type: Improvement Components: HBase Handler Affects Versions: 0.8.0 Reporter: Hans Uhlig Assignee: Swarnim Kulkarni Attachments: HIVE-2599.1.patch.txt, HIVE-2599.2.patch.txt, HIVE-2599.2.patch.txt It would be really nice for hive to be able to understand composite keys from an underlying HBase schema. Currently we have to store key fields twice to be able to both key and make data available. I noticed John Sichi mentioned in HIVE-1228 that this would be a separate issue but I cant find any follow up. How feasible is this in the HBaseStorageHandler? -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HIVE-6135) Fix merge error on tez branch (TestCompareCliDriver)
Gunther Hagleitner created HIVE-6135: Summary: Fix merge error on tez branch (TestCompareCliDriver) Key: HIVE-6135 URL: https://issues.apache.org/jira/browse/HIVE-6135 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: tez-branch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-2599) Support Composit/Compound Keys with HBaseStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861808#comment-13861808 ] Brock Noland commented on HIVE-2599: Cool, sounds good. I think we can address this in a follow on JIRA since Swarnim has a working patch here for a common use case. Support Composit/Compound Keys with HBaseStorageHandler --- Key: HIVE-2599 URL: https://issues.apache.org/jira/browse/HIVE-2599 Project: Hive Issue Type: Improvement Components: HBase Handler Affects Versions: 0.8.0 Reporter: Hans Uhlig Assignee: Swarnim Kulkarni Attachments: HIVE-2599.1.patch.txt, HIVE-2599.2.patch.txt, HIVE-2599.2.patch.txt It would be really nice for hive to be able to understand composite keys from an underlying HBase schema. Currently we have to store key fields twice to be able to both key and make data available. I noticed John Sichi mentioned in HIVE-1228 that this would be a separate issue but I cant find any follow up. How feasible is this in the HBaseStorageHandler? -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6135) Fix merge error on tez branch (TestCompareCliDriver)
[ https://issues.apache.org/jira/browse/HIVE-6135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-6135: - Attachment: HIVE-6135.1.patch Fix merge error on tez branch (TestCompareCliDriver) Key: HIVE-6135 URL: https://issues.apache.org/jira/browse/HIVE-6135 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: tez-branch Attachments: HIVE-6135.1.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Resolved] (HIVE-6135) Fix merge error on tez branch (TestCompareCliDriver)
[ https://issues.apache.org/jira/browse/HIVE-6135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner resolved HIVE-6135. -- Resolution: Fixed Committed to branch. Fix merge error on tez branch (TestCompareCliDriver) Key: HIVE-6135 URL: https://issues.apache.org/jira/browse/HIVE-6135 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: tez-branch Attachments: HIVE-6135.1.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6134) Merging small files based on file size only works for CTAS queries
[ https://issues.apache.org/jira/browse/HIVE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861818#comment-13861818 ] Eric Chu commented on HIVE-6134: [~brocknoland] and [~xuefuz]: I was talking to Yin Huai about this issue and he suggested I ping you on this, especially on how it affects HUE UX as mentioned above. Merging small files based on file size only works for CTAS queries -- Key: HIVE-6134 URL: https://issues.apache.org/jira/browse/HIVE-6134 Project: Hive Issue Type: Bug Affects Versions: 0.8.0, 0.10.0, 0.11.0, 0.12.0 Reporter: Eric Chu According to the documentation, if we set hive.merge.mapfiles to true, Hive will launch an additional MR job to merge the small output files at the end of a map-only job when the average output file size is smaller than hive.merge.smallfiles.avgsize. Similarly, by setting hive.merge.mapredfiles to true, Hive will merge the output files of a map-reduce job. My expectation is that this is true for all MR queries. However, my observation is that this is only true for CTAS queries. In GenMRFileSink1.java, HIVEMERGEMAPFILES and HIVEMERGEMAPREDFILES are only used if ((ctx.getMvTask() != null) && (!ctx.getMvTask().isEmpty())). So, for a regular SELECT query that doesn't have move tasks, these properties are not used. Is my understanding correct and if so, what's the reasoning behind the logic of not supporting this for regular SELECT queries? It seems to me that this should be supported for regular SELECT queries as well. One scenario where this hits us hard is when users try to download the result in HUE, and HUE times out b/c there are thousands of output files. The workaround is to re-run the query as CTAS, but it's a significant time sink. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
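For reference, the three properties the description refers to, as they would be set in a Hive session (values are illustrative only; per the description, they currently take effect only when the query plan carries a move task, e.g. CTAS):

```sql
SET hive.merge.mapfiles=true;                -- merge small outputs of map-only jobs
SET hive.merge.mapredfiles=true;             -- merge small outputs of map-reduce jobs
SET hive.merge.smallfiles.avgsize=16000000;  -- avg output size (bytes) below which the merge job runs
```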
[jira] [Created] (HIVE-6136) Hive metastore configured with DB2 LUW doesn't work
Thomas Friedrich created HIVE-6136: -- Summary: Hive metastore configured with DB2 LUW doesn't work Key: HIVE-6136 URL: https://issues.apache.org/jira/browse/HIVE-6136 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.12.0 Reporter: Thomas Friedrich Hive 0.12 with datanucleus 3.2.1 generates invalid SQL syntax if the metastore is configured with DB2. To reproduce the issue, simply create a table and drop it using Hive CLI: create table test(i1 int); drop table test; Drop will fail and this is the stacktrace: com.ibm.db2.jcc.am.SqlSyntaxErrorException: DB2 SQL Error: SQLCODE=-206, SQLSTATE=42703, SQLERRMC=SUBQ.A0.CREATE_TIME, DRIVER=4.16.53 at com.ibm.db2.jcc.am.fd.a(fd.java:739) at com.ibm.db2.jcc.am.fd.a(fd.java:60) at com.ibm.db2.jcc.am.fd.a(fd.java:127) at com.ibm.db2.jcc.am.to.c(to.java:2771) at com.ibm.db2.jcc.am.to.d(to.java:2759) at com.ibm.db2.jcc.am.to.a(to.java:2192) at com.ibm.db2.jcc.am.uo.a(uo.java:7827) at com.ibm.db2.jcc.t4.ab.h(ab.java:141) at com.ibm.db2.jcc.t4.ab.b(ab.java:41) at com.ibm.db2.jcc.t4.o.a(o.java:32) at com.ibm.db2.jcc.t4.tb.i(tb.java:145) at com.ibm.db2.jcc.am.to.kb(to.java:2161) at com.ibm.db2.jcc.am.uo.wc(uo.java:3657) at com.ibm.db2.jcc.am.uo.b(uo.java:4454) at com.ibm.db2.jcc.am.uo.jc(uo.java:760) at com.ibm.db2.jcc.am.uo.executeQuery(uo.java:725) at com.jolbox.bonecp.PreparedStatementHandle.executeQuery(PreparedStatementHandle.java:172) at org.datanucleus.store.rdbms.ParamLoggingPreparedStatement.executeQuery(ParamLoggingPreparedStatement.java:381) at org.datanucleus.store.rdbms.SQLController.executeStatementQuery(SQLController.java:504) at org.datanucleus.store.rdbms.query.JDOQLQuery.performExecute(JDOQLQuery.java:637) at org.datanucleus.store.query.Query.executeQuery(Query.java:1786) at org.datanucleus.store.query.Query.executeWithArray(Query.java:1672) at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:266) at org.apache.hadoop.hive.metastore.ObjectStore.listMPartitions(ObjectStore.java:1698) at 
org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsInternal(ObjectStore.java:1428) at org.apache.hadoop.hive.metastore.ObjectStore.getPartitions(ObjectStore.java:1402) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37) at java.lang.reflect.Method.invoke(Method.java:611) at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:124) at com.sun.proxy.$Proxy7.getPartitions(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.dropPartitionsAndGetLocations(HiveMetaStore.java:1286) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_core(HiveMetaStore.java:1189) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_with_environment_context(HiveMetaStore.java:1328) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.j at java.lang.reflect.Method.invoke(Method.java:611) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler. 
at com.sun.proxy.$Proxy8.drop_table_with_environment_context(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreCl at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreCl at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.j at java.lang.reflect.Method.invoke(Method.java:611) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaSt at com.sun.proxy.$Proxy9.dropTable(Unknown Source) at org.apache.hadoop.hive.ql.metadata.Hive.dropTable(Hive.java:869) at org.apache.hadoop.hive.ql.metadata.Hive.dropTable(Hive.java:836) at org.apache.hadoop.hive.ql.exec.DDLTask.dropTable(DDLTask.java:3329) at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:277) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65) at
[jira] [Updated] (HIVE-6136) Hive metastore configured with DB2 LUW doesn't work
[ https://issues.apache.org/jira/browse/HIVE-6136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Friedrich updated HIVE-6136: --- Attachment: hive.log Hive log output for failed ALTER statement Hive metastore configured with DB2 LUW doesn't work --- Key: HIVE-6136 URL: https://issues.apache.org/jira/browse/HIVE-6136 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.12.0 Reporter: Thomas Friedrich Attachments: hive.log Hive 0.12 with datanucleus 3.2.1 generates invalid SQL syntax if the metastore is configured with DB2. To reproduce the issue, simply create a table and drop it using Hive CLI: create table test(i1 int); drop table test; Drop will fail and this is the stacktrace: com.ibm.db2.jcc.am.SqlSyntaxErrorException: DB2 SQL Error: SQLCODE=-206, SQLSTATE=42703, SQLERRMC=SUBQ.A0.CREATE_TIME, DRIVER=4.16.53 at com.ibm.db2.jcc.am.fd.a(fd.java:739) at com.ibm.db2.jcc.am.fd.a(fd.java:60) at com.ibm.db2.jcc.am.fd.a(fd.java:127) at com.ibm.db2.jcc.am.to.c(to.java:2771) at com.ibm.db2.jcc.am.to.d(to.java:2759) at com.ibm.db2.jcc.am.to.a(to.java:2192) at com.ibm.db2.jcc.am.uo.a(uo.java:7827) at com.ibm.db2.jcc.t4.ab.h(ab.java:141) at com.ibm.db2.jcc.t4.ab.b(ab.java:41) at com.ibm.db2.jcc.t4.o.a(o.java:32) at com.ibm.db2.jcc.t4.tb.i(tb.java:145) at com.ibm.db2.jcc.am.to.kb(to.java:2161) at com.ibm.db2.jcc.am.uo.wc(uo.java:3657) at com.ibm.db2.jcc.am.uo.b(uo.java:4454) at com.ibm.db2.jcc.am.uo.jc(uo.java:760) at com.ibm.db2.jcc.am.uo.executeQuery(uo.java:725) at com.jolbox.bonecp.PreparedStatementHandle.executeQuery(PreparedStatementHandle.java:172) at org.datanucleus.store.rdbms.ParamLoggingPreparedStatement.executeQuery(ParamLoggingPreparedStatement.java:381) at org.datanucleus.store.rdbms.SQLController.executeStatementQuery(SQLController.java:504) at org.datanucleus.store.rdbms.query.JDOQLQuery.performExecute(JDOQLQuery.java:637) at org.datanucleus.store.query.Query.executeQuery(Query.java:1786) at 
org.datanucleus.store.query.Query.executeWithArray(Query.java:1672) at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:266) at org.apache.hadoop.hive.metastore.ObjectStore.listMPartitions(ObjectStore.java:1698) at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsInternal(ObjectStore.java:1428) at org.apache.hadoop.hive.metastore.ObjectStore.getPartitions(ObjectStore.java:1402) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37) at java.lang.reflect.Method.invoke(Method.java:611) at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:124) at com.sun.proxy.$Proxy7.getPartitions(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.dropPartitionsAndGetLocations(HiveMetaStore.java:1286) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_core(HiveMetaStore.java:1189) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_with_environment_context(HiveMetaStore.java:1328) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.j at java.lang.reflect.Method.invoke(Method.java:611) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler. 
at com.sun.proxy.$Proxy8.drop_table_with_environment_context(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreCl at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreCl at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.j at java.lang.reflect.Method.invoke(Method.java:611) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaSt at com.sun.proxy.$Proxy9.dropTable(Unknown Source) at org.apache.hadoop.hive.ql.metadata.Hive.dropTable(Hive.java:869) at org.apache.hadoop.hive.ql.metadata.Hive.dropTable(Hive.java:836)
[jira] [Commented] (HIVE-6098) Merge Tez branch into trunk
[ https://issues.apache.org/jira/browse/HIVE-6098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861831#comment-13861831 ] Thejas M Nair commented on HIVE-6098: - FYI, Gunther has created HIVE-6125 which has part of the tez changes, that involve refactoring of the existing hive code. Merge Tez branch into trunk --- Key: HIVE-6098 URL: https://issues.apache.org/jira/browse/HIVE-6098 Project: Hive Issue Type: New Feature Affects Versions: 0.12.0 Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-6098.1.patch, HIVE-6098.2.patch, HIVE-6098.3.patch, HIVE-6098.4.patch, HIVE-6098.5.patch, HIVE-6098.6.patch, HIVE-6098.7.patch, hive-on-tez-conf.txt I think the Tez branch is at a point where we can consider merging it back into trunk after review. Tez itself has had its first release, most hive features are available on Tez and the test coverage is decent. There are a few known limitations, all of which can be handled in trunk as far as I can tell (i.e.: None of them are large disruptive changes that still require a branch.) Limitations: - Union all is not yet supported on Tez - SMB is not yet supported on Tez - Bucketed map-join is executed as broadcast join (bucketing is ignored) Since the user is free to toggle hive.optimize.tez, it's obviously possible to just run these on MR. I am hoping to follow the approach that was taken with vectorization and shoot for a merge instead of single commit. This would retain history of the branch. Also in vectorization we required at least three +1s before merge, I'm hoping to go with that as well. I will add a combined patch to this ticket for review purposes (not for commit). I'll also attach instructions to run on a cluster if anyone wants to try. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6136) Hive metastore configured with DB2 LUW doesn't work
[ https://issues.apache.org/jira/browse/HIVE-6136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-6136: --- Description: Hive 0.12 with DataNucleus 3.2.1 generates invalid SQL syntax if the metastore is configured with DB2. To reproduce the issue, simply create a table and drop it using the Hive CLI: create table test(i1 int); drop table test; The drop will fail, and this is the stack trace:
{noformat}
com.ibm.db2.jcc.am.SqlSyntaxErrorException: DB2 SQL Error: SQLCODE=-206, SQLSTATE=42703, SQLERRMC=SUBQ.A0.CREATE_TIME, DRIVER=4.16.53
at com.ibm.db2.jcc.am.fd.a(fd.java:739)
at com.ibm.db2.jcc.am.fd.a(fd.java:60)
at com.ibm.db2.jcc.am.fd.a(fd.java:127)
at com.ibm.db2.jcc.am.to.c(to.java:2771)
at com.ibm.db2.jcc.am.to.d(to.java:2759)
at com.ibm.db2.jcc.am.to.a(to.java:2192)
at com.ibm.db2.jcc.am.uo.a(uo.java:7827)
at com.ibm.db2.jcc.t4.ab.h(ab.java:141)
at com.ibm.db2.jcc.t4.ab.b(ab.java:41)
at com.ibm.db2.jcc.t4.o.a(o.java:32)
at com.ibm.db2.jcc.t4.tb.i(tb.java:145)
at com.ibm.db2.jcc.am.to.kb(to.java:2161)
at com.ibm.db2.jcc.am.uo.wc(uo.java:3657)
at com.ibm.db2.jcc.am.uo.b(uo.java:4454)
at com.ibm.db2.jcc.am.uo.jc(uo.java:760)
at com.ibm.db2.jcc.am.uo.executeQuery(uo.java:725)
at com.jolbox.bonecp.PreparedStatementHandle.executeQuery(PreparedStatementHandle.java:172)
at org.datanucleus.store.rdbms.ParamLoggingPreparedStatement.executeQuery(ParamLoggingPreparedStatement.java:381)
at org.datanucleus.store.rdbms.SQLController.executeStatementQuery(SQLController.java:504)
at org.datanucleus.store.rdbms.query.JDOQLQuery.performExecute(JDOQLQuery.java:637)
at org.datanucleus.store.query.Query.executeQuery(Query.java:1786)
at org.datanucleus.store.query.Query.executeWithArray(Query.java:1672)
at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:266)
at org.apache.hadoop.hive.metastore.ObjectStore.listMPartitions(ObjectStore.java:1698)
at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsInternal(ObjectStore.java:1428)
at org.apache.hadoop.hive.metastore.ObjectStore.getPartitions(ObjectStore.java:1402)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
at java.lang.reflect.Method.invoke(Method.java:611)
at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:124)
at com.sun.proxy.$Proxy7.getPartitions(Unknown Source)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.dropPartitionsAndGetLocations(HiveMetaStore.java:1286)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_core(HiveMetaStore.java:1189)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_with_environment_context(HiveMetaStore.java:1328)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.j
at java.lang.reflect.Method.invoke(Method.java:611)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.
at com.sun.proxy.$Proxy8.drop_table_with_environment_context(Unknown Source)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreCl
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreCl
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.j
at java.lang.reflect.Method.invoke(Method.java:611)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaSt
at com.sun.proxy.$Proxy9.dropTable(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.dropTable(Hive.java:869)
at org.apache.hadoop.hive.ql.metadata.Hive.dropTable(Hive.java:836)
at org.apache.hadoop.hive.ql.exec.DDLTask.dropTable(DDLTask.java:3329)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:277)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1437)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1215)
at
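The SQLCODE=-206 (SQLSTATE 42703) failure class is a reference to a column the subquery does not expose. As an illustration only, the same class of error can be reproduced with SQLite standing in for DB2; the table and column names below are illustrative, not the actual DataNucleus-generated SQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table parts (part_id int, create_time int)")

# Valid: the outer query references a column the subquery projects.
ok = conn.execute(
    "select subq.part_id from (select part_id from parts) subq").fetchall()

# Invalid, analogous to the SQL DB2 rejects with SQLCODE=-206
# (SUBQ.A0.CREATE_TIME): the outer query references subq.create_time,
# which the subquery does not project.
try:
    conn.execute(
        "select subq.create_time from (select part_id from parts) subq")
    failed = False
except sqlite3.OperationalError:
    failed = True
```

The fix on the Hive/DataNucleus side amounts to ensuring every column the outer query references is projected by the generated subquery.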
[jira] [Commented] (HIVE-6089) Add metrics to HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861841#comment-13861841 ] Thiruvel Thirumoolan commented on HIVE-6089: [~jaideepdhok] Thanks for the feedback. As this is the first metrics patch, I will add everything that's straightforward, and will add the others in a followup JIRA. Add metrics to HiveServer2 -- Key: HIVE-6089 URL: https://issues.apache.org/jira/browse/HIVE-6089 Project: Hive Issue Type: Improvement Components: Diagnosability, HiveServer2 Affects Versions: 0.12.0 Reporter: Thiruvel Thirumoolan Assignee: Thiruvel Thirumoolan Fix For: 0.13.0 Attachments: HIVE-6089_prototype.patch Would like to collect metrics about HiveServer2's usage, such as active connections, total requests, etc. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
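The kind of server-side counters discussed here (active connections, total requests) can be sketched as a minimal thread-safe registry. This is a hypothetical illustration, not the actual HiveServer2 metrics implementation, and all names are made up:

```python
import threading

class MetricsRegistry:
    """Minimal counter/gauge registry, in the spirit of the metrics
    proposed for HiveServer2 (illustrative only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counters = {}

    def incr(self, name, delta=1):
        with self._lock:
            self._counters[name] = self._counters.get(name, 0) + delta

    def decr(self, name, delta=1):
        self.incr(name, -delta)

    def get(self, name):
        with self._lock:
            return self._counters.get(name, 0)

registry = MetricsRegistry()
# A connection opens: bump both the active gauge and the total counter.
registry.incr("active_connections")
registry.incr("total_requests")
# The connection closes.
registry.decr("active_connections")
```

A real implementation would additionally expose these values over JMX or an HTTP endpoint so they can be scraped for diagnosability.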
[jira] [Commented] (HIVE-6134) Merging small files based on file size only works for CTAS queries
[ https://issues.apache.org/jira/browse/HIVE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861846#comment-13861846 ] Xuefu Zhang commented on HIVE-6134: --- It seems reasonable to me that these flags kick in only for CTAS, or other queries that result in a new table. In other words, the functionality of merging small files for a table should be applied to the table (upon request) rather than taking effect for any query that touches the table. I think what is missing is a new command/query, something like MERGE FILES FOR TABLE table_name. This might be further automated in a scheduled fashion in HiveServer2. Of course, the scope is much larger. Merging small files based on file size only works for CTAS queries -- Key: HIVE-6134 URL: https://issues.apache.org/jira/browse/HIVE-6134 Project: Hive Issue Type: Bug Affects Versions: 0.8.0, 0.10.0, 0.11.0, 0.12.0 Reporter: Eric Chu According to the documentation, if we set hive.merge.mapfiles to true, Hive will launch an additional MR job to merge the small output files at the end of a map-only job when the average output file size is smaller than hive.merge.smallfiles.avgsize. Similarly, by setting hive.merge.mapredfiles to true, Hive will merge the output files of a map-reduce job. My expectation is that this is true for all MR queries. However, my observation is that this is only true for CTAS queries. In GenMRFileSink1.java, HIVEMERGEMAPFILES and HIVEMERGEMAPREDFILES are only used if ((ctx.getMvTask() != null) && (!ctx.getMvTask().isEmpty())). So, for a regular SELECT query that doesn't have move tasks, these properties are not used. Is my understanding correct, and if so, what's the reasoning behind not supporting this for regular SELECT queries? It seems to me that this should be supported for regular SELECT queries as well. 
One scenario where this hits us hard is when users try to download the result in HUE, and HUE times out b/c there are thousands of output files. The workaround is to re-run the query as CTAS, but it's a significant time sink. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
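The trigger condition described in the report (merge when the average output file size falls below hive.merge.smallfiles.avgsize) can be sketched as follows. This is an illustrative model of the decision, not Hive's actual code path in GenMRFileSink1.java:

```python
def should_merge(file_sizes, avgsize_threshold=16 * 1024 * 1024,
                 merge_enabled=True):
    """Return True if Hive-style small-file merging would kick in:
    merging is enabled (hive.merge.mapfiles / hive.merge.mapredfiles)
    and the average output file size is below the configured threshold
    (the default here mirrors hive.merge.smallfiles.avgsize = 16MB).
    Simplified sketch, for illustration only."""
    if not merge_enabled or not file_sizes:
        return False
    avg = sum(file_sizes) / len(file_sizes)
    return avg < avgsize_threshold

# Thousands of tiny 4KB result files: average is far below 16MB,
# so a merge job would be appended.
print(should_merge([4096] * 1000))
# A couple of large files: no merge needed.
print(should_merge([512 * 1024 * 1024, 256 * 1024 * 1024]))
```

The complaint in this issue is that even when this condition holds, the merge job is only appended when a move task exists (i.e., the query materializes a table), so plain SELECT output never gets merged.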
[jira] [Updated] (HIVE-5795) Hive should be able to skip header and footer rows when reading data file for a table
[ https://issues.apache.org/jira/browse/HIVE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-5795: Resolution: Fixed Status: Resolved (was: Patch Available) Patch committed to trunk. Thanks for the contribution [~shuainie]! [~leftylev] [~shuainie] Should we create a followup jira for adding documentation to the wiki? Hive should be able to skip header and footer rows when reading data file for a table - Key: HIVE-5795 URL: https://issues.apache.org/jira/browse/HIVE-5795 Project: Hive Issue Type: Bug Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Attachments: HIVE-5795.1.patch, HIVE-5795.2.patch, HIVE-5795.3.patch, HIVE-5795.4.patch, HIVE-5795.5.patch Hive should be able to skip header and footer lines when reading a data file for a table. That way, users don't need to preprocess data generated by other applications with a header or footer, and can use the file directly for table operations. To implement this, the idea is to add new properties to the table description that define the number of header and footer lines, and to skip those lines when reading records from the record reader. A DDL example for creating a table with a header and footer: {code} create external table testtable (name string, message string) row format delimited fields terminated by '\t' lines terminated by '\n' location '/testtable' tblproperties ("skip.header.line.count"="1", "skip.footer.line.count"="2"); {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
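The record-reader behavior the two table properties describe can be sketched in a few lines. This is a simplified model (it reads the whole file into memory; Hive's actual reader has to handle splits and buffering), purely for illustration:

```python
import os
import tempfile

def read_data_lines(path, header_count=0, footer_count=0):
    """Yield the data lines of a file, skipping the first header_count
    and last footer_count lines, mirroring the semantics of
    skip.header.line.count and skip.footer.line.count."""
    with open(path) as f:
        lines = f.read().splitlines()
    end = len(lines) - footer_count if footer_count else len(lines)
    return lines[header_count:end]

# Demo: a tab-delimited file with 1 header line and 2 footer lines,
# matching the DDL example above.
with tempfile.NamedTemporaryFile("w", suffix=".tsv", delete=False) as f:
    f.write("name\tmessage\n")     # header
    f.write("alice\thi\n")         # data
    f.write("bob\tbye\n")          # data
    f.write("generated by X\n")    # footer
    f.write("row count: 2\n")      # footer
    path = f.name

rows = read_data_lines(path, header_count=1, footer_count=2)
os.unlink(path)
```

With header_count=1 and footer_count=2, only the two data rows survive, which is exactly what a SELECT over such a table should return.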
[jira] [Updated] (HIVE-5795) Hive should be able to skip header and footer rows when reading data file for a table
[ https://issues.apache.org/jira/browse/HIVE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-5795: Issue Type: New Feature (was: Bug) Hive should be able to skip header and footer rows when reading data file for a table - Key: HIVE-5795 URL: https://issues.apache.org/jira/browse/HIVE-5795 Project: Hive Issue Type: New Feature Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Fix For: 0.13.0 Attachments: HIVE-5795.1.patch, HIVE-5795.2.patch, HIVE-5795.3.patch, HIVE-5795.4.patch, HIVE-5795.5.patch Hive should be able to skip header and footer lines when reading a data file for a table. That way, users don't need to preprocess data generated by other applications with a header or footer, and can use the file directly for table operations. To implement this, the idea is to add new properties to the table description that define the number of header and footer lines, and to skip those lines when reading records from the record reader. A DDL example for creating a table with a header and footer: {code} create external table testtable (name string, message string) row format delimited fields terminated by '\t' lines terminated by '\n' location '/testtable' tblproperties ("skip.header.line.count"="1", "skip.footer.line.count"="2"); {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-5795) Hive should be able to skip header and footer rows when reading data file for a table
[ https://issues.apache.org/jira/browse/HIVE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-5795: Fix Version/s: 0.13.0 Hive should be able to skip header and footer rows when reading data file for a table - Key: HIVE-5795 URL: https://issues.apache.org/jira/browse/HIVE-5795 Project: Hive Issue Type: Bug Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Fix For: 0.13.0 Attachments: HIVE-5795.1.patch, HIVE-5795.2.patch, HIVE-5795.3.patch, HIVE-5795.4.patch, HIVE-5795.5.patch Hive should be able to skip header and footer lines when reading a data file for a table. That way, users don't need to preprocess data generated by other applications with a header or footer, and can use the file directly for table operations. To implement this, the idea is to add new properties to the table description that define the number of header and footer lines, and to skip those lines when reading records from the record reader. A DDL example for creating a table with a header and footer: {code} create external table testtable (name string, message string) row format delimited fields terminated by '\t' lines terminated by '\n' location '/testtable' tblproperties ("skip.header.line.count"="1", "skip.footer.line.count"="2"); {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-5032) Enable hive creating external table at the root directory of DFS
[ https://issues.apache.org/jira/browse/HIVE-5032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861858#comment-13861858 ] Hive QA commented on HIVE-5032: --- {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12621218/HIVE-5032.3.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 4873 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_map_operators {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/797/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/797/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12621218 Enable hive creating external table at the root directory of DFS Key: HIVE-5032 URL: https://issues.apache.org/jira/browse/HIVE-5032 Project: Hive Issue Type: Bug Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Attachments: HIVE-5032.1.patch, HIVE-5032.2.patch, HIVE-5032.3.patch Creating an external table in Hive with a location pointing to the root directory of DFS will fail because HiveFileFormatUtils#doGetPartitionDescFromPath treats the authority of the path as if it were a folder and therefore cannot find a match in the pathToPartitionInfo table when doing a prefix match. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
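The failure mode described, where the scheme/authority prefix defeats the pathToPartitionInfo lookup, can be illustrated with a toy prefix match. The helper below is hypothetical and only sketches the idea of the fix (compare path components only, so the authority such as hdfs://namenode:8020 cannot be mistaken for a folder); it is not Hive's actual code:

```python
from urllib.parse import urlparse

def find_partition_desc(path, path_to_partition_info):
    """Walk up the directory components of `path` looking for a prefix
    registered in path_to_partition_info. Comparing only the path part
    of the URI means a table located at the filesystem root ("/") can
    still be matched. Illustrative sketch only."""
    p = urlparse(path).path or "/"
    while True:
        if p in path_to_partition_info:
            return path_to_partition_info[p]
        if p == "/":
            return None  # no registered prefix matched
        # Drop the last path component and try again.
        p = p.rsplit("/", 1)[0] or "/"

info = {"/": "root-table-desc"}
# A table located at the DFS root is found even though the full URI
# carries an authority component.
desc = find_partition_desc("hdfs://namenode:8020/part-00000", info)
```

If the authority were treated as another folder component, the walk would never reach a registered prefix for a root-located table, which is the bug this issue describes.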
[jira] [Commented] (HIVE-6134) Merging small files based on file size only works for CTAS queries
[ https://issues.apache.org/jira/browse/HIVE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861870#comment-13861870 ] Eric Chu commented on HIVE-6134: Thanks, Xuefu, for the quick response! A few questions/comments:
1. Could you elaborate on why you think it makes sense to merge small files only for queries resulting in a new table? Alternatively, what are the issues with supporting these properties for regular queries? I'd love to have this support for regular queries, unless there's a strong reason against it.
2. If these properties are indeed designed only for queries resulting in a new table, then we should mention that in the documentation. Currently it's misleading: it sounds like they'd work for regular queries as well.
3. The main pain point here is that users won't know that there are many output files until AFTER the query is run. Imagine analysts who don't know these details and for whom HUE is the only query interface. It's frustrating and time-consuming to run a long-running query in HUE, only to find out they can't get the results b/c HUE times out trying to read these many small files, so they have to run the query again as CTAS. Creating a table just so they can download the result seems like overkill.
4. Do you have a suggestion for the aforementioned HUE issue? HUE starts timing out when the query results in thousands of small output files. This is a major pain point for our analysts today.
Merging small files based on file size only works for CTAS queries -- Key: HIVE-6134 URL: https://issues.apache.org/jira/browse/HIVE-6134 Project: Hive Issue Type: Bug Affects Versions: 0.8.0, 0.10.0, 0.11.0, 0.12.0 Reporter: Eric Chu According to the documentation, if we set hive.merge.mapfiles to true, Hive will launch an additional MR job to merge the small output files at the end of a map-only job when the average output file size is smaller than hive.merge.smallfiles.avgsize. Similarly, by setting hive.merge.mapredfiles to true, Hive will merge the output files of a map-reduce job. My expectation is that this is true for all MR queries. However, my observation is that this is only true for CTAS queries. In GenMRFileSink1.java, HIVEMERGEMAPFILES and HIVEMERGEMAPREDFILES are only used if ((ctx.getMvTask() != null) && (!ctx.getMvTask().isEmpty())). So, for a regular SELECT query that doesn't have move tasks, these properties are not used. Is my understanding correct, and if so, what's the reasoning behind not supporting this for regular SELECT queries? It seems to me that this should be supported for regular SELECT queries as well. One scenario where this hits us hard is when users try to download the result in HUE, and HUE times out b/c there are thousands of output files. The workaround is to re-run the query as CTAS, but it's a significant time sink. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6125) Tez: Refactoring changes
[ https://issues.apache.org/jira/browse/HIVE-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-6125: - Status: Patch Available (was: Open) Tez: Refactoring changes Key: HIVE-6125 URL: https://issues.apache.org/jira/browse/HIVE-6125 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-6125.1.patch, HIVE-6125.2.patch, HIVE-6125.3.patch In order to facilitate the merge back, I've separated out all the changes that don't require Tez. These changes introduce new interfaces, move code, etc., in preparation for the Tez-specific classes. This should also help show which of the changes affect the MR codepath. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6125) Tez: Refactoring changes
[ https://issues.apache.org/jira/browse/HIVE-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-6125: - Status: Open (was: Patch Available) Tez: Refactoring changes Key: HIVE-6125 URL: https://issues.apache.org/jira/browse/HIVE-6125 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-6125.1.patch, HIVE-6125.2.patch, HIVE-6125.3.patch In order to facilitate the merge back, I've separated out all the changes that don't require Tez. These changes introduce new interfaces, move code, etc., in preparation for the Tez-specific classes. This should also help show which of the changes affect the MR codepath. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6125) Tez: Refactoring changes
[ https://issues.apache.org/jira/browse/HIVE-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-6125: - Attachment: HIVE-6125.3.patch .3 is another rebase Tez: Refactoring changes Key: HIVE-6125 URL: https://issues.apache.org/jira/browse/HIVE-6125 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-6125.1.patch, HIVE-6125.2.patch, HIVE-6125.3.patch In order to facilitate the merge back, I've separated out all the changes that don't require Tez. These changes introduce new interfaces, move code, etc., in preparation for the Tez-specific classes. This should also help show which of the changes affect the MR codepath. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6134) Merging small files based on file size only works for CTAS queries
[ https://issues.apache.org/jira/browse/HIVE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861903#comment-13861903 ] Xuefu Zhang commented on HIVE-6134: --- [~ericchu30] I guess my comments above were a little off topic. I thought the problem you mentioned was about too many small files for a table (which my comments above were mostly about), but now I realize that the problem is about a query resulting in too many small files. Thanks for your clarifications. The two problems are different yet seemingly related. I'm wondering if problem #2 (too many small files from a query) is root-caused by problem #1 (too many small files for a table). I cannot imagine a case of that (besides too many mappers or reducers), but I'd appreciate it if you could share your case. If the answer is yes, then the proposal I outlined above may prevent problem #2 from happening. If no, then it may make sense to have both. For information only: HIVE-439, which originally introduced the merge feature, seems to target only small files from mappers, without mentioning whether it covers query results or table files. However, the comments did mention the move task, which may be related to the code you saw. For the HUE issue you mentioned, I'd think that getting rid of the small files one way or the other seems reasonable. Merging small files based on file size only works for CTAS queries -- Key: HIVE-6134 URL: https://issues.apache.org/jira/browse/HIVE-6134 Project: Hive Issue Type: Bug Affects Versions: 0.8.0, 0.10.0, 0.11.0, 0.12.0 Reporter: Eric Chu According to the documentation, if we set hive.merge.mapfiles to true, Hive will launch an additional MR job to merge the small output files at the end of a map-only job when the average output file size is smaller than hive.merge.smallfiles.avgsize. Similarly, by setting hive.merge.mapredfiles to true, Hive will merge the output files of a map-reduce job. My expectation is that this is true for all MR queries. However, my observation is that this is only true for CTAS queries. In GenMRFileSink1.java, HIVEMERGEMAPFILES and HIVEMERGEMAPREDFILES are only used if ((ctx.getMvTask() != null) && (!ctx.getMvTask().isEmpty())). So, for a regular SELECT query that doesn't have move tasks, these properties are not used. Is my understanding correct, and if so, what's the reasoning behind not supporting this for regular SELECT queries? It seems to me that this should be supported for regular SELECT queries as well. One scenario where this hits us hard is when users try to download the result in HUE, and HUE times out b/c there are thousands of output files. The workaround is to re-run the query as CTAS, but it's a significant time sink. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-5795) Hive should be able to skip header and footer rows when reading data file for a table
[ https://issues.apache.org/jira/browse/HIVE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuaishuai Nie updated HIVE-5795: - Release Note: hive.file.max.footer (default value: 100): maximum number of footer lines a user can set for a table file. skip.header.line.count (default value: 0): number of header lines in the table file. skip.footer.line.count (default value: 0): number of footer lines in the table file. Hive should be able to skip header and footer rows when reading data file for a table - Key: HIVE-5795 URL: https://issues.apache.org/jira/browse/HIVE-5795 Project: Hive Issue Type: New Feature Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Fix For: 0.13.0 Attachments: HIVE-5795.1.patch, HIVE-5795.2.patch, HIVE-5795.3.patch, HIVE-5795.4.patch, HIVE-5795.5.patch Hive should be able to skip header and footer lines when reading a data file for a table. That way, users don't need to preprocess data generated by other applications with a header or footer, and can use the file directly for table operations. To implement this, the idea is to add new properties to the table description that define the number of header and footer lines, and to skip those lines when reading records from the record reader. A DDL example for creating a table with a header and footer: {code} create external table testtable (name string, message string) row format delimited fields terminated by '\t' lines terminated by '\n' location '/testtable' tblproperties ("skip.header.line.count"="1", "skip.footer.line.count"="2"); {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-5795) Hive should be able to skip header and footer rows when reading data file for a table
[ https://issues.apache.org/jira/browse/HIVE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861911#comment-13861911 ] Shuaishuai Nie commented on HIVE-5795: -- Thanks [~leftylev] [~thejas]. It seems I don't have permission to edit the wiki directly, so I have added the configuration details in the release note. Hive should be able to skip header and footer rows when reading data file for a table - Key: HIVE-5795 URL: https://issues.apache.org/jira/browse/HIVE-5795 Project: Hive Issue Type: New Feature Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Fix For: 0.13.0 Attachments: HIVE-5795.1.patch, HIVE-5795.2.patch, HIVE-5795.3.patch, HIVE-5795.4.patch, HIVE-5795.5.patch Hive should be able to skip header and footer lines when reading a data file for a table. That way, users don't need to preprocess data generated by other applications with a header or footer, and can use the file directly for table operations. To implement this, the idea is to add new properties to the table description that define the number of header and footer lines, and to skip those lines when reading records from the record reader. A DDL example for creating a table with a header and footer: {code} create external table testtable (name string, message string) row format delimited fields terminated by '\t' lines terminated by '\n' location '/testtable' tblproperties ("skip.header.line.count"="1", "skip.footer.line.count"="2"); {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Hive-trunk-hadoop2 - Build # 647 - Still Failing
Changes for Build #640
Changes for Build #641
[navis] HIVE-5414 : The result of show grant is not visible via JDBC (Navis reviewed by Thejas M Nair)
[navis] HIVE-4257 : java.sql.SQLNonTransientConnectionException on JDBCStatsAggregator (Teddy Choi via Navis, reviewed by Ashutosh)
Changes for Build #642
Changes for Build #643
[ehans] HIVE-6017: Contribute Decimal128 high-performance decimal(p, s) package from Microsoft to Hive (Hideaki Kumura via Eric Hanson)
Changes for Build #644
[cws] HIVE-5911: Recent change to schema upgrade scripts breaks file naming conventions (Sergey Shelukhin via cws)
[cws] HIVE-3746: Fix HS2 ResultSet Serialization Performance Regression II (Navis via cws)
[cws] HIVE-3746: Fix HS2 ResultSet Serialization Performance Regression (Navis via cws)
[jitendra] HIVE-6010: TestCompareCliDriver enables tests that would ensure vectorization produces same results as non-vectorized execution (Sergey Shelukhin via Jitendra Pandey)
Changes for Build #645
Changes for Build #646
[ehans] HIVE-5757: Implement vectorized support for CASE (Eric Hanson)
Changes for Build #647
[thejas] HIVE-5795 : Hive should be able to skip header and footer rows when reading data file for a table (Shuaishuai Nie via Thejas Nair)
No tests ran. The Apache Jenkins build system has built Hive-trunk-hadoop2 (build #647) Status: Still Failing Check console output at https://builds.apache.org/job/Hive-trunk-hadoop2/647/ to view the results.
Hive-trunk-h0.21 - Build # 2547 - Still Failing
Changes for Build #2539
Changes for Build #2540
[navis] HIVE-5414 : The result of show grant is not visible via JDBC (Navis reviewed by Thejas M Nair)
Changes for Build #2541
Changes for Build #2542
[ehans] HIVE-6017: Contribute Decimal128 high-performance decimal(p, s) package from Microsoft to Hive (Hideaki Kumura via Eric Hanson)
Changes for Build #2543
[cws] HIVE-3746: Fix HS2 ResultSet Serialization Performance Regression II (Navis via cws)
[cws] HIVE-3746: Fix HS2 ResultSet Serialization Performance Regression (Navis via cws)
[jitendra] HIVE-6010: TestCompareCliDriver enables tests that would ensure vectorization produces same results as non-vectorized execution (Sergey Shelukhin via Jitendra Pandey)
Changes for Build #2544
[cws] HIVE-5911: Recent change to schema upgrade scripts breaks file naming conventions (Sergey Shelukhin via cws)
Changes for Build #2545
Changes for Build #2546
[ehans] HIVE-5757: Implement vectorized support for CASE (Eric Hanson)
Changes for Build #2547
[thejas] HIVE-5795 : Hive should be able to skip header and footer rows when reading data file for a table (Shuaishuai Nie via Thejas Nair)
No tests ran. The Apache Jenkins build system has built Hive-trunk-h0.21 (build #2547) Status: Still Failing Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/2547/ to view the results.
[jira] [Updated] (HIVE-5795) Hive should be able to skip header and footer rows when reading data file for a table
[ https://issues.apache.org/jira/browse/HIVE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuaishuai Nie updated HIVE-5795: - Release Note: hive.file.max.footer (default value: 100): maximum number of footer lines a user can set for a table file. skip.header.line.count (default value: 0): number of header lines in the table file. skip.footer.line.count (default value: 0): number of footer lines in the table file. skip.footer.line.count and skip.header.line.count should be specified as table properties when creating the table. The following example shows the usage of these two properties: Create external table testtable (name string, message string) row format delimited fields terminated by '\t' lines terminated by '\n' location '/testtable' tblproperties ("skip.header.line.count"="1", "skip.footer.line.count"="2"); was: hive.file.max.footer (default value: 100): maximum number of footer lines a user can set for a table file. skip.header.line.count (default value: 0): number of header lines in the table file. skip.footer.line.count (default value: 0): number of footer lines in the table file. skip.footer.line.count and skip.header.line.count should be specified as table properties when creating the table. The following example shows the usage of these two properties: {code} Create external table testtable (name string, message string) row format delimited fields terminated by '\t' lines terminated by '\n' location '/testtable' tblproperties ("skip.header.line.count"="1", "skip.footer.line.count"="2"); {code} Hive should be able to skip header and footer rows when reading data file for a table - Key: HIVE-5795 URL: https://issues.apache.org/jira/browse/HIVE-5795 Project: Hive Issue Type: New Feature Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Fix For: 0.13.0 Attachments: HIVE-5795.1.patch, HIVE-5795.2.patch, HIVE-5795.3.patch, HIVE-5795.4.patch, HIVE-5795.5.patch Hive should be able to skip header and footer lines when reading a data file for a table. That way, users don't need to preprocess data generated by other applications with a header or footer, and can use the file directly for table operations. To implement this, the idea is to add new properties to the table description that define the number of header and footer lines, and to skip those lines when reading records from the record reader. A DDL example for creating a table with a header and footer: {code} create external table testtable (name string, message string) row format delimited fields terminated by '\t' lines terminated by '\n' location '/testtable' tblproperties ("skip.header.line.count"="1", "skip.footer.line.count"="2"); {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-5795) Hive should be able to skip header and footer rows when reading data file for a table
[ https://issues.apache.org/jira/browse/HIVE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuaishuai Nie updated HIVE-5795: - Release Note: hive.file.max.footer (default value: 100): maximum number of footer lines a user can set for a table file. skip.header.line.count (default value: 0): number of header lines in the table file. skip.footer.line.count (default value: 0): number of footer lines in the table file. skip.footer.line.count and skip.header.line.count should be specified as table properties when creating the table. The following example shows the usage of these two properties: {code} Create external table testtable (name string, message string) row format delimited fields terminated by '\t' lines terminated by '\n' location '/testtable' tblproperties ("skip.header.line.count"="1", "skip.footer.line.count"="2"); {code} was: hive.file.max.footer (default value: 100): maximum number of footer lines a user can set for a table file. skip.header.line.count (default value: 0): number of header lines in the table file. skip.footer.line.count (default value: 0): number of footer lines in the table file. Hive should be able to skip header and footer rows when reading data file for a table - Key: HIVE-5795 URL: https://issues.apache.org/jira/browse/HIVE-5795 Project: Hive Issue Type: New Feature Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Fix For: 0.13.0 Attachments: HIVE-5795.1.patch, HIVE-5795.2.patch, HIVE-5795.3.patch, HIVE-5795.4.patch, HIVE-5795.5.patch Hive should be able to skip header and footer lines when reading a data file for a table. That way, users don't need to preprocess data generated by other applications with a header or footer, and can use the file directly for table operations. To implement this, the idea is to add new properties to the table description that define the number of header and footer lines, and to skip those lines when reading records from the record reader. A DDL example for creating a table with a header and footer: {code} create external table testtable (name string, message string) row format delimited fields terminated by '\t' lines terminated by '\n' location '/testtable' tblproperties ("skip.header.line.count"="1", "skip.footer.line.count"="2"); {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-4773) webhcat intermittently fail to commit output to file system
[ https://issues.apache.org/jira/browse/HIVE-4773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861922#comment-13861922 ] Hive QA commented on HIVE-4773: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12621220/HIVE-4773.4.patch {color:green}SUCCESS:{color} +1 4874 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/798/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/798/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12621220 webhcat intermittently fail to commit output to file system --- Key: HIVE-4773 URL: https://issues.apache.org/jira/browse/HIVE-4773 Project: Hive Issue Type: Bug Components: WebHCat Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Attachments: HIVE-4773.1.patch, HIVE-4773.2.patch, HIVE-4773.3.patch, HIVE-4773.4.patch With ASV as a default FS, we saw instances where output is not fully flushed to storage before the Templeton controller process exits. This results in stdout and stderr being empty even though the job completed successfully. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6125) Tez: Refactoring changes
[ https://issues.apache.org/jira/browse/HIVE-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861945#comment-13861945 ] Thejas M Nair commented on HIVE-6125: - LGTM. Just some minor comments in review board, otherwise +1. Tez: Refactoring changes Key: HIVE-6125 URL: https://issues.apache.org/jira/browse/HIVE-6125 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-6125.1.patch, HIVE-6125.2.patch, HIVE-6125.3.patch In order to facilitate merge back I've separated out all the changes that don't require Tez. These changes introduce new interfaces, move code etc. In preparation of the Tez specific classes. This should help show what changes have been made that affect the MR codepath as well. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-5795) Hive should be able to skip header and footer rows when reading data file for a table
[ https://issues.apache.org/jira/browse/HIVE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861962#comment-13861962 ] Lefty Leverenz commented on HIVE-5795: -- Got it, thanks [~shuainie]. One doc question: can skip.footer.line.count and skip.header.line.count be changed, or specified for the first time, with ALTER TABLE tbl SET TBLPROPERTIES, and if so, would any problems ensue? (Hm, that's two or three questions. Here's another: can the values vary by partition?) [~thejas], a followup jira isn't needed to get the doc task done, because it's already on my to-do list. I'll post a comment here when the doc is ready for review. TL;DR: This jira has a doc release note, so that covers the record-keeping requirement. The new config parameter and table properties are named here, so search capability is covered. The only question is whether we want all doc tasks to have separate jiras. I don't see any immediate advantage to that policy, although we might want to move in that direction eventually. Hive should be able to skip header and footer rows when reading data file for a table - Key: HIVE-5795 URL: https://issues.apache.org/jira/browse/HIVE-5795 Project: Hive Issue Type: New Feature Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Fix For: 0.13.0 Attachments: HIVE-5795.1.patch, HIVE-5795.2.patch, HIVE-5795.3.patch, HIVE-5795.4.patch, HIVE-5795.5.patch Hive should be able to skip header and footer lines when reading a data file for a table. This way, users don't need to preprocess data generated by other applications with a header or footer and can use the file directly for table operations. To implement this, the idea is to add new properties to the table description that define the number of header and footer lines, and to skip those lines when reading records from the record reader.
A DDL example for creating a table with a header and footer: {code} Create external table testtable (name string, message string) row format delimited fields terminated by '\t' lines terminated by '\n' location '/testtable' tblproperties (skip.header.line.count=1, skip.footer.line.count=2); {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-5795) Hive should be able to skip header and footer rows when reading data file for a table
[ https://issues.apache.org/jira/browse/HIVE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861969#comment-13861969 ] Shuaishuai Nie commented on HIVE-5795: -- Hi [~leftylev], this property name cannot be changed, and since it is a table-level property, it applies to all partitions of the table. Users cannot set this property at the partition level. It should also work with an ALTER TABLE statement if it is not set when creating the table. Hive should be able to skip header and footer rows when reading data file for a table - Key: HIVE-5795 URL: https://issues.apache.org/jira/browse/HIVE-5795 Project: Hive Issue Type: New Feature Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Fix For: 0.13.0 Attachments: HIVE-5795.1.patch, HIVE-5795.2.patch, HIVE-5795.3.patch, HIVE-5795.4.patch, HIVE-5795.5.patch Hive should be able to skip header and footer lines when reading a data file for a table. This way, users don't need to preprocess data generated by other applications with a header or footer and can use the file directly for table operations. To implement this, the idea is to add new properties to the table description that define the number of header and footer lines, and to skip those lines when reading records from the record reader. A DDL example for creating a table with a header and footer: {code} Create external table testtable (name string, message string) row format delimited fields terminated by '\t' lines terminated by '\n' location '/testtable' tblproperties (skip.header.line.count=1, skip.footer.line.count=2); {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-4887) hive should have an option to disable non sql commands that impose security risk
[ https://issues.apache.org/jira/browse/HIVE-4887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861993#comment-13861993 ] Jason Dere commented on HIVE-4887: -- Hi [~thejas], [~brocknoland], on the subject of UDFs/JARs, for HIVE-6047 I was proposing that Hive have registered sets of jars which would be referenced by the UDFs, as opposed to URIs. It would still be similar in that privileges could be defined so that users can only create UDFs using jar sets they can access. Let me know if you guys would be ok with that. hive should have an option to disable non sql commands that impose security risk Key: HIVE-4887 URL: https://issues.apache.org/jira/browse/HIVE-4887 Project: Hive Issue Type: Sub-task Components: Authorization, Security Reporter: Thejas M Nair Original Estimate: 72h Remaining Estimate: 72h Hive's RDBMS style of authorization (using grant/revoke) relies on all data access being done through Hive select queries. But Hive also supports running dfs commands, shell commands (e.g. !cat file), and shell commands through Hive streaming. This creates problems in securing a Hive server using this authorization model. UDFs are another way to write custom code that can compromise security, but you can control that by restricting users to access Hive only through a JDBC connection to HiveServer(2). (Note that there are other major problems, such as HIVE-3271.) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
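The kind of restriction the issue proposes — refusing dfs and shell-escape commands when a server-side switch is on — reduces to a simple gate. This is a minimal sketch; the flag name and helper are hypothetical, not Hive's actual configuration:

```python
RESTRICTED = True  # hypothetical stand-in for a server-side config flag

def allow_command(line, restricted=RESTRICTED):
    """Reject non-SQL escape commands (shell via '!', dfs) when the
    restriction flag is set; plain SQL statements pass through."""
    stripped = line.strip()
    if restricted and (stripped.startswith("!") or
                       stripped.lower().startswith("dfs ")):
        return False
    return True

ok = allow_command("select * from t")
blocked = allow_command("!cat /etc/passwd")
```

The gate leaves SQL untouched while closing the command channels the authorization model cannot see.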
[jira] [Commented] (HIVE-2599) Support Composit/Compound Keys with HBaseStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861998#comment-13861998 ] Hive QA commented on HIVE-2599: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12597571/HIVE-2599.2.patch.txt Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/800/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/800/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n '' ]] + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-Build-800/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ -d apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . 
Reverted 'ql/src/test/results/clientnegative/exchange_partition_neg_partition_exists2.q.out' Reverted 'ql/src/test/results/clientnegative/exchange_partition_neg_partition_exists.q.out' Reverted 'ql/src/test/results/clientnegative/exchange_partition_neg_partition_exists3.q.out' Reverted 'ql/src/test/results/clientnegative/exchange_partition_neg_incomplete_partition.q.out' Reverted 'ql/src/test/results/clientpositive/exchange_partition3.q.out' Reverted 'ql/src/test/results/clientpositive/exchange_partition.q.out' Reverted 'ql/src/test/results/clientpositive/exchange_partition2.q.out' Reverted 'ql/src/test/queries/clientnegative/exchange_partition_neg_incomplete_partition.q' Reverted 'ql/src/test/queries/clientnegative/exchange_partition_neg_partition_exists2.q' Reverted 'ql/src/test/queries/clientnegative/exchange_partition_neg_partition_exists3.q' Reverted 'ql/src/test/queries/clientnegative/exchange_partition_neg_partition_missing.q' Reverted 'ql/src/test/queries/clientnegative/exchange_partition_neg_partition_exists.q' Reverted 'ql/src/test/queries/clientpositive/exchange_partition.q' Reverted 'ql/src/test/queries/clientpositive/exchange_partition2.q' Reverted 'ql/src/test/queries/clientpositive/exchange_partition3.q' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java' ++ egrep -v '^X|^Performing status on external' ++ awk '{print $2}' ++ svn status --no-ignore + rm -rf target datanucleus.log ant/target shims/target shims/0.20/target shims/0.20S/target shims/0.23/target shims/aggregator/target shims/common/target shims/common-secure/target packaging/target hbase-handler/target testutils/target jdbc/target metastore/target itests/target itests/hcatalog-unit/target itests/test-serde/target itests/qtest/target itests/hive-unit/target itests/custom-serde/target itests/util/target hcatalog/target hcatalog/storage-handlers/hbase/target hcatalog/server-extensions/target hcatalog/core/target hcatalog/webhcat/svr/target 
hcatalog/webhcat/java-client/target hcatalog/hcatalog-pig-adapter/target hwi/target common/target common/src/gen contrib/target service/target serde/target beeline/target odbc/target cli/target ql/dependency-reduced-pom.xml ql/target + svn update Fetching external item into 'hcatalog/src/test/e2e/harness' External at revision 1555274. At revision 1555274. + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12597571 Support Composit/Compound Keys with HBaseStorageHandler --- Key: HIVE-2599 URL: https://issues.apache.org/jira/browse/HIVE-2599 Project: Hive Issue Type: Improvement Components: HBase Handler Affects Versions: 0.8.0 Reporter: Hans Uhlig Assignee: Swarnim Kulkarni Attachments: HIVE-2599.1.patch.txt, HIVE-2599.2.patch.txt, HIVE-2599.2.patch.txt It would be really nice for hive to be able to understand composite keys from an underlying HBase schema. Currently we have to store key fields twice to be able to both key
[jira] [Commented] (HIVE-6129) alter exchange is implemented in inverted manner
[ https://issues.apache.org/jira/browse/HIVE-6129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861995#comment-13861995 ] Hive QA commented on HIVE-6129: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12621248/HIVE-6129.1.patch.txt {color:green}SUCCESS:{color} +1 4874 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/799/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/799/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12621248 alter exchange is implemented in inverted manner Key: HIVE-6129 URL: https://issues.apache.org/jira/browse/HIVE-6129 Project: Hive Issue Type: Bug Reporter: Navis Assignee: Navis Priority: Critical Attachments: HIVE-6129.1.patch.txt see https://issues.apache.org/jira/browse/HIVE-4095?focusedCommentId=13819885&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13819885 alter exchange should be implemented according to the document at https://cwiki.apache.org/confluence/display/Hive/Exchange+Partition, i.e. {code} alter table T1 exchange partition (ds='1') with table T2 {code} should (after creating T1@ds=1) {quote} move the data from T2 to T1@ds=1 {quote} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
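The documented direction of the exchange can be modeled with a toy in-memory move — data leaves the source table T2 and lands in the target's ds='1' partition. This is purely illustrative of the intended semantics, not Hive's metastore code:

```python
def exchange_partition(target, source, partition):
    """Model of ALTER TABLE T1 EXCHANGE PARTITION (ds='1') WITH TABLE T2:
    per the wiki, data moves FROM the source table INTO the target
    table's partition (the inverse of what the bug describes)."""
    target[partition] = source.pop("data")
    return target

t1 = {}                           # target table; partition created empty
t2 = {"data": ["row1", "row2"]}   # source table holding the rows
exchange_partition(t1, t2, ("ds", "1"))
```

After the exchange, T1@ds=1 holds the rows and T2 is empty, matching the documented behavior.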
[jira] [Updated] (HIVE-6124) Support basic Decimal arithmetic in vector mode (+, -, *)
[ https://issues.apache.org/jira/browse/HIVE-6124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6124: -- Attachment: HIVE-6124.02.patch Finished patch, parked here for safe-keeping. Supports col-col, scalar-col, and col-scalar decimal operations. Unit tests included. Currently includes DecimalColumnVector, which needs to be removed before commit (after commit of HIVE-6051). Support basic Decimal arithmetic in vector mode (+, -, *) - Key: HIVE-6124 URL: https://issues.apache.org/jira/browse/HIVE-6124 Project: Hive Issue Type: Sub-task Affects Versions: 0.13.0 Reporter: Eric Hanson Assignee: Eric Hanson Attachments: HIVE-6124.01.patch, HIVE-6124.02.patch Create support for basic decimal arithmetic (+, -, * but not /, %) based on templates for column-scalar, scalar-column, and column-column operations. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
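The three operation shapes the patch covers — col-col, scalar-col, and col-scalar — can be sketched over Python's decimal type. This models only the shapes of the vectorized operations, not Hive's templated Java implementation:

```python
from decimal import Decimal

def col_col_add(a, b):
    """Column-column: element-wise addition of two decimal columns."""
    return [x + y for x, y in zip(a, b)]

def scalar_col_mul(s, col):
    """Scalar-column: multiply every element of a column by a scalar."""
    return [s * x for x in col]

col_a = [Decimal("1.50"), Decimal("2.25")]
col_b = [Decimal("0.50"), Decimal("0.75")]
summed = col_col_add(col_a, col_b)
scaled = scalar_col_mul(Decimal("2"), col_a)
```

Col-scalar is symmetric with scalar-col; the vectorized versions in the patch operate on whole column batches in the same element-wise fashion.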
[jira] [Updated] (HIVE-5923) SQL std auth - parser changes
[ https://issues.apache.org/jira/browse/HIVE-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-5923: Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Patch committed to trunk. Thanks for the review Brock! (Regarding commit wait period - The original version of the patch was reviewed yesterday, and new version reviewed again today has only minor changes. ) SQL std auth - parser changes - Key: HIVE-5923 URL: https://issues.apache.org/jira/browse/HIVE-5923 Project: Hive Issue Type: Sub-task Components: Authorization Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.13.0 Attachments: HIVE-5923.1.patch, HIVE-5923.2.patch, HIVE-5923.3.patch, HIVE-5923.4.patch Original Estimate: 96h Time Spent: 72h Remaining Estimate: 12h There are new access control statements proposed in the functional spec in HIVE-5837 . It also proposes some small changes to the existing query syntax (mostly extensions and some optional keywords). The syntax supported should depend on the current authorization mode. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HIVE-6137) Hive should report that the file/path doesn’t exist when it doesn’t (it now reports SocketTimeoutException)
Hari Sankar Sivarama Subramaniyan created HIVE-6137: --- Summary: Hive should report that the file/path doesn’t exist when it doesn’t (it now reports SocketTimeoutException) Key: HIVE-6137 URL: https://issues.apache.org/jira/browse/HIVE-6137 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Hive should report that the file/path doesn’t exist when it doesn’t (it now reports SocketTimeoutException): Execute a Hive DDL query with a reference to a non-existent blob (such as CREATE EXTERNAL TABLE...) and check Hive logs (stderr): FAILED: Error in metadata: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask This error message is not intuitive. If a file doesn't exist, Hive should report FileNotFoundException. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
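The behavior the issue asks for — a path-specific error instead of a transport timeout — amounts to failing fast on a missing location. A minimal sketch of that pre-check follows; the helper name is hypothetical, not a Hive API:

```python
import os

def validate_location(path):
    """Fail fast with a descriptive error when a table location is
    missing, rather than letting a lower layer time out with an
    unrelated transport exception."""
    if not os.path.exists(path):
        raise FileNotFoundError(f"Table location does not exist: {path}")
    return path

try:
    validate_location("/no/such/blob")
except FileNotFoundError as e:
    msg = str(e)
```

The caller gets an actionable message naming the missing path instead of a SocketTimeoutException.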
[jira] [Updated] (HIVE-6125) Tez: Refactoring changes
[ https://issues.apache.org/jira/browse/HIVE-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-6125: - Status: Patch Available (was: Open) Tez: Refactoring changes Key: HIVE-6125 URL: https://issues.apache.org/jira/browse/HIVE-6125 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-6125.1.patch, HIVE-6125.2.patch, HIVE-6125.3.patch, HIVE-6125.4.patch In order to facilitate merge back I've separated out all the changes that don't require Tez. These changes introduce new interfaces, move code etc. In preparation of the Tez specific classes. This should help show what changes have been made that affect the MR codepath as well. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6125) Tez: Refactoring changes
[ https://issues.apache.org/jira/browse/HIVE-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-6125: - Status: Open (was: Patch Available) Tez: Refactoring changes Key: HIVE-6125 URL: https://issues.apache.org/jira/browse/HIVE-6125 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-6125.1.patch, HIVE-6125.2.patch, HIVE-6125.3.patch, HIVE-6125.4.patch In order to facilitate merge back I've separated out all the changes that don't require Tez. These changes introduce new interfaces, move code etc. In preparation of the Tez specific classes. This should help show what changes have been made that affect the MR codepath as well. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6125) Tez: Refactoring changes
[ https://issues.apache.org/jira/browse/HIVE-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-6125: - Attachment: HIVE-6125.4.patch .4 addresses the review comments. Tez: Refactoring changes Key: HIVE-6125 URL: https://issues.apache.org/jira/browse/HIVE-6125 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-6125.1.patch, HIVE-6125.2.patch, HIVE-6125.3.patch, HIVE-6125.4.patch In order to facilitate merge back I've separated out all the changes that don't require Tez. These changes introduce new interfaces, move code etc. In preparation of the Tez specific classes. This should help show what changes have been made that affect the MR codepath as well. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Hive-trunk-hadoop2 - Build # 648 - Still Failing
Changes for Build #640 Changes for Build #641 [navis] HIVE-5414 : The result of show grant is not visible via JDBC (Navis reviewed by Thejas M Nair) [navis] HIVE-4257 : java.sql.SQLNonTransientConnectionException on JDBCStatsAggregator (Teddy Choi via Navis, reviewed by Ashutosh) Changes for Build #642 Changes for Build #643 [ehans] HIVE-6017: Contribute Decimal128 high-performance decimal(p, s) package from Microsoft to Hive (Hideaki Kumura via Eric Hanson) Changes for Build #644 [cws] HIVE-5911: Recent change to schema upgrade scripts breaks file naming conventions (Sergey Shelukhin via cws) [cws] HIVE-3746: Fix HS2 ResultSet Serialization Performance Regression II (Navis via cws) [cws] HIVE-3746: Fix HS2 ResultSet Serialization Performance Regression (Navis via cws) [jitendra] HIVE-6010: TestCompareCliDriver enables tests that would ensure vectorization produces same results as non-vectorized execution (Sergey Shelukhin via Jitendra Pandey) Changes for Build #645 Changes for Build #646 [ehans] HIVE-5757: Implement vectorized support for CASE (Eric Hanson) Changes for Build #647 [thejas] HIVE-5795 : Hive should be able to skip header and footer rows when reading data file for a table (Shuaishuai Nie via Thejas Nair) Changes for Build #648 [thejas] HIVE-5923 : SQL std auth - parser changes (Thejas Nair, reviewed by Brock Noland) No tests ran. The Apache Jenkins build system has built Hive-trunk-hadoop2 (build #648) Status: Still Failing Check console output at https://builds.apache.org/job/Hive-trunk-hadoop2/648/ to view the results.
Re: [DISCUSS] Proposed Changes to the Apache Hive Project Bylaws
One other benefit in rotating chairs is that it exposes more of Hive's PMC members to the board and other Apache old-timers. This is helpful in getting better integrated into Apache and becoming a candidate for Apache membership. It is also an excellent education in the Apache Way for those who serve. Alan. On Dec 31, 2013, at 3:30 PM, Lefty Leverenz leftylever...@gmail.com wrote: Okay, I'm convinced that one-year terms for the chair are reasonable. Thanks for the reassurance, Edward and Thejas. Is the 24h rule needed at all? In other projects, I've seen patches simply reverted by the author (or someone else). It's a rare occurrence, and it should be possible to revert a patch if someone -1s it after commit, esp. within the same 24 hours when not many other changes are in. Sergey makes a good point, but the 24h rule seems helpful in prioritizing tasks. We're all deadline-driven, right? I'm the chief culprit of seeing patch available and ignoring it until it has been committed. Then if I find some minor typo or doc issue, I'm embarrassed at posting a comment after the commit because nobody wants to revert a patch just for documentation. -- Lefty On Sun, Dec 29, 2013 at 12:06 PM, Thejas Nair the...@hortonworks.com wrote: On Sun, Dec 29, 2013 at 12:06 AM, Lefty Leverenz leftylever...@gmail.com wrote: Let's discuss annual rotation of the PMC chair a bit more. Although I agree with the points made in favor, I wonder about frequent loss of expertise and needing to establish new relationships. What's the ramp-up time? The ramp-up time is not significant, as you can see from the list of responsibilities mentioned here - http://www.apache.org/dev/pmc.html#chair . We have enough people in the PMC who have been involved with Apache projects for a long time and are familiar with Apache bylaws and ways of doing things. Also, the former PMC chairs are likely to be around to help as needed. Could a current chair be chosen for another consecutive term? 
Could two chairs alternate years indefinitely? I would take the meaning of rotation to be that we have a new chair for the next term. I think it should be OK to have the same chair in alternate years. 2 years is a long time and it sounds reasonable given the size of the community ! :) Do many other projects have annual rotations? Yes, at least the Hadoop and Pig projects have that. I could not find the bylaws pages easily for other projects. Would it be inconvenient to change chairs in the middle of a release? No. The PMC Chair position does not have any special role in a release. And now to trivialize my comments: while making other changes, let's fix this typo: Membership of the PMC can be revoked by an unanimous vote ... *(should be a unanimous ... just like a university because the rule is based on sound, not spelling)*. I think you should feel free to fix such typos in this wiki without a vote on it ! :) -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
Re: How do you run single query test(s) after mavenization?
The rest of the ant instances are okay because the MVN section afterwards gives the alternative, but should we keep ant or make the replacements? - 9. Now you can run the ant 'thriftif' target ... - 11. ant thriftif -Dthrift.home=... - 15. ant thriftif - 18. ant clean package - The maven equivalent of ant thriftif is: mvn clean install -Pthriftif -DskipTests -Dthrift.home=/usr/local I have not generated the thrift stuff recently. It would be great if Alan or someone else who has would update this section. I can take a look at this. It works with pretty minimal changes. Alan.
[jira] [Commented] (HIVE-5155) Support secure proxy user access to HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862038#comment-13862038 ] Shivaraju Gowda commented on HIVE-5155: --- It is important to note that the middleware server does not have access to the Principal's credentials. All it has is a javax.security.auth.Subject (Subject) from the end-user (Principal), and it can do a Subject.doAs() to connect to HiveServer2. In Proposal 2, the middleware server is expected to have access to a Hadoop-level super-user's credentials (by doing kinit), or it has the Subject from a Hadoop-level super-user which has been passed on to it. In the code I have attached above, I am trying to show that any end-user's Subject can be effectively used to connect to HiveServer2 using Subject.doAs() in the middleware server. This will allow multi-user Kerberos access through the middleware server without the additional requirement of proxy access. I might have overlooked or be unaware of some limitations of such an approach, so I am soliciting feedback to check that. Support secure proxy user access to HiveServer2 --- Key: HIVE-5155 URL: https://issues.apache.org/jira/browse/HIVE-5155 Project: Hive Issue Type: Improvement Components: Authentication, HiveServer2, JDBC Affects Versions: 0.12.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Attachments: HIVE-5155-1-nothrift.patch, HIVE-5155-noThrift.2.patch, HIVE-5155-noThrift.4.patch, HIVE-5155-noThrift.5.patch, HIVE-5155-noThrift.6.patch, HIVE-5155.1.patch, HIVE-5155.2.patch, HIVE-5155.3.patch, ProxyAuth.java, ProxyAuth.out, TestKERBEROS_Hive_JDBC.java HiveServer2 can authenticate a client via Kerberos and impersonate the connecting user with underlying secure Hadoop. This becomes a gateway for a remote client to access a secure Hadoop cluster. Now this works fine when the client obtains a Kerberos ticket and directly connects to HiveServer2. 
There's another big use case for middleware tools where the end user wants to access Hive via another server. For example, an Oozie action, Hue submitting queries, or a BI tool server accessing HiveServer2. In these cases, the third-party server doesn't have the end user's Kerberos credentials and hence can't submit queries to HiveServer2 on behalf of the end user. This ticket is for enabling proxy access to HiveServer2 for third-party tools on behalf of end users. There are two parts to the solution proposed in this ticket: 1) Delegation token based connection for Oozie (OOZIE-1457) This is the common mechanism for Hadoop ecosystem components. The Hive Remote Metastore and HCatalog already support this. It is suitable for a tool like Oozie that submits MR jobs as actions on behalf of its client. Oozie already uses a similar mechanism for Metastore/HCatalog access. 2) Direct proxy access for privileged hadoop users The delegation token implementation can be a challenge for non-Hadoop (especially non-Java) components. This second part enables a privileged user to directly specify an alternate session user during the connection. If the connecting user has Hadoop-level privilege to impersonate the requested userid, then HiveServer2 will run the session as that requested user. For example, user Hue is allowed to impersonate user Bob (via core-site.xml proxy user configuration). Then user Hue can connect to HiveServer2 and specify Bob as the session user via a session property. HiveServer2 will verify Hue's proxy user privilege and then impersonate user Bob instead of Hue. This will enable any third-party tool to impersonate an alternate userid without having to implement a delegation token connection. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
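Part 2 of the proposal — a privileged user requesting an alternate session user — reduces to a proxy-privilege check like the following sketch. The mapping stands in for core-site.xml proxy-user entries; the names and helper are illustrative, not Hive code:

```python
# Illustrative stand-in for core-site.xml hadoop.proxyuser.* configuration:
PROXY_PRIVILEGES = {"hue": {"bob", "alice"}}

def resolve_session_user(connecting_user, requested_user=None):
    """Run the session as requested_user only if connecting_user holds
    proxy privileges for that user; otherwise refuse the connection."""
    if requested_user is None or requested_user == connecting_user:
        return connecting_user
    if requested_user in PROXY_PRIVILEGES.get(connecting_user, set()):
        return requested_user
    raise PermissionError(
        f"{connecting_user} may not impersonate {requested_user}")

session = resolve_session_user("hue", "bob")
```

Here Hue connects and asks for a session as Bob; the check passes because Hue's proxy entry covers Bob, mirroring the HiveServer2 verification described above.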
[jira] [Created] (HIVE-6138) Tez: Add some additional comments to clarify intent
Gunther Hagleitner created HIVE-6138: Summary: Tez: Add some additional comments to clarify intent Key: HIVE-6138 URL: https://issues.apache.org/jira/browse/HIVE-6138 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: tez-branch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6138) Tez: Add some additional comments to clarify intent
[ https://issues.apache.org/jira/browse/HIVE-6138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-6138: - Attachment: HIVE-6138.1.patch Tez: Add some additional comments to clarify intent --- Key: HIVE-6138 URL: https://issues.apache.org/jira/browse/HIVE-6138 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: tez-branch Attachments: HIVE-6138.1.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Resolved] (HIVE-6138) Tez: Add some additional comments to clarify intent
[ https://issues.apache.org/jira/browse/HIVE-6138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner resolved HIVE-6138. -- Resolution: Fixed Committed to branch. Tez: Add some additional comments to clarify intent --- Key: HIVE-6138 URL: https://issues.apache.org/jira/browse/HIVE-6138 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: tez-branch Attachments: HIVE-6138.1.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6017) Contribute Decimal128 high-performance decimal(p, s) package from Microsoft to Hive
[ https://issues.apache.org/jira/browse/HIVE-6017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13862044#comment-13862044 ] Lefty Leverenz commented on HIVE-6017: -- Okay, thanks Eric. Contribute Decimal128 high-performance decimal(p, s) package from Microsoft to Hive --- Key: HIVE-6017 URL: https://issues.apache.org/jira/browse/HIVE-6017 Project: Hive Issue Type: Sub-task Affects Versions: 0.13.0 Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-6017.01.patch, HIVE-6017.02.patch, HIVE-6017.03.patch, HIVE-6017.04.patch Contribute the Decimal128 high-performance decimal package developed by Microsoft to Hive. This was originally written for Microsoft PolyBase by Hideaki Kimura. This code is about 8X more efficient than Java BigDecimal for typical operations. It uses a finite (128 bit) precision and can handle up to decimal(38, X). It is also mutable so you can change the contents of an existing object. This helps reduce the cost of new() and garbage collection. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
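The mutability benefit described — writing results back into an existing object to avoid allocation and garbage collection — can be sketched with a toy mutable holder. This models the idea only; Decimal128 itself is a Java class with 128-bit fixed precision:

```python
from decimal import Decimal, getcontext

getcontext().prec = 38  # mirror Decimal128's decimal(38, x) ceiling

class MutableDecimal:
    """Toy model of a mutable decimal: results are written back into
    the existing object instead of allocating a new one, which is the
    allocation-saving idea behind Decimal128."""
    def __init__(self, value):
        self.value = Decimal(value)

    def add_in_place(self, other):
        self.value += Decimal(other)
        return self

acc = MutableDecimal("0")
for v in ("1.1", "2.2", "3.3"):
    acc.add_in_place(v)  # same object reused across the whole loop
```

One accumulator object services the entire loop, whereas immutable BigDecimal-style arithmetic would allocate a fresh object per addition.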
Hive-trunk-h0.21 - Build # 2548 - Still Failing
Changes for Build #2539 Changes for Build #2540 [navis] HIVE-5414 : The result of show grant is not visible via JDBC (Navis reviewed by Thejas M Nair) Changes for Build #2541 Changes for Build #2542 [ehans] HIVE-6017: Contribute Decimal128 high-performance decimal(p, s) package from Microsoft to Hive (Hideaki Kumura via Eric Hanson) Changes for Build #2543 [cws] HIVE-3746: Fix HS2 ResultSet Serialization Performance Regression II (Navis via cws) [cws] HIVE-3746: Fix HS2 ResultSet Serialization Performance Regression (Navis via cws) [jitendra] HIVE-6010: TestCompareCliDriver enables tests that would ensure vectorization produces same results as non-vectorized execution (Sergey Shelukhin via Jitendra Pandey) Changes for Build #2544 [cws] HIVE-5911: Recent change to schema upgrade scripts breaks file naming conventions (Sergey Shelukhin via cws) Changes for Build #2545 Changes for Build #2546 [ehans] HIVE-5757: Implement vectorized support for CASE (Eric Hanson) Changes for Build #2547 [thejas] HIVE-5795 : Hive should be able to skip header and footer rows when reading data file for a table (Shuaishuai Nie via Thejas Nair) Changes for Build #2548 [thejas] HIVE-5923 : SQL std auth - parser changes (Thejas Nair, reviewed by Brock Noland) No tests ran. The Apache Jenkins build system has built Hive-trunk-h0.21 (build #2548) Status: Still Failing Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/2548/ to view the results.
[jira] [Commented] (HIVE-5923) SQL std auth - parser changes
[ https://issues.apache.org/jira/browse/HIVE-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13862053#comment-13862053 ] Lefty Leverenz commented on HIVE-5923: -- Should this be documented now or should we wait for the umbrella jira (HIVE-5837) or release 0.13.0, whichever comes first? SQL std auth - parser changes - Key: HIVE-5923 URL: https://issues.apache.org/jira/browse/HIVE-5923 Project: Hive Issue Type: Sub-task Components: Authorization Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.13.0 Attachments: HIVE-5923.1.patch, HIVE-5923.2.patch, HIVE-5923.3.patch, HIVE-5923.4.patch Original Estimate: 96h Time Spent: 72h Remaining Estimate: 12h There are new access control statements proposed in the functional spec in HIVE-5837 . It also proposes some small changes to the existing query syntax (mostly extensions and some optional keywords). The syntax supported should depend on the current authorization mode. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-5771) Constant propagation optimizer for Hive
[ https://issues.apache.org/jira/browse/HIVE-5771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13862064#comment-13862064 ] Eric Hanson commented on HIVE-5771: --- Where does this patch stand? Ted, are you going to move it forward? Constant propagation optimizer for Hive --- Key: HIVE-5771 URL: https://issues.apache.org/jira/browse/HIVE-5771 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Ted Xu Assignee: Ted Xu Attachments: HIVE-5771.1.patch, HIVE-5771.2.patch, HIVE-5771.3.patch, HIVE-5771.4.patch, HIVE-5771.patch Currently there is no constant folding/propagation optimizer, all expressions are evaluated at runtime. HIVE-2470 did a great job on evaluating constants on UDF initializing phase, however, it is still a runtime evaluation and it doesn't propagate constants from a subquery to outside. It may reduce I/O and accelerate process if we introduce such an optimizer. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HIVE-6139) Implement vectorized decimal division and modulo
Eric Hanson created HIVE-6139: - Summary: Implement vectorized decimal division and modulo Key: HIVE-6139 URL: https://issues.apache.org/jira/browse/HIVE-6139 Project: Hive Issue Type: Sub-task Affects Versions: 0.13.0 Reporter: Eric Hanson -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6139) Implement vectorized decimal division and modulo
[ https://issues.apache.org/jira/browse/HIVE-6139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6139: -- Description: Support column-scalar, scalar-column, and column-column versions for division and modulo. Include unit tests. Implement vectorized decimal division and modulo Key: HIVE-6139 URL: https://issues.apache.org/jira/browse/HIVE-6139 Project: Hive Issue Type: Sub-task Affects Versions: 0.13.0 Reporter: Eric Hanson Support column-scalar, scalar-column, and column-column versions for division and modulo. Include unit tests. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
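The three operand shapes named in the description (column-scalar, scalar-column, column-column) can be sketched as batch loops over primitive arrays. This is a hedged illustration of the pattern only: the real Hive vectorized expressions are template-generated classes with null handling, isRepeating short-cuts, and decimal scale arithmetic, all omitted here, and values are held as plain longs for brevity.

```java
// Hedged sketch of the operand shapes from the description; all names are
// illustrative, not from the Hive codebase.
public final class DecimalDivModSketch {

    // column / scalar: one call processes a whole batch of n rows.
    public static void divideColScalar(long[] col, long scalar, long[] out, int n) {
        for (int i = 0; i < n; i++) out[i] = col[i] / scalar;
    }

    // scalar % column
    public static void moduloScalarCol(long scalar, long[] col, long[] out, int n) {
        for (int i = 0; i < n; i++) out[i] = scalar % col[i];
    }

    // column / column
    public static void divideColCol(long[] a, long[] b, long[] out, int n) {
        for (int i = 0; i < n; i++) out[i] = a[i] / b[i];
    }
}
```

The per-batch method call (instead of one virtual call per row) is the core of what makes the vectorized versions fast.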
checking progress of automated tests for a patch
Is there a way to check the progress of the automated tests after you've uploaded a patch? If so, how? Thanks, Eric
[jira] [Commented] (HIVE-6125) Tez: Refactoring changes
[ https://issues.apache.org/jira/browse/HIVE-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13862080#comment-13862080 ] Hive QA commented on HIVE-6125: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12621386/HIVE-6125.3.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 4874 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucket_num_reducers {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/802/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/802/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12621386 Tez: Refactoring changes Key: HIVE-6125 URL: https://issues.apache.org/jira/browse/HIVE-6125 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-6125.1.patch, HIVE-6125.2.patch, HIVE-6125.3.patch, HIVE-6125.4.patch In order to facilitate merge back I've separated out all the changes that don't require Tez. These changes introduce new interfaces, move code etc. In preparation of the Tez specific classes. This should help show what changes have been made that affect the MR codepath as well. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-5904) HiveServer2 JDBC connect to non-default database
[ https://issues.apache.org/jira/browse/HIVE-5904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-5904: Assignee: Matt Tucker HiveServer2 JDBC connect to non-default database Key: HIVE-5904 URL: https://issues.apache.org/jira/browse/HIVE-5904 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.12.0 Reporter: Matt Tucker Assignee: Matt Tucker Attachments: HIVE-5904.patch When connecting to HiveServer2 via the following URLs, the session uses the 'default' database, instead of the intended database. jdbc://localhost:1/customDb jdbc:///customDb -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-5904) HiveServer2 JDBC connect to non-default database
[ https://issues.apache.org/jira/browse/HIVE-5904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-5904: Resolution: Duplicate Status: Resolved (was: Patch Available) The fix for this got committed through HIVE-4256 . [~matucker], Sorry, didn't notice that you had a patch for same issue here. I have added you as a contributor on jira so that you can assign yourself jiras. HiveServer2 JDBC connect to non-default database Key: HIVE-5904 URL: https://issues.apache.org/jira/browse/HIVE-5904 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.12.0 Reporter: Matt Tucker Assignee: Matt Tucker Attachments: HIVE-5904.patch When connecting to HiveServer2 via the following URLs, the session uses the 'default' database, instead of the intended database. jdbc://localhost:1/customDb jdbc:///customDb -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Comment Edited] (HIVE-5904) HiveServer2 JDBC connect to non-default database
[ https://issues.apache.org/jira/browse/HIVE-5904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13862098#comment-13862098 ] Thejas M Nair edited comment on HIVE-5904 at 1/4/14 12:58 AM: -- The fix for this got committed through HIVE-4256 . [~matucker], Sorry, didn't notice that you had a patch for same issue here. I have added you as a contributor on jira so that you can assign jiras to yourself. was (Author: thejas): The fix for this got committed through HIVE-4256 . [~matucker], Sorry, didn't notice that you had a patch for same issue here. I have added you as a contributor on jira so that you can assign yourself jiras. HiveServer2 JDBC connect to non-default database Key: HIVE-5904 URL: https://issues.apache.org/jira/browse/HIVE-5904 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.12.0 Reporter: Matt Tucker Assignee: Matt Tucker Attachments: HIVE-5904.patch When connecting to HiveServer2 via the following URLs, the session uses the 'default' database, instead of the intended database. jdbc://localhost:1/customDb jdbc:///customDb -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HIVE-6140) trim udf is very slow
Thejas M Nair created HIVE-6140: --- Summary: trim udf is very slow Key: HIVE-6140 URL: https://issues.apache.org/jira/browse/HIVE-6140 Project: Hive Issue Type: Bug Components: UDF Reporter: Thejas M Nair Paraphrasing what was reported by [~cartershanklin] - I used the attached Perl script to generate 500 million two-character strings which always included a space. I loaded it using: create table letters (l string); load data local inpath '/home/sandbox/data.csv' overwrite into table letters; Then I ran this SQL script: select count(l) from letters where l = 'l '; select count(l) from letters where trim(l) = 'l'; First query = 170 seconds Second query = 514 seconds -- This message was sent by Atlassian JIRA (v6.1.5#6160)
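The Perl generator itself is attached to the issue and not shown above; the following is a hypothetical Java equivalent (file name, seed, and letter distribution are all assumptions) that produces the same kind of two-character rows with a trailing space for the letters table.

```java
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.Random;

// Hypothetical stand-in for the attached Perl script: emits two-character
// rows (a lowercase letter plus a space), e.g. "l ", one per line.
public class GenLetters {

    // One row: random letter followed by a single space.
    static String row(Random r) {
        return (char) ('a' + r.nextInt(26)) + " ";
    }

    public static void main(String[] args) throws IOException {
        // Row count as the first argument; the report used 500 million rows.
        long rows = args.length > 0 ? Long.parseLong(args[0]) : 1000L;
        Random r = new Random(42);
        try (PrintWriter w = new PrintWriter(new BufferedWriter(new FileWriter("data.csv")))) {
            for (long i = 0; i < rows; i++) {
                w.println(row(r));
            }
        }
    }
}
```

With data like this, every row matches `l = 'l '` only by exact comparison, while `trim(l) = 'l'` forces a per-row UDF call and string allocation, which is consistent with the reported slowdown.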
Re: How do you run single query test(s) after mavenization?
Ok, I’ve updated it to just have the maven instructions, since I’m assuming no one cares about the ant ones anymore. Alan. On Jan 3, 2014, at 3:46 PM, Alan Gates ga...@hortonworks.com wrote: The rest of the ant instances are okay because the MVN section afterwards gives the alternative, but should we keep ant or make the replacements? - 9. Now you can run the ant 'thriftif' target ... - 11. ant thriftif -Dthrift.home=... - 15. ant thriftif - 18. ant clean package - The maven equivalent of ant thriftif is: mvn clean install -Pthriftif -DskipTests -Dthrift.home=/usr/local I have not generated the thrift stuff recently. It would be great if Alan or someone else who has would update this section. I can take a look at this. It works with pretty minimal changes. Alan.
[jira] [Assigned] (HIVE-6140) trim udf is very slow
[ https://issues.apache.org/jira/browse/HIVE-6140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anandha L Ranganathan reassigned HIVE-6140: --- Assignee: Anandha L Ranganathan trim udf is very slow - Key: HIVE-6140 URL: https://issues.apache.org/jira/browse/HIVE-6140 Project: Hive Issue Type: Bug Components: UDF Reporter: Thejas M Nair Assignee: Anandha L Ranganathan Paraphrasing what was reported by [~cartershanklin] - I used the attached Perl script to generate 500 million two-character strings which always included a space. I loaded it using: create table letters (l string); load data local inpath '/home/sandbox/data.csv' overwrite into table letters; Then I ran this SQL script: select count(l) from letters where l = 'l '; select count(l) from letters where trim(l) = 'l'; First query = 170 seconds Second query = 514 seconds -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-5923) SQL std auth - parser changes
[ https://issues.apache.org/jira/browse/HIVE-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-5923: Release Note: Grant privilege and revoke privilege statements need to be changed to remove the requirement (but not the option) for the noise word TABLE. In the SQL specification, table is the assumed default for grant and revoke statements. Today Hive’s syntax is GRANT action ON TABLE table TO grantee. It should be GRANT action ON [TABLE] table TO grantee. Grant role and revoke role statements have been changed to remove the need for the keyword ROLE. Support for WITH ADMIN OPTION needs to be added to grant role and revoke role statement syntax. SQL std auth - parser changes - Key: HIVE-5923 URL: https://issues.apache.org/jira/browse/HIVE-5923 Project: Hive Issue Type: Sub-task Components: Authorization Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.13.0 Attachments: HIVE-5923.1.patch, HIVE-5923.2.patch, HIVE-5923.3.patch, HIVE-5923.4.patch Original Estimate: 96h Time Spent: 72h Remaining Estimate: 12h There are new access control statements proposed in the functional spec in HIVE-5837 . It also proposes some small changes to the existing query syntax (mostly extensions and some optional keywords). The syntax supported should depend on the current authorization mode. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-5923) SQL std auth - parser changes
[ https://issues.apache.org/jira/browse/HIVE-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-5923: Release Note: Grant privilege and revoke privilege statements no longer require (but still allow) the noise word TABLE. TABLE is the assumed default for grant and revoke statements. Hive’s syntax changes from GRANT action ON TABLE table TO grantee to GRANT action ON [TABLE] table TO grantee. Grant role and revoke role statements have been changed to remove the need for the keyword ROLE. Support for WITH ADMIN OPTION has been added to grant role and revoke role statement syntax. was: Grant privilege and revoke privilege statements need to be changed to remove the requirement (but not the option) for the noise word TABLE. In the SQL specification table is the assumed default for grant and revoke statements. Today Hive’s syntax is GRANT action ON TABLE table TO grantee. It should be GRANT action ON [TABLE] table TO grantee. Grant role and revoke role statements has been changed to remove the need for keyword ROLE. Support for WITH ADMIN OPTION needs to be added to grant role and revoke role statement syntax. SQL std auth - parser changes - Key: HIVE-5923 URL: https://issues.apache.org/jira/browse/HIVE-5923 Project: Hive Issue Type: Sub-task Components: Authorization Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.13.0 Attachments: HIVE-5923.1.patch, HIVE-5923.2.patch, HIVE-5923.3.patch, HIVE-5923.4.patch Original Estimate: 96h Time Spent: 72h Remaining Estimate: 12h There are new access control statements proposed in the functional spec in HIVE-5837 . It also proposes some small changes to the existing query syntax (mostly extensions and some optional keywords). The syntax supported should depend on the current authorization mode. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
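Based on the release note above, the before/after syntax looks roughly like this (the table, user, and role names are made up for illustration; the exact grammar is defined by the HIVE-5923 parser patch):

```sql
-- Before the change: the TABLE keyword was required.
GRANT SELECT ON TABLE sales TO USER alice;

-- After the change: TABLE is optional and remains the assumed default.
GRANT SELECT ON sales TO USER alice;
REVOKE SELECT ON sales FROM USER alice;

-- Role statements without the ROLE noise word, plus the new WITH ADMIN OPTION.
GRANT analyst TO USER alice WITH ADMIN OPTION;
```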
[jira] [Commented] (HIVE-6140) trim udf is very slow
[ https://issues.apache.org/jira/browse/HIVE-6140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13862177#comment-13862177 ] Anandha L Ranganathan commented on HIVE-6140: - [~thejas]/[~cartershanklin] Could you provide data.csv file that caused the problem. Otherwise provide example of the data. trim udf is very slow - Key: HIVE-6140 URL: https://issues.apache.org/jira/browse/HIVE-6140 Project: Hive Issue Type: Bug Components: UDF Reporter: Thejas M Nair Assignee: Anandha L Ranganathan Paraphrasing what was reported by [~cartershanklin] - I used the attached Perl script to generate 500 million two-character strings which always included a space. I loaded it using: create table letters (l string); load data local inpath '/home/sandbox/data.csv' overwrite into table letters; Then I ran this SQL script: select count(l) from letters where l = 'l '; select count(l) from letters where trim(l) = 'l'; First query = 170 seconds Second query = 514 seconds -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6051) Create DecimalColumnVector and a representative VectorExpression for decimal
[ https://issues.apache.org/jira/browse/HIVE-6051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13862191#comment-13862191 ] Hive QA commented on HIVE-6051: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12621359/HIVE-6051.02.patch {color:green}SUCCESS:{color} +1 4876 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/803/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/803/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12621359 Create DecimalColumnVector and a representative VectorExpression for decimal Key: HIVE-6051 URL: https://issues.apache.org/jira/browse/HIVE-6051 Project: Hive Issue Type: Sub-task Affects Versions: 0.13.0 Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-6051.01.patch, HIVE-6051.02.patch Create a DecimalColumnVector to use as a basis for vectorized decimal operations. Include a representative VectorExpression on decimal (e.g. column-column addition) to demonstrate its use. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
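A simplified sketch of what such a structure and representative expression could look like. This is not the patch's actual DecimalColumnVector, which carries null flags, an isRepeating optimization, and Hive's own decimal type; the stand-in below uses BigDecimal and keeps only the batch-of-values layout plus a column-column add.

```java
import java.math.BigDecimal;

// Simplified, hypothetical sketch of a decimal column vector: one column's
// batch of values, processed a whole batch at a time.
public class DecimalColumnVectorSketch {
    final BigDecimal[] vector;

    DecimalColumnVectorSketch(int batchSize) {
        vector = new BigDecimal[batchSize];
    }

    // Representative vectorized expression: out[i] = a[i] + b[i] for n rows,
    // one method call per batch instead of one per row.
    static void addColCol(DecimalColumnVectorSketch a, DecimalColumnVectorSketch b,
                          DecimalColumnVectorSketch out, int n) {
        for (int i = 0; i < n; i++) {
            out.vector[i] = a.vector[i].add(b.vector[i]);
        }
    }
}
```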
Re: checking progress of automated tests for a patch
Yes, you can, it's here: http://bigtop01.cloudera.org:8080/view/Hive/job/PreCommit-HIVE-Build/ Unfortunately the version of jenkins they are using doesn't support putting the JIRA number in the build description so it's kind of opaque. But if you view the full console for a build ( http://bigtop01.cloudera.org:8080/view/Hive/job/PreCommit-HIVE-Build/804/consoleFull) at the top you'll see something like: ISSUE_NUM=6125 which means that build is for HIVE-6125. All the logs for that build (#804) can be seen here: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/ On Fri, Jan 3, 2014 at 6:25 PM, Eric Hanson (BIG DATA) eric.n.han...@microsoft.com wrote: Is there a way to check the progress of the automated tests after you've uploaded a patch? If so, how? Thanks, Eric -- Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
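The lookup described above can be scripted. The sketch below works on a saved copy of the console output (the sample log lines are made up) rather than fetching the real consoleFull URL.

```shell
# Stand-in for a fetched consoleFull page; in practice you would save it with
# something like: curl -s <consoleFull URL> > console.log
cat > console.log <<'EOF'
Building remotely on hive-ptest-slave
ISSUE_NUM=6125
Executing org.apache.hive.ptest.execution.PrepPhase
EOF

# Extract the JIRA number the build is testing (here 6125, i.e. HIVE-6125).
grep -o 'ISSUE_NUM=[0-9]*' console.log | cut -d= -f2
```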
[jira] [Commented] (HIVE-6125) Tez: Refactoring changes
[ https://issues.apache.org/jira/browse/HIVE-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13862212#comment-13862212 ] Hive QA commented on HIVE-6125: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12621408/HIVE-6125.4.patch {color:green}SUCCESS:{color} +1 4875 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/804/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/804/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12621408 Tez: Refactoring changes Key: HIVE-6125 URL: https://issues.apache.org/jira/browse/HIVE-6125 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-6125.1.patch, HIVE-6125.2.patch, HIVE-6125.3.patch, HIVE-6125.4.patch In order to facilitate merge back I've separated out all the changes that don't require Tez. These changes introduce new interfaces, move code etc. In preparation of the Tez specific classes. This should help show what changes have been made that affect the MR codepath as well. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-2599) Support Composit/Compound Keys with HBaseStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13862231#comment-13862231 ] Swarnim Kulkarni commented on HIVE-2599: Attached is the latest patch rebased with the master. The patch should apply cleanly now. [~brocknoland] On your question about inserts, I think I might have misunderstood you a little bit. I ran the following queries to test inserts on composite keys and was able to do it successfully. {noformat} CREATE EXTERNAL TABLE test_table_1(key struct<personId:string,value:string>, data string) ROW FORMAT SERDE 'org.apache.hadoop.hive.hbase.HBaseSerDe' STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:value") TBLPROPERTIES ("hbase.table.name" = "hbase_test_table_1", "hbase.composite.key.class" = "com.test.hive.TestHBaseCompositeKey") select * from test_table_1; {"personid":"person1","value":"value1"} 1385435417948 {"personid":"person2","value":"value2"} 1386691798261 {"personid":"person3","value":"value3"} 1387481795304 {"personid":"person4","value":"value4"} 1386705359123 {"personid":"person5","value":"value5"} 1386972894836 ..
CREATE TABLE test_table_2(key struct<personId:string,value:string>, value string) ROW FORMAT SERDE 'org.apache.hadoop.hive.hbase.HBaseSerDe' STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val") TBLPROPERTIES ("hbase.table.name" = "hbase_test_table_2"); INSERT OVERWRITE TABLE test_table_2 select key,data from test_table_1; 14/01/04 00:32:33 INFO ql.Driver: Launching Job 1 out of 1 14/01/04 00:32:58 INFO exec.Task: 2014-01-04 00:32:58,720 Stage-0 map = 0%, reduce = 0% 2014-01-04 00:33:29,930 Stage-0 map = 100%, reduce = 100%, Cumulative CPU 5.48 sec select * from test_table_2; {"personid":"person1","value":"value1"} 1385435417948 {"personid":"person2","value":"value2"} 1386691798261 {"personid":"person3","value":"value3"} 1387481795304 {"personid":"person4","value":"value4"} 1386705359123 {"personid":"person5","value":"value5"} 1386972894836 .. {noformat} If this is what you meant, then yes, the patch will handle both selects and inserts. If not, please let me know and I will log a new bug and tackle it accordingly. Support Composit/Compound Keys with HBaseStorageHandler --- Key: HIVE-2599 URL: https://issues.apache.org/jira/browse/HIVE-2599 Project: Hive Issue Type: Improvement Components: HBase Handler Affects Versions: 0.8.0 Reporter: Hans Uhlig Assignee: Swarnim Kulkarni Attachments: HIVE-2599.1.patch.txt, HIVE-2599.2.patch.txt, HIVE-2599.2.patch.txt It would be really nice for Hive to be able to understand composite keys from an underlying HBase schema. Currently we have to store key fields twice to be able to both key and make data available. I noticed John Sichi mentioned in HIVE-1228 that this would be a separate issue but I can't find any follow-up. How feasible is this in the HBaseStorageHandler? -- This message was sent by Atlassian JIRA (v6.1.5#6160)