[jira] [Commented] (HIVE-7728) Enable q-tests for TABLESAMPLE feature [Spark Branch]

2014-08-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101940#comment-14101940
 ] 

Hive QA commented on HIVE-7728:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12662698/HIVE-7728.1-spark.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5927 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_fs_default_name2
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/60/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/60/console
Test logs: 
http://ec2-54-176-176-199.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-60/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12662698

> Enable q-tests for TABLESAMPLE feature  [Spark Branch]
> --
>
> Key: HIVE-7728
> URL: https://issues.apache.org/jira/browse/HIVE-7728
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
> Attachments: HIVE-7728.1-spark.patch
>
>
> Enable q-tests for the TABLESAMPLE feature now that the automated test 
> environment is ready.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6329) Support column level encryption/decryption

2014-08-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101937#comment-14101937
 ] 

Hive QA commented on HIVE-6329:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12662662/HIVE-6329.10.patch.txt

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5821 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/395/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/395/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-395/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12662662

> Support column level encryption/decryption
> --
>
> Key: HIVE-6329
> URL: https://issues.apache.org/jira/browse/HIVE-6329
> Project: Hive
>  Issue Type: New Feature
>  Components: Security, Serializers/Deserializers
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-6329.1.patch.txt, HIVE-6329.10.patch.txt, 
> HIVE-6329.2.patch.txt, HIVE-6329.3.patch.txt, HIVE-6329.4.patch.txt, 
> HIVE-6329.5.patch.txt, HIVE-6329.6.patch.txt, HIVE-6329.7.patch.txt, 
> HIVE-6329.8.patch.txt, HIVE-6329.9.patch.txt
>
>
> We have been receiving some requirements for encryption recently, but Hive 
> does not support it. Before the full implementation via HIVE-5207, this might 
> be useful in some cases.
> {noformat}
> hive> create table encode_test(id int, name STRING, phone STRING, address 
> STRING) 
> > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
> > WITH SERDEPROPERTIES ('column.encode.columns'='phone,address', 
> 'column.encode.classname'='org.apache.hadoop.hive.serde2.Base64WriteOnly') 
> STORED AS TEXTFILE;
> OK
> Time taken: 0.584 seconds
> hive> insert into table encode_test select 
> 100,'navis','010--','Seoul, Seocho' from src tablesample (1 rows);
> ..
> OK
> Time taken: 5.121 seconds
> hive> select * from encode_test;
> OK
> 100   navis MDEwLTAwMDAtMDAwMA==  U2VvdWwsIFNlb2Nobw==
> Time taken: 0.078 seconds, Fetched: 1 row(s)
> hive> 
> {noformat}
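For reference, the encoded values in the session above are ordinary Base64 of the inserted strings (the phone value is truncated in this archive; decoding the Base64 output gives "010-0000-0000"). A minimal, self-contained Java check (the class name `Base64Check` is mine, not part of the patch) reproduces them:

```java
import java.util.Base64;

// Standalone check, not part of the patch: the values the SELECT returns for
// the encoded columns are plain Base64 of the inserted strings.
public class Base64Check {
    public static void main(String[] args) {
        Base64.Encoder enc = Base64.getEncoder();
        // Prints "MDEwLTAwMDAtMDAwMA==", the encoded phone column.
        System.out.println(enc.encodeToString("010-0000-0000".getBytes()));
        // Prints "U2VvdWwsIFNlb2Nobw==", the encoded address column.
        System.out.println(enc.encodeToString("Seoul, Seocho".getBytes()));
    }
}
```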





[jira] [Commented] (HIVE-7513) Add ROW__ID VirtualColumn

2014-08-18 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101926#comment-14101926
 ] 

Lefty Leverenz commented on HIVE-7513:
--

Is this just behind-the-scenes or does it need some user doc?

> Add ROW__ID VirtualColumn
> -
>
> Key: HIVE-7513
> URL: https://issues.apache.org/jira/browse/HIVE-7513
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Affects Versions: 0.13.1
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 0.14.0
>
> Attachments: HIVE-7513.10.patch, HIVE-7513.11.patch, 
> HIVE-7513.12.patch, HIVE-7513.13.patch, HIVE-7513.14.patch, 
> HIVE-7513.3.patch, HIVE-7513.4.patch, HIVE-7513.5.patch, HIVE-7513.8.patch, 
> HIVE-7513.9.patch, HIVE-7513.codeOnly.txt
>
>
> In order to support Update/Delete we need to read rowId from AcidInputFormat 
> and pass that along through the operator pipeline (built from the WHERE 
> clause of the SQL Statement) so that it can be written to the delta file by 
> the update/delete (sink) operators.
> The parser will add this column to the projection list to make sure it's 
> passed along.





[jira] [Updated] (HIVE-7778) Hive mishandles SQL that contains whitespace characters

2014-08-18 Thread peter zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

peter zhao updated HIVE-7778:
-

Description: 
I run the SQL "set hive.exec.dynamic.partition.mode=nonstrict" through iBATIS. 
Because iBATIS keeps SQL statements in an XML file, the statement retains its 
formatting, so the Hive server receives something like 
"  \t set hive.exec.dynamic.partition.mode=nonstrict  ". In 
org.apache.hive.service.cli.operation.HiveCommandOperation.run(), "\t" and 
other whitespace characters are not handled well: the generated variable key 
becomes "set hive.exec.dynamic.partition.mode" when the correct key would be 
"hive.exec.dynamic.partition.mode", so my next "select by partition" query 
throws a strict-mode exception.

  String command = getStatement().trim();
  String[] tokens = statement.split("\\s"); // should probably be command.split("\\s")
  String commandArgs = command.substring(tokens[0].length()).trim();

  was:
I run the SQL "set hive.exec.dynamic.partition.mode=nonstrict" through iBATIS. 
Because iBATIS keeps SQL statements in an XML file, the statement retains its 
formatting, so the Hive server receives something like 
"  \t set hive.exec.dynamic.partition.mode=nonstrict  ". In 
org.apache.hive.service.cli.operation.HiveCommandOperation.run(), "\t" and 
other whitespace characters are not handled well: the generated variable key 
becomes "set hive.exec.dynamic.partition.mode" when the correct key would be 
"hive.exec.dynamic.partition.mode", so my next "select by partition" query 
throws a strict-mode exception.

  String command = getStatement().trim();
  String[] tokens = statement.split("\\s"); // should probably be command.split("\\s")
  String commandArgs = command.substring(tokens[0].length()).trim();
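The splitting problem described above can be reproduced with a small, self-contained sketch (the class name `WhitespaceSplitBug` is illustrative; the real code lives in HiveCommandOperation.run()):

```java
// Illustrative reproduction: splitting the untrimmed statement leaves a
// leading empty token, so tokens[0] is "" and the command arguments are
// computed from the wrong offset.
public class WhitespaceSplitBug {
    public static void main(String[] args) {
        String statement = "  \t set hive.exec.dynamic.partition.mode=nonstrict  ";
        String command = statement.trim();

        String[] buggyTokens = statement.split("\\s"); // leading whitespace -> empty first token
        String[] fixedTokens = command.split("\\s");   // proposed fix: split the trimmed command

        // Buggy path: tokens[0] is "", so substring(0) keeps the "set" keyword
        // and the option key is later parsed as "set hive.exec.dynamic.partition.mode".
        System.out.println("buggy args: " + command.substring(buggyTokens[0].length()).trim());
        // Fixed path: tokens[0] is "set", so only the key=value part remains.
        System.out.println("fixed args: " + command.substring(fixedTokens[0].length()).trim());
    }
}
```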


> Hive mishandles SQL that contains whitespace characters
> -
>
> Key: HIVE-7778
> URL: https://issues.apache.org/jira/browse/HIVE-7778
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.13.1
>Reporter: peter zhao
>Priority: Minor
>
> I run the SQL "set hive.exec.dynamic.partition.mode=nonstrict" through 
> iBATIS. Because iBATIS keeps SQL statements in an XML file, the statement 
> retains its formatting, so the Hive server receives something like 
> "  \t set hive.exec.dynamic.partition.mode=nonstrict  ". In 
> org.apache.hive.service.cli.operation.HiveCommandOperation.run(), "\t" and 
> other whitespace characters are not handled well: the generated variable key 
> becomes "set hive.exec.dynamic.partition.mode" when the correct key would be 
> "hive.exec.dynamic.partition.mode", so my next "select by partition" query 
> throws a strict-mode exception.
>   String command = getStatement().trim();
>   String[] tokens = statement.split("\\s"); // should probably be command.split("\\s")
>   String commandArgs = command.substring(tokens[0].length()).trim();





[jira] [Updated] (HIVE-7778) Hive mishandles SQL that contains whitespace characters

2014-08-18 Thread peter zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

peter zhao updated HIVE-7778:
-

Description: 
I run the SQL "set hive.exec.dynamic.partition.mode=nonstrict" through iBATIS. 
Because iBATIS keeps SQL statements in an XML file, the statement retains its 
formatting, so the Hive server receives something like 
"  \t set hive.exec.dynamic.partition.mode=nonstrict  ". In 
org.apache.hive.service.cli.operation.HiveCommandOperation.run(), "\t" and 
other whitespace characters are not handled well: the generated variable key 
becomes "set hive.exec.dynamic.partition.mode" when the correct key would be 
"hive.exec.dynamic.partition.mode", so my next "select by partition" query 
throws a strict-mode exception.

  String command = getStatement().trim();
  String[] tokens = statement.split("\\s"); // should probably be command.split("\\s")
  String commandArgs = command.substring(tokens[0].length()).trim();

  was:

I run the SQL "set hive.exec.dynamic.partition.mode=nonstrict" through iBATIS. 
Because iBATIS keeps SQL statements in an XML file, the statement retains its 
formatting, so the Hive server receives something like 
"  \t set hive.exec.dynamic.partition.mode=nonstrict  ". In 
org.apache.hive.service.cli.operation.HiveCommandOperation.run(), "\t" is not 
handled well: the generated variable key becomes 
"set hive.exec.dynamic.partition.mode" when the correct key would be 
"hive.exec.dynamic.partition.mode", so my next "select by partition" query 
throws a strict-mode exception.


> Hive mishandles SQL that contains whitespace characters
> -
>
> Key: HIVE-7778
> URL: https://issues.apache.org/jira/browse/HIVE-7778
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.13.1
>Reporter: peter zhao
>Priority: Minor
>
> I run the SQL "set hive.exec.dynamic.partition.mode=nonstrict" through 
> iBATIS. Because iBATIS keeps SQL statements in an XML file, the statement 
> retains its formatting, so the Hive server receives something like 
> "  \t set hive.exec.dynamic.partition.mode=nonstrict  ". In 
> org.apache.hive.service.cli.operation.HiveCommandOperation.run(), "\t" and 
> other whitespace characters are not handled well: the generated variable key 
> becomes "set hive.exec.dynamic.partition.mode" when the correct key would be 
> "hive.exec.dynamic.partition.mode", so my next "select by partition" query 
> throws a strict-mode exception.
>   String command = getStatement().trim();
>   String[] tokens = statement.split("\\s"); // should probably be command.split("\\s")
>   String commandArgs = command.substring(tokens[0].length()).trim();





[jira] [Created] (HIVE-7778) Hive mishandles SQL that contains whitespace characters

2014-08-18 Thread peter zhao (JIRA)
peter zhao created HIVE-7778:


 Summary: Hive mishandles SQL that contains whitespace characters
 Key: HIVE-7778
 URL: https://issues.apache.org/jira/browse/HIVE-7778
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.13.1
Reporter: peter zhao
Priority: Minor



I run the SQL "set hive.exec.dynamic.partition.mode=nonstrict" through iBATIS. 
Because iBATIS keeps SQL statements in an XML file, the statement retains its 
formatting, so the Hive server receives something like 
"  \t set hive.exec.dynamic.partition.mode=nonstrict  ". In 
org.apache.hive.service.cli.operation.HiveCommandOperation.run(), "\t" is not 
handled well: the generated variable key becomes 
"set hive.exec.dynamic.partition.mode" when the correct key would be 
"hive.exec.dynamic.partition.mode", so my next "select by partition" query 
throws a strict-mode exception.





[jira] [Created] (HIVE-7777) add CSV support for Serde

2014-08-18 Thread Ferdinand Xu (JIRA)
Ferdinand Xu created HIVE-:
--

 Summary: add CSV support for Serde
 Key: HIVE-
 URL: https://issues.apache.org/jira/browse/HIVE-
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu


Hive has no official CSV SerDe support, although there is an open-source 
project on GitHub (https://github.com/ogrodnek/csv-serde). CSV is a very 
frequently used data format.





[jira] [Commented] (HIVE-7341) Support for Table replication across HCatalog instances

2014-08-18 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101912#comment-14101912
 ] 

Lefty Leverenz commented on HIVE-7341:
--

Thanks for the doc note, [~sushanth].  When you say "should mostly be covered 
by javadocs and the bug report" that leaves a little wiggle room for user docs, 
although I don't see a good place for this in the HCatalog wikidocs.  Would 
this only be done by external systems such as Falcon, or could it also be done 
directly by a Hive/HCat administrator?

> Support for Table replication across HCatalog instances
> ---
>
> Key: HIVE-7341
> URL: https://issues.apache.org/jira/browse/HIVE-7341
> Project: Hive
>  Issue Type: New Feature
>  Components: HCatalog
>Affects Versions: 0.13.1
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Fix For: 0.14.0
>
> Attachments: HIVE-7341.1.patch, HIVE-7341.2.patch, HIVE-7341.3.patch, 
> HIVE-7341.4.patch, HIVE-7341.5.patch
>
>
> The HCatClient currently doesn't provide very much support for replicating 
> HCatTable definitions between 2 HCatalog Server (i.e. Hive metastore) 
> instances. 
> Systems similar to Apache Falcon might find the need to replicate partition 
> data between 2 clusters, and keep the HCatalog metadata in sync between the 
> two. This poses a couple of problems:
> # The definition of the source table might change (in column schema, I/O 
> formats, record-formats, serde-parameters, etc.) The system will need a way 
> to diff 2 tables and update the target-metastore with the changes. E.g. 
> {code}
> targetTable.resolve( sourceTable, targetTable.diff(sourceTable) );
> hcatClient.updateTableSchema(dbName, tableName, targetTable);
> {code}
> # The current {{HCatClient.addPartitions()}} API requires that the 
> partition's schema be derived from the table's schema, thereby requiring that 
> the table-schema be resolved *before* partitions with the new schema are 
> added to the table. This is problematic, because it introduces race 
> conditions when 2 partitions with differing column-schemas (e.g. right after 
> a schema change) are copied in parallel. This can be avoided if each 
> HCatAddPartitionDesc kept track of the partition's schema, in flight.
> # The source and target metastores might be running different/incompatible 
> versions of Hive. 
> The impending patch attempts to address these concerns (with some caveats).
> # {{HCatTable}} now has 
> ## a {{diff()}} method, to compare against another HCatTable instance
> ## a {{resolve(diff)}} method to copy over specified table-attributes from 
> another HCatTable
> ## a serialize/deserialize mechanism (via {{HCatClient.serializeTable()}} and 
> {{HCatClient.deserializeTable()}}), so that HCatTable instances constructed 
> in other class-loaders may be used for comparison
> # {{HCatPartition}} now provides finer-grained control over a Partition's 
> column-schema, StorageDescriptor settings, etc. This allows partitions to be 
> copied completely from source, with the ability to override specific 
> properties if required (e.g. location).
> # {{HCatClient.updateTableSchema()}} can now update the entire 
> table-definition, not just the column schema.
> # I've cleaned up and removed most of the redundancy between the HCatTable, 
> HCatCreateTableDesc and HCatCreateTableDesc.Builder. The prior API failed to 
> separate the table-attributes from the add-table-operation's attributes. By 
> providing fluent-interfaces in HCatTable, and composing an HCatTable instance 
> in HCatCreateTableDesc, the interfaces are cleaner(ish). The old setters are 
> deprecated, in favour of those in HCatTable. Likewise, HCatPartition and 
> HCatAddPartitionDesc.
> I'll post a patch for trunk shortly.





[jira] [Updated] (HIVE-7646) Modify parser to support new grammar for Insert,Update,Delete

2014-08-18 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7646:
-

Attachment: HIVE-7646.1.patch

> Modify parser to support new grammar for Insert,Update,Delete
> -
>
> Key: HIVE-7646
> URL: https://issues.apache.org/jira/browse/HIVE-7646
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Affects Versions: 0.13.1
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-7646.1.patch, HIVE-7646.patch
>
>
> Need the parser to recognize constructs such as:
> {code:sql}
> INSERT INTO Cust (Customer_Number, Balance, Address)
> VALUES (101, 50.00, '123 Main Street'), (102, 75.00, '123 Pine Ave');
> {code}
> {code:sql}
> DELETE FROM Cust WHERE Balance > 5.0
> {code}
> {code:sql}
> UPDATE Cust
> SET column1=value1,column2=value2,...
> WHERE some_column=some_value
> {code}
> Also useful:
> {code:sql}
> select a,b from values((1,2),(3,4)) as FOO(a,b)
> {code}
> This makes writing tests easier.





[jira] [Commented] (HIVE-5718) Support direct fetch for lateral views, sub queries, etc.

2014-08-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101892#comment-14101892
 ] 

Hive QA commented on HIVE-5718:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12662660/HIVE-5718.10.patch.txt

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5819 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.pig.TestOrcHCatLoader.testReadDataPrimitiveTypes
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/394/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/394/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-394/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12662660

> Support direct fetch for lateral views, sub queries, etc.
> -
>
> Key: HIVE-5718
> URL: https://issues.apache.org/jira/browse/HIVE-5718
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: D13857.1.patch, D13857.2.patch, D13857.3.patch, 
> HIVE-5718.10.patch.txt, HIVE-5718.4.patch.txt, HIVE-5718.5.patch.txt, 
> HIVE-5718.6.patch.txt, HIVE-5718.7.patch.txt, HIVE-5718.8.patch.txt, 
> HIVE-5718.9.patch.txt
>
>
> Extend HIVE-2925 to lateral views and subqueries.





[jira] [Updated] (HIVE-7646) Modify parser to support new grammar for Insert,Update,Delete

2014-08-18 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7646:
-

Description: 
Need the parser to recognize constructs such as:
{code:sql}
INSERT INTO Cust (Customer_Number, Balance, Address)
VALUES (101, 50.00, '123 Main Street'), (102, 75.00, '123 Pine Ave');
{code}
{code:sql}
DELETE FROM Cust WHERE Balance > 5.0
{code}
{code:sql}
UPDATE Cust
SET column1=value1,column2=value2,...
WHERE some_column=some_value
{code}

Also useful:
{code:sql}
select a,b from values((1,2),(3,4)) as FOO(a,b)
{code}
This makes writing tests easier.

  was:
Need the parser to recognize constructs such as:
{code:sql}
INSERT INTO Cust (Customer_Number, Balance, Address)
VALUES (101, 50.00, '123 Main Street'), (102, 75.00, '123 Pine Ave');
{code}
{code:sql}
DELETE FROM Cust WHERE Balance > 5.0
{code}
{code:sql}
UPDATE Cust
SET column1=value1,column2=value2,...
WHERE some_column=some_value
{code}


> Modify parser to support new grammar for Insert,Update,Delete
> -
>
> Key: HIVE-7646
> URL: https://issues.apache.org/jira/browse/HIVE-7646
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Affects Versions: 0.13.1
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-7646.patch
>
>
> Need the parser to recognize constructs such as:
> {code:sql}
> INSERT INTO Cust (Customer_Number, Balance, Address)
> VALUES (101, 50.00, '123 Main Street'), (102, 75.00, '123 Pine Ave');
> {code}
> {code:sql}
> DELETE FROM Cust WHERE Balance > 5.0
> {code}
> {code:sql}
> UPDATE Cust
> SET column1=value1,column2=value2,...
> WHERE some_column=some_value
> {code}
> Also useful:
> {code:sql}
> select a,b from values((1,2),(3,4)) as FOO(a,b)
> {code}
> This makes writing tests easier.





[jira] [Created] (HIVE-7776) enable sample10.q.[Spark Branch]

2014-08-18 Thread Chengxiang Li (JIRA)
Chengxiang Li created HIVE-7776:
---

 Summary: enable sample10.q.[Spark Branch]
 Key: HIVE-7776
 URL: https://issues.apache.org/jira/browse/HIVE-7776
 Project: Hive
  Issue Type: Bug
  Components: Spark
Reporter: Chengxiang Li


sample10.q contains a dynamic-partition operation; this qtest should be 
enabled after Hive on Spark supports dynamic partitioning.





[jira] [Created] (HIVE-7775) enable sample8.q.[Spark Branch]

2014-08-18 Thread Chengxiang Li (JIRA)
Chengxiang Li created HIVE-7775:
---

 Summary: enable sample8.q.[Spark Branch]
 Key: HIVE-7775
 URL: https://issues.apache.org/jira/browse/HIVE-7775
 Project: Hive
  Issue Type: Bug
  Components: Spark
Reporter: Chengxiang Li


sample8.q contains a join query; this qtest should be enabled after Hive on 
Spark supports join operations.





[jira] [Updated] (HIVE-7728) Enable q-tests for TABLESAMPLE feature [Spark Branch]

2014-08-18 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-7728:


Status: Patch Available  (was: Open)

> Enable q-tests for TABLESAMPLE feature  [Spark Branch]
> --
>
> Key: HIVE-7728
> URL: https://issues.apache.org/jira/browse/HIVE-7728
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
> Attachments: HIVE-7728.1-spark.patch
>
>
> Enable q-tests for the TABLESAMPLE feature now that the automated test 
> environment is ready.





[jira] [Updated] (HIVE-7728) Enable q-tests for TABLESAMPLE feature [Spark Branch]

2014-08-18 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-7728:


Attachment: HIVE-7728.1-spark.patch

sample8.q is skipped for lack of join support, and sample10.q for lack of 
dynamic-partition support; those qtests should be added after the features are 
enabled.

> Enable q-tests for TABLESAMPLE feature  [Spark Branch]
> --
>
> Key: HIVE-7728
> URL: https://issues.apache.org/jira/browse/HIVE-7728
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
> Attachments: HIVE-7728.1-spark.patch
>
>
> Enable q-tests for the TABLESAMPLE feature now that the automated test 
> environment is ready.





[jira] [Commented] (HIVE-5799) session/operation timeout for hiveserver2

2014-08-18 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101884#comment-14101884
 ] 

Lefty Leverenz commented on HIVE-5799:
--

Sorry for chiming in at the last minute (or second) but could the three 
parameter descriptions please specify their time units?  Presumably the 
defaults of "0s" indicate seconds, but hive-default.xml.template and the wiki 
will only show "0".  If there's no other reason for a new patch, though, I can 
fix this in HIVE-6586.

Otherwise the parameter descriptions look good.  +1 for docs.

> session/operation timeout for hiveserver2
> -
>
> Key: HIVE-5799
> URL: https://issues.apache.org/jira/browse/HIVE-5799
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-5799.1.patch.txt, HIVE-5799.10.patch.txt, 
> HIVE-5799.11.patch.txt, HIVE-5799.2.patch.txt, HIVE-5799.3.patch.txt, 
> HIVE-5799.4.patch.txt, HIVE-5799.5.patch.txt, HIVE-5799.6.patch.txt, 
> HIVE-5799.7.patch.txt, HIVE-5799.8.patch.txt, HIVE-5799.9.patch.txt
>
>
> Need a timeout facility to prevent resource leaks caused by unstable or 
> misbehaving clients.





[jira] [Commented] (HIVE-7432) Remove deprecated Avro's Schema.parse usages

2014-08-18 Thread Ashish Kumar Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101865#comment-14101865
 ] 

Ashish Kumar Singh commented on HIVE-7432:
--

Thanks [~brocknoland]!

> Remove deprecated Avro's Schema.parse usages
> 
>
> Key: HIVE-7432
> URL: https://issues.apache.org/jira/browse/HIVE-7432
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Kumar Singh
>Assignee: Ashish Kumar Singh
> Fix For: 0.14.0
>
> Attachments: HIVE-7432.1.patch, HIVE-7432.2.patch, HIVE-7432.patch
>
>
> Schema.parse has been deprecated by Avro; however, it is still used in 
> multiple places in Hive.





[jira] [Updated] (HIVE-7432) Remove deprecated Avro's Schema.parse usages

2014-08-18 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-7432:
---

   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Thank you so much Ashish! I have committed this patch to trunk!!

> Remove deprecated Avro's Schema.parse usages
> 
>
> Key: HIVE-7432
> URL: https://issues.apache.org/jira/browse/HIVE-7432
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Kumar Singh
>Assignee: Ashish Kumar Singh
> Fix For: 0.14.0
>
> Attachments: HIVE-7432.1.patch, HIVE-7432.2.patch, HIVE-7432.patch
>
>
> Schema.parse has been deprecated by Avro; however, it is still used in 
> multiple places in Hive.





[jira] [Commented] (HIVE-5799) session/operation timeout for hiveserver2

2014-08-18 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101859#comment-14101859
 ] 

Brock Noland commented on HIVE-5799:


This looks quite good. Does anyone have an issue with committing this assuming 
the tests pass?

> session/operation timeout for hiveserver2
> -
>
> Key: HIVE-5799
> URL: https://issues.apache.org/jira/browse/HIVE-5799
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-5799.1.patch.txt, HIVE-5799.10.patch.txt, 
> HIVE-5799.11.patch.txt, HIVE-5799.2.patch.txt, HIVE-5799.3.patch.txt, 
> HIVE-5799.4.patch.txt, HIVE-5799.5.patch.txt, HIVE-5799.6.patch.txt, 
> HIVE-5799.7.patch.txt, HIVE-5799.8.patch.txt, HIVE-5799.9.patch.txt
>
>
> Need a timeout facility to prevent resource leaks caused by unstable or 
> misbehaving clients.





[jira] [Commented] (HIVE-7769) add --SORT_BEFORE_DIFF to union all .q tests

2014-08-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101841#comment-14101841
 ] 

Hive QA commented on HIVE-7769:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12662657/HIVE-7769.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5819 tests executed
*Failed tests:*
{noformat}
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/393/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/393/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-393/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12662657

> add --SORT_BEFORE_DIFF to union all .q tests
> 
>
> Key: HIVE-7769
> URL: https://issues.apache.org/jira/browse/HIVE-7769
> Project: Hive
>  Issue Type: Bug
>Reporter: Na Yang
>Assignee: Na Yang
> Attachments: HIVE-7769.patch
>
>
> Some union all test cases do not generate deterministically ordered results. 
> We need to add --SORT_BEFORE_DIFF to those .q tests.





[jira] [Commented] (HIVE-7771) ORC PPD fails for some decimal predicates

2014-08-18 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101829#comment-14101829
 ] 

Daniel Dai commented on HIVE-7771:
--

Are you changing the search argument to use BigDecimal? If so, shall we change 
SearchArgumentImpl as well?
{code}
- literal instanceof HiveDecimal) {
+ literal instanceof BigDecimal) {
...
- } else if (literal instanceof HiveDecimal) {
+ } else if (literal instanceof BigDecimal) {
{code}
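A minimal sketch of the type-dispatch concern above (illustrative only, not the actual SearchArgumentImpl code; the class and enum names are made up): if the predicate literal is now boxed as java.math.BigDecimal instead of HiveDecimal, every `instanceof` dispatch on the literal must be updated in step, or decimal predicates silently fall into the "unsupported" branch and pushdown is skipped for them.

```java
import java.math.BigDecimal;

// Hypothetical sketch: classify a predicate literal by its runtime type.
// If the producer switches to BigDecimal but this dispatch still tests
// HiveDecimal, decimals land in UNSUPPORTED and PPD quietly degrades.
public class LiteralKind {
    enum Kind { LONG, STRING, DECIMAL, UNSUPPORTED }

    static Kind classify(Object literal) {
        if (literal instanceof Long) {
            return Kind.LONG;
        } else if (literal instanceof String) {
            return Kind.STRING;
        } else if (literal instanceof BigDecimal) { // was: instanceof HiveDecimal
            return Kind.DECIMAL;
        }
        return Kind.UNSUPPORTED;
    }

    public static void main(String[] args) {
        // The 11.22BD literal from the example query arrives as a decimal.
        System.out.println(classify(new BigDecimal("11.22")));
    }
}
```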

> ORC PPD fails for some decimal predicates
> -
>
> Key: HIVE-7771
> URL: https://issues.apache.org/jira/browse/HIVE-7771
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Prasanth J
>Assignee: Prasanth J
> Attachments: HIVE-7771.1.patch
>
>
> Some queries like 
> {code}
> select * from table where dcol<=11.22BD;
> {code}
> fail when ORC predicate pushdown is enabled.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7548) Precondition checks should not fail the merge task in case of automatic trigger

2014-08-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101807#comment-14101807
 ] 

Hive QA commented on HIVE-7548:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12662649/HIVE-7548.1.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5816 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_merge_incompat2
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/391/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/391/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-391/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12662649

> Precondition checks should not fail the merge task in case of automatic 
> trigger
> ---
>
> Key: HIVE-7548
> URL: https://issues.apache.org/jira/browse/HIVE-7548
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 0.14.0
>Reporter: Prasanth J
>Assignee: Prasanth J
> Attachments: HIVE-7548.1.patch
>
>
> ORC fast merge (HIVE-7509) fails the merge task if any of the precondition 
> checks fails. Failing on a precondition check is appropriate for "ALTER TABLE .. 
> CONCATENATE" but not for an automatic trigger of the merge task from the 
> conditional resolver. If a partition has ORC files that are incompatible for 
> merging, the merge task should skip it rather than fail.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7223) Support generic PartitionSpecs in Metastore partition-functions

2014-08-18 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101806#comment-14101806
 ] 

Mithun Radhakrishnan commented on HIVE-7223:


Hello, Alan. Thanks so much for reviewing. I'm creating the Review Board 
request right now. (It looks like a 2MB diff isn't helping.) I'll update this 
JIRA with the review-board request as soon as it completes.

On the concerns you've raised:

{quote}
hive_metastore.thrift - do we need get_partitions_pspec_with_auth?
{quote}
I didn't know if we needed this right off the bat. I figured we could add this 
later, if it was missed.

{quote}
PartValEqWrapperLite.equals, is values and location all you need to check 
equality? Are containing db and table not important?
{quote}
add_partitions_pspec_core() ensures that all Partitions derived from the PSpec 
belong to the same DB/Table. (This keeps it consistent with 
add_partitions_core().) So a further check on the DB/Table in 
PartValEqWrapperLite would be redundant.

{quote}
PartValEqWrapperLite.add_partitions_pspec_core, I'm wondering if you should 
give the caller an option to have it throw if the partitions already exists... 
why do you throw on duplicates in the list but not already exists?
{quote}
{{ifNotExists}} kinda fills that gap: an exception is thrown if the partition 
being added already exists (and {{ifNotExists == false}}). Dupes within the 
list are a sign of a user/programming error, which is why I check this 
condition explicitly. That also aligns with {{add_partitions_core}}'s protocol.
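The protocol described above can be sketched as follows (a hedged, plain-Java illustration; the method and parameter names are hypothetical, not the real metastore API): duplicates *within the request* are a programming error and always throw, while a partition that already exists on the server throws only when {{ifNotExists}} is false.

```java
import java.util.*;

// Hypothetical sketch of the add-partitions protocol discussed above.
public class AddPartitions {
    static int addPartitions(Set<String> existing, List<String> newParts,
                             boolean ifNotExists) {
        Set<String> seen = new HashSet<>();
        int added = 0;
        for (String p : newParts) {
            // Duplicates within the request list: always a caller error.
            if (!seen.add(p)) {
                throw new IllegalArgumentException("Duplicate partition in request: " + p);
            }
            // Already present on the server: throw only if ifNotExists == false.
            if (existing.contains(p)) {
                if (!ifNotExists) {
                    throw new IllegalStateException("Partition already exists: " + p);
                }
                continue; // silently skip when ifNotExists == true
            }
            existing.add(p);
            added++;
        }
        return added;
    }

    public static void main(String[] args) {
        Set<String> existing = new HashSet<>(Arrays.asList("dt=20140601"));
        int n = addPartitions(existing,
                Arrays.asList("dt=20140602", "dt=20140601"), true);
        System.out.println(n); // only the genuinely new partition is counted
    }
}
```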

> Support generic PartitionSpecs in Metastore partition-functions
> ---
>
> Key: HIVE-7223
> URL: https://issues.apache.org/jira/browse/HIVE-7223
> Project: Hive
>  Issue Type: Improvement
>  Components: HCatalog, Metastore
>Affects Versions: 0.12.0, 0.13.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-7223.1.patch, HIVE-7223.2.patch
>
>
> Currently, the functions in the HiveMetaStore API that handle multiple 
> partitions do so using List<Partition>. E.g. 
> {code}
> public List<Partition> listPartitions(String db_name, String tbl_name, short 
> max_parts);
> public List<Partition> listPartitionsByFilter(String db_name, String 
> tbl_name, String filter, short max_parts);
> public int add_partitions(List<Partition> new_parts);
> {code}
> Partition objects are fairly heavyweight, since each Partition carries its 
> own copy of a StorageDescriptor, partition-values, etc. Tables with tens of 
> thousands of partitions take so long to have their partitions listed that the 
> client times out with default hive.metastore.client.socket.timeout. There is 
> the additional expense of serializing and deserializing metadata for large 
> sets of partitions, w.r.t time and heap-space. Reducing the thrift traffic 
> should help in this regard.
> In a date-partitioned table, all sub-partitions for a particular date are 
> *likely* (but not expected) to have:
> # The same base directory (e.g. {{/feeds/search/20140601/}})
> # Similar directory structure (e.g. {{/feeds/search/20140601/[US,UK,IN]}})
> # The same SerDe/StorageHandler/IOFormat classes
> # Sorting/Bucketing/SkewInfo settings
> In this “most likely” scenario (henceforth termed “normal”), it’s possible to 
> represent the partition-list (for a date) in a more condensed form: a list of 
> LighterPartition instances, all sharing a common StorageDescriptor whose 
> location points to the root directory. 
> We can go one better for the {{add_partitions()}} case: When adding all 
> partitions for a given date, the “normal” case affords us the ability to 
> specify the top-level date-directory, where sub-partitions can be inferred 
> from the HDFS directory-path.
> These extensions are hard to introduce at the metastore-level, since 
> partition-functions explicitly specify {{List<Partition>}} arguments. I 
> wonder if a {{PartitionSpec}} interface might help:
> {code}
> public PartitionSpec listPartitions(db_name, tbl_name, max_parts) throws ... 
> ; 
> public int add_partitions( PartitionSpec new_parts ) throws … ;
> {code}
> where the PartitionSpec looks like:
> {code}
> public interface PartitionSpec {
> public List<Partition> getPartitions();
> public List<String> getPartNames();
> public Iterator<Partition> getPartitionIter();
> public Iterator<String> getPartNameIter();
> }
> {code}
> For addPartitions(), an {{HDFSDirBasedPartitionSpec}} class could implement 
> {{PartitionSpec}}, store a top-level directory, and return Partition 
> instances from sub-directory names, while storing a single StorageDescriptor 
> for all of them.
> Similarly, list_partitions() could return a List<PartitionSpec>, where each 
> PartitionSpec corresponds to a set of partitions that can share a 
> StorageDescriptor.
> By exposing iterator semantics, neither the client no
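The shared-StorageDescriptor idea above can be sketched like this (all names here are illustrative, not the committed Hive API): the descriptor is stored once for the whole spec, and each partition carries only its values, reconstructing its full location on demand.

```java
import java.util.*;

// Hypothetical sketch of an HDFS-directory-based partition spec: one root
// location shared by all partitions, per-partition values stored lightly.
public class DirBasedPartitionSpec {
    private final String sharedRootLocation;          // e.g. /feeds/search/20140601
    private final List<List<String>> partValues = new ArrayList<>();

    DirBasedPartitionSpec(String rootLocation) {
        this.sharedRootLocation = rootLocation;
    }

    void addPartition(List<String> values) {
        partValues.add(values);
    }

    // Full location is derived from the shared root, not stored per partition.
    String locationOf(int i) {
        return sharedRootLocation + "/" + String.join("/", partValues.get(i));
    }

    Iterator<List<String>> getPartValueIter() {
        return partValues.iterator();
    }

    public static void main(String[] args) {
        DirBasedPartitionSpec spec = new DirBasedPartitionSpec("/feeds/search/20140601");
        spec.addPartition(Arrays.asList("US"));
        spec.addPartition(Arrays.asList("UK"));
        System.out.println(spec.locationOf(0));
        System.out.println(spec.locationOf(1));
    }
}
```

The memory win is that N partitions share one descriptor instead of carrying N copies, which is exactly what makes the thrift payload lighter for large date partitions.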

[jira] [Updated] (HIVE-7613) Research optimization of auto convert join to map join [Spark branch]

2014-08-18 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-7613:
---

Summary: Research optimization of auto convert join to map join [Spark 
branch]  (was: Research optimization of auto convert join to map join[Spark 
branch])

> Research optimization of auto convert join to map join [Spark branch]
> -
>
> Key: HIVE-7613
> URL: https://issues.apache.org/jira/browse/HIVE-7613
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Priority: Minor
>
> ConvertJoinMapJoin is an optimization that replaces a common join (aka shuffle 
> join) with a map join (aka broadcast or fragment replicate join) when 
> possible. We need to research how to make it work with Hive on Spark.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7767) hive.optimize.union.remove does not work properly [Spark Branch]

2014-08-18 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-7767:
---

Description: 
Turning on the hive.optimize.union.remove property generates an incorrect union 
all result. 

For Example:
{noformat}
create table inputTbl1(key string, val string) stored as textfile;
load data local inpath '../../data/files/T1.txt' into table inputTbl1;
SELECT *
FROM (
  SELECT key, count(1) as values from inputTbl1 group by key
  UNION ALL
  SELECT key, count(1) as values from inputTbl1 group by key
) a;  
{noformat}
When hive.optimize.union.remove is turned on, the query result is: 
{noformat}
1   1
2   1
3   1
7   1
8   2
{noformat}

When hive.optimize.union.remove is turned off, the query result is: 
{noformat}
7   1
2   1
8   2
3   1
1   1
7   1
2   1
8   2
3   1
1   1
{noformat}
The expected query result is:
{noformat}
7   1
2   1
8   2
3   1
1   1
7   1
2   1
8   2
3   1
1   1
{noformat}

  was:
Turing on the hive.optimize.union.remove property generates wrong union all 
result. 

For Example:
create table inputTbl1(key string, val string) stored as textfile;
load data local inpath '../../data/files/T1.txt' into table inputTbl1;
SELECT *
FROM (
  SELECT key, count(1) as values from inputTbl1 group by key
  UNION ALL
  SELECT key, count(1) as values from inputTbl1 group by key
) a;  

when the hive.optimize.union.remove is turned on, the query result is like: 
1   1
2   1
3   1
7   1
8   2

when the hive.optimize.union.remove is turned off, the query result is like: 
7   1
2   1
8   2
3   1
1   1
7   1
2   1
8   2
3   1
1   1

The expected query result is:
7   1
2   1
8   2
3   1
1   1
7   1
2   1
8   2
3   1
1   1


> hive.optimize.union.remove does not work properly [Spark Branch]
> 
>
> Key: HIVE-7767
> URL: https://issues.apache.org/jira/browse/HIVE-7767
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Na Yang
>Assignee: Na Yang
>
> Turning on the hive.optimize.union.remove property generates an incorrect union 
> all result. 
> For Example:
> {noformat}
> create table inputTbl1(key string, val string) stored as textfile;
> load data local inpath '../../data/files/T1.txt' into table inputTbl1;
> SELECT *
> FROM (
>   SELECT key, count(1) as values from inputTbl1 group by key
>   UNION ALL
>   SELECT key, count(1) as values from inputTbl1 group by key
> ) a;  
> {noformat}
> When hive.optimize.union.remove is turned on, the query result is: 
> {noformat}
> 1 1
> 2 1
> 3 1
> 7 1
> 8 2
> {noformat}
> When hive.optimize.union.remove is turned off, the query result is: 
> {noformat}
> 7 1
> 2 1
> 8 2
> 3 1
> 1 1
> 7 1
> 2 1
> 8 2
> 3 1
> 1 1
> {noformat}
> The expected query result is:
> {noformat}
> 7 1
> 2 1
> 8 2
> 3 1
> 1 1
> 7 1
> 2 1
> 8 2
> 3 1
> 1 1
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7763) Failed to qeury TABLESAMPLE on empty bucket table [Spark Branch]

2014-08-18 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-7763:
---

Fix Version/s: spark-branch

> Failed to qeury TABLESAMPLE on empty bucket table [Spark Branch]
> 
>
> Key: HIVE-7763
> URL: https://issues.apache.org/jira/browse/HIVE-7763
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
> Fix For: spark-branch
>
> Attachments: HIVE-7763.1-spark.patch
>
>
> Got the following exception:
> {noformat}
> 2014-08-18 16:23:15,213 ERROR [Executor task launch worker-0]: 
> executor.Executor (Logging.scala:logError(96)) - Exception in task 0.0 in 
> stage 1.0 (TID 0)
> java.lang.RuntimeException: Map operator initialization failed
> at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.init(SparkMapRecordHandler.java:127)
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunction.call(HiveMapFunction.java:52)
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunction.call(HiveMapFunction.java:30)
> at 
> org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:164)
> at 
> org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:164)
> at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596)
> at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596)
> at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
> at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
> at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
> at org.apache.spark.scheduler.Task.run(Task.scala:54)
> at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:199)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:722)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Configuration and input 
> path are inconsistent
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:404)
> at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.init(SparkMapRecordHandler.java:93)
> ... 16 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Configuration 
> and input path are inconsistent
> at org.apache.hadoop.hive.ql.exec
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7763) Failed to query TABLESAMPLE on empty bucket table [Spark Branch]

2014-08-18 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-7763:
---

Summary: Failed to query TABLESAMPLE on empty bucket table [Spark Branch]  
(was: Failed to qeury TABLESAMPLE on empty bucket table [Spark Branch])

> Failed to query TABLESAMPLE on empty bucket table [Spark Branch]
> 
>
> Key: HIVE-7763
> URL: https://issues.apache.org/jira/browse/HIVE-7763
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
> Fix For: spark-branch
>
> Attachments: HIVE-7763.1-spark.patch
>
>
> Got the following exception:
> {noformat}
> 2014-08-18 16:23:15,213 ERROR [Executor task launch worker-0]: 
> executor.Executor (Logging.scala:logError(96)) - Exception in task 0.0 in 
> stage 1.0 (TID 0)
> java.lang.RuntimeException: Map operator initialization failed
> at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.init(SparkMapRecordHandler.java:127)
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunction.call(HiveMapFunction.java:52)
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunction.call(HiveMapFunction.java:30)
> at 
> org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:164)
> at 
> org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:164)
> at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596)
> at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596)
> at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
> at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
> at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
> at org.apache.spark.scheduler.Task.run(Task.scala:54)
> at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:199)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:722)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Configuration and input 
> path are inconsistent
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:404)
> at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.init(SparkMapRecordHandler.java:93)
> ... 16 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Configuration 
> and input path are inconsistent
> at org.apache.hadoop.hive.ql.exec
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HIVE-7334) Create SparkShuffler, shuffling data between map-side data processing and reduce-side processing [Spark Branch]

2014-08-18 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland resolved HIVE-7334.


   Resolution: Fixed
Fix Version/s: spark-branch

[~lirui] as per your comments I am resolving this since HIVE-7528 is resolved. 
If anyone disagrees, please re-open.

> Create SparkShuffler, shuffling data between map-side data processing and 
> reduce-side processing [Spark Branch]
> ---
>
> Key: HIVE-7334
> URL: https://issues.apache.org/jira/browse/HIVE-7334
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Xuefu Zhang
>Assignee: Rui Li
> Fix For: spark-branch
>
> Attachments: HIVE-7334.patch
>
>
> Please refer to the design spec.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7552) Collect spark job statistic through spark metrics [Spark Branch]

2014-08-18 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-7552:
---

Summary: Collect spark job statistic through spark metrics [Spark Branch]  
(was: Collect spark job statistic through spark metrics[Spark Branch])

> Collect spark job statistic through spark metrics [Spark Branch]
> 
>
> Key: HIVE-7552
> URL: https://issues.apache.org/jira/browse/HIVE-7552
> Project: Hive
>  Issue Type: New Feature
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Na Yang
>
> MR/Tez use counters to collect job statistics, while Spark does not use 
> accumulators for this purpose. Instead, Spark stores task metrics in 
> TaskMetrics and sends them back to the scheduler. We could gather Spark 
> job statistics by combining all TaskMetrics with a SparkListener.
> NO PRECOMMIT TESTS. This is for spark-branch only.
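The "combine all TaskMetrics" step can be sketched in plain Java (no Spark dependency; the class and field names below are illustrative stand-ins, not Spark's actual TaskMetrics fields): a listener-style callback is invoked once per finished task and accumulates per-task metrics into job-level totals, which is the role a SparkListener's task-end handler would play.

```java
// Hypothetical sketch of aggregating per-task metrics into job statistics.
public class JobMetrics {
    // Stand-in for the per-task metrics object the scheduler reports.
    static class TaskMetrics {
        final long recordsRead, bytesRead;
        TaskMetrics(long recordsRead, long bytesRead) {
            this.recordsRead = recordsRead;
            this.bytesRead = bytesRead;
        }
    }

    private long totalRecords, totalBytes;

    // Called once per finished task, like a listener's onTaskEnd callback.
    void onTaskEnd(TaskMetrics m) {
        totalRecords += m.recordsRead;
        totalBytes   += m.bytesRead;
    }

    long totalRecords() { return totalRecords; }
    long totalBytes()   { return totalBytes; }

    public static void main(String[] args) {
        JobMetrics jm = new JobMetrics();
        jm.onTaskEnd(new TaskMetrics(10, 1024));
        jm.onTaskEnd(new TaskMetrics(5, 512));
        System.out.println(jm.totalRecords() + " records, " + jm.totalBytes() + " bytes");
    }
}
```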



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7551) expand spark accumulator to support hive counter [Spark Branch]

2014-08-18 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-7551:
---

Summary: expand spark accumulator  to support hive counter [Spark Branch]  
(was: expand spark accumulator  to support hive counter)

> expand spark accumulator  to support hive counter [Spark Branch]
> 
>
> Key: HIVE-7551
> URL: https://issues.apache.org/jira/browse/HIVE-7551
> Project: Hive
>  Issue Type: New Feature
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Na Yang
>
> Hive collects some operator statistics through counters; we need to support 
> the MR/Tez counter counterpart through Spark accumulators.
> NO PRECOMMIT TESTS. This is for spark branch only.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7528) Support cluster by and distributed by [Spark Branch]

2014-08-18 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101778#comment-14101778
 ] 

Brock Noland commented on HIVE-7528:


Thank you!!

> Support cluster by and distributed by [Spark Branch]
> 
>
> Key: HIVE-7528
> URL: https://issues.apache.org/jira/browse/HIVE-7528
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Rui Li
> Fix For: spark-branch
>
> Attachments: HIVE-7528.1-spark.patch, HIVE-7528.spark.patch
>
>
> clustered by = distributed by + sort by, so this is related to HIVE-7527. If 
> sort by is in place, the assumption is that we don't need to do anything 
> about distributed by or clustered by. Still, we need to confirm and verify.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7774) Issues with location path for temporary external tables

2014-08-18 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-7774:
-

Attachment: HIVE-7774.1.patch

Attaching patch. MiniMR does not seem to catch this issue for some reason, 
so I am not updating the temp_table_external test.

> Issues with location path for temporary external tables
> ---
>
> Key: HIVE-7774
> URL: https://issues.apache.org/jira/browse/HIVE-7774
> Project: Hive
>  Issue Type: Bug
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-7774.1.patch
>
>
> Depending on the location string passed to a temporary external table, a query 
> requiring a map/reduce job will fail.  Example:
> {noformat}
> create temporary external table tmp1 (c1 string) location '/tmp/tmp1';
> describe extended tmp1;
> select count(*) from tmp1;
> {noformat}
> Will result in the following error:
> {noformat}
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: Error in configuring object
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
>   at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
>   ... 9 more
> Caused by: java.lang.RuntimeException: Error in configuring object
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
>   at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
>   at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
>   ... 14 more
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
>   ... 17 more
> Caused by: java.lang.RuntimeException: Map operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:154)
>   ... 22 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Configuration and input 
> path are inconsistent
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:404)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:123)
>   ... 22 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Configuration 
> and input path are inconsistent
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:398)
>   ... 23 more
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> {noformat}
> If the location is set to 'hdfs:/tmp/tmp1', it gets the following error:
> {noformat}
> java.io.IOException: cannot find dir = 
> hdfs://node-1.example.com:8020/tmp/tmp1/tmp1.txt in pathToPartitionInfo: 
> [hdfs:/tmp/tmp1]
>   at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getPartitionDescFromPathRecursively(HiveFileFormatUtils.java:344)
>   at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getPartitionDescFromPathRecursively(HiveFileFormatUtils.java:306)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat$CombineHiveInputSplit.(CombineHiveInputFormat.java:108)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:455)
>   at 
> org.apache.h

[jira] [Updated] (HIVE-7774) Issues with location path for temporary external tables

2014-08-18 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-7774:
-

Status: Patch Available  (was: Open)

> Issues with location path for temporary external tables
> ---
>
> Key: HIVE-7774
> URL: https://issues.apache.org/jira/browse/HIVE-7774
> Project: Hive
>  Issue Type: Bug
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-7774.1.patch
>
>
> Depending on the location string passed to a temporary external table, a query 
> requiring a map/reduce job will fail.  Example:
> {noformat}
> create temporary external table tmp1 (c1 string) location '/tmp/tmp1';
> describe extended tmp1;
> select count(*) from tmp1;
> {noformat}
> Will result in the following error:
> {noformat}
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: Error in configuring object
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
>   at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
>   ... 9 more
> Caused by: java.lang.RuntimeException: Error in configuring object
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
>   at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
>   at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
>   ... 14 more
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
>   ... 17 more
> Caused by: java.lang.RuntimeException: Map operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:154)
>   ... 22 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Configuration and input 
> path are inconsistent
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:404)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:123)
>   ... 22 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Configuration 
> and input path are inconsistent
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:398)
>   ... 23 more
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> {noformat}
> If the location is set to 'hdfs:/tmp/tmp1', it gets the following error:
> {noformat}
> java.io.IOException: cannot find dir = 
> hdfs://node-1.example.com:8020/tmp/tmp1/tmp1.txt in pathToPartitionInfo: 
> [hdfs:/tmp/tmp1]
>   at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getPartitionDescFromPathRecursively(HiveFileFormatUtils.java:344)
>   at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getPartitionDescFromPathRecursively(HiveFileFormatUtils.java:306)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat$CombineHiveInputSplit.(CombineHiveInputFormat.java:108)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:455)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:520)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeSpli

[jira] [Commented] (HIVE-7774) Issues with location path for temporary external tables

2014-08-18 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101773#comment-14101773
 ] 

Jason Dere commented on HIVE-7774:
--

For non-temp tables (and temp non-external tables), it looks like 
Warehouse.getDnsPath() is used to properly format the table location string 
before the location is saved. This needs to be done for temp external tables as 
well.
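The failure mode can be illustrated with a small qualification sketch (this is not Warehouse.getDnsPath itself, just an assumption-laden illustration using java.net.URI): a scheme-less path like '/tmp/tmp1', or an authority-less URI like 'hdfs:/tmp/tmp1', will never string-match the fully qualified paths that the input format later looks up in pathToPartitionInfo, so both must be qualified against the default filesystem before being saved.

```java
import java.net.URI;

// Hypothetical sketch: fill in a missing scheme and/or authority from the
// default filesystem URI, so stored locations match fully qualified paths.
public class QualifyLocation {
    static String qualify(String location, String defaultFs) {
        URI uri = URI.create(location);
        URI fs = URI.create(defaultFs); // e.g. hdfs://node-1.example.com:8020
        String scheme = uri.getScheme() != null ? uri.getScheme() : fs.getScheme();
        String authority = uri.getAuthority() != null ? uri.getAuthority() : fs.getAuthority();
        return scheme + "://" + authority + uri.getPath();
    }

    public static void main(String[] args) {
        String fs = "hdfs://node-1.example.com:8020";
        System.out.println(qualify("/tmp/tmp1", fs));      // no scheme: both filled in
        System.out.println(qualify("hdfs:/tmp/tmp1", fs)); // no authority: filled in
    }
}
```

Both inputs from the bug report normalize to the same fully qualified location, which is what avoids the "Configuration and input path are inconsistent" mismatch.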

> Issues with location path for temporary external tables
> ---
>
> Key: HIVE-7774
> URL: https://issues.apache.org/jira/browse/HIVE-7774
> Project: Hive
>  Issue Type: Bug
>Reporter: Jason Dere
>Assignee: Jason Dere
>
> Depending on the location string passed to a temporary external table, a query 
> requiring a map/reduce job will fail.  Example:
> {noformat}
> create temporary external table tmp1 (c1 string) location '/tmp/tmp1';
> describe extended tmp1;
> select count(*) from tmp1;
> {noformat}
> Will result in the following error:
> {noformat}
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: Error in configuring object
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
>   at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
>   ... 9 more
> Caused by: java.lang.RuntimeException: Error in configuring object
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
>   at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
>   at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
>   ... 14 more
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
>   ... 17 more
> Caused by: java.lang.RuntimeException: Map operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:154)
>   ... 22 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Configuration and input 
> path are inconsistent
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:404)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:123)
>   ... 22 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Configuration 
> and input path are inconsistent
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:398)
>   ... 23 more
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> {noformat}
> If the location is set to 'hdfs:/tmp/tmp1', it gets the following error:
> {noformat}
> java.io.IOException: cannot find dir = 
> hdfs://node-1.example.com:8020/tmp/tmp1/tmp1.txt in pathToPartitionInfo: 
> [hdfs:/tmp/tmp1]
>   at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getPartitionDescFromPathRecursively(HiveFileFormatUtils.java:344)
>   at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getPartitionDescFromPathRecursively(HiveFileFormatUtils.java:306)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat$CombineHiveInputSplit.(CombineHiveInputFormat.java:108)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFor

Re: Review Request 24834: HIVE-7771: ORC PPD fails for some decimal predicates

2014-08-18 Thread j . prasanth . j

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24834/#review50955
---



ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java


No, it is not an immediate return. It just takes the underlying object value 
from the constant expression node to make it suitable for type conversion.

This type coercion is necessary to match the types of the column stats object 
and the predicate object. In some cases, the predicate object's type is embedded within 
ExprNodeConstantDesc, which is why we need to extract the actual value out of it.
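As a rough illustration of the step being discussed — unwrap the constant node first, then coerce the raw literal to the stats type — here is a Python sketch with a hypothetical stand-in for ExprNodeConstantDesc (Hive's actual code lives in RecordReaderImpl.java and differs in detail):

```python
from decimal import Decimal

class ExprNodeConstantDescLike:
    """Hypothetical stand-in for Hive's ExprNodeConstantDesc: a wrapper
    around a literal value taken from the query text."""
    def __init__(self, value):
        self.value = value

    def get_value(self):
        return self.value

def base_object_for_comparison(predicate_obj, stats_obj):
    """Unwrap the constant node (not an early return), then coerce the
    raw literal to the column-stats type so min/max comparisons for
    predicate pushdown are well defined. Simplified sketch only."""
    if isinstance(predicate_obj, ExprNodeConstantDescLike):
        predicate_obj = predicate_obj.get_value()  # take underlying value
    if isinstance(stats_obj, Decimal):
        return Decimal(str(predicate_obj))  # match the stats object's type
    return predicate_obj
```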


- Prasanth_J


On Aug. 19, 2014, 1:32 a.m., Prasanth_J wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24834/
> ---
> 
> (Updated Aug. 19, 2014, 1:32 a.m.)
> 
> 
> Review request for hive, Gopal V and Gunther Hagleitner.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Some queries like 
> {code}
> select * from table where dcol<=11.22BD;
> {code}
> fail when ORC predicate pushdown is enabled.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java f5023bb 
>   ql/src/test/queries/clientpositive/orc_ppd_decimal.q a93590e 
>   ql/src/test/results/clientpositive/orc_ppd_decimal.q.out 0c11ea8 
> 
> Diff: https://reviews.apache.org/r/24834/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Prasanth_J
> 
>



[jira] [Created] (HIVE-7774) Issues with location path for temporary external tables

2014-08-18 Thread Jason Dere (JIRA)
Jason Dere created HIVE-7774:


 Summary: Issues with location path for temporary external tables
 Key: HIVE-7774
 URL: https://issues.apache.org/jira/browse/HIVE-7774
 Project: Hive
  Issue Type: Bug
Reporter: Jason Dere
Assignee: Jason Dere


Depending on the location string passed into a temp external table, a query 
requiring a map/reduce job will fail. Example:

{noformat}
create temporary external table tmp1 (c1 string) location '/tmp/tmp1';
describe extended tmp1;
select count(*) from tmp1;
{noformat}

Will result in the following error:
{noformat}
Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: Error in configuring object
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
at 
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
at 
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
at 
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
at 
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
... 14 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
... 17 more
Caused by: java.lang.RuntimeException: Map operator initialization failed
at 
org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:154)
... 22 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Configuration and input path 
are inconsistent
at 
org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:404)
at 
org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:123)
... 22 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Configuration and 
input path are inconsistent
at 
org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:398)
... 23 more


FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.mr.MapRedTask
{noformat}

If the location is set to 'hdfs:/tmp/tmp1', it gets the following error:
{noformat}
java.io.IOException: cannot find dir = 
hdfs://node-1.example.com:8020/tmp/tmp1/tmp1.txt in pathToPartitionInfo: 
[hdfs:/tmp/tmp1]
at 
org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getPartitionDescFromPathRecursively(HiveFileFormatUtils.java:344)
at 
org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getPartitionDescFromPathRecursively(HiveFileFormatUtils.java:306)
at 
org.apache.hadoop.hive.ql.io.CombineHiveInputFormat$CombineHiveInputSplit.(CombineHiveInputFormat.java:108)
at 
org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:455)
at 
org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:520)
at 
org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:512)
at 
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:394)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.S

[jira] [Commented] (HIVE-7734) Join stats annotation rule is not updating columns statistics correctly

2014-08-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101769#comment-14101769
 ] 

Hive QA commented on HIVE-7734:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12662642/HIVE-7734.2.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5819 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.pig.TestOrcHCatLoader.testReadDataPrimitiveTypes
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/390/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/390/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-390/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12662642

> Join stats annotation rule is not updating columns statistics correctly
> ---
>
> Key: HIVE-7734
> URL: https://issues.apache.org/jira/browse/HIVE-7734
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor, Statistics
>Affects Versions: 0.14.0
>Reporter: Prasanth J
>Assignee: Prasanth J
> Fix For: 0.13.0
>
> Attachments: HIVE-7734.1.patch, HIVE-7734.2.patch
>
>
> HIVE-7679 is not doing the correct thing. The scale-down/up factor used to update 
> column stats was wrong, as ratio = newRowCount/oldRowCount is always infinite 
> (oldRowCount = 0). The old row count should be retrieved from the parent 
> corresponding to the current column whose statistics are being updated.
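The fix described above can be sketched in a few lines; the function name and the exact scaling policy are illustrative and do not reproduce Hive's actual stats-annotation code:

```python
def updated_ndv(ndv, new_row_count, parent_row_count):
    """Scale a column's number of distinct values when the row count
    changes across a join. The ratio must use the row count of the
    parent operator that supplied the column; using a zeroed 'old'
    count makes the ratio infinite, which is the bug described above.
    Illustrative sketch only."""
    if parent_row_count <= 0:
        return ndv  # guard against the divide-by-zero behind the bug
    ratio = new_row_count / parent_row_count
    # NDV can shrink when rows are filtered out, but never grow past
    # the parent's distinct-value count.
    return min(ndv, round(ndv * ratio)) if ratio < 1 else ndv
```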



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7068) Integrate AccumuloStorageHandler

2014-08-18 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101754#comment-14101754
 ] 

Navis commented on HIVE-7068:
-

Confirmed that the build and tests run fine in my local env (on hadoop-1 and hadoop-2). 
I'm +1 on this.

> Integrate AccumuloStorageHandler
> 
>
> Key: HIVE-7068
> URL: https://issues.apache.org/jira/browse/HIVE-7068
> Project: Hive
>  Issue Type: New Feature
>Reporter: Josh Elser
>Assignee: Josh Elser
> Fix For: 0.14.0
>
> Attachments: HIVE-7068.1.patch, HIVE-7068.2.patch, HIVE-7068.3.patch, 
> HIVE-7068.4.patch
>
>
> [Accumulo|http://accumulo.apache.org] is a BigTable clone similar to HBase. 
> Some [initial 
> work|https://github.com/bfemiano/accumulo-hive-storage-manager] has already 
> been done to support querying an Accumulo table using Hive. It is not a 
> complete solution; most notably, the current implementation lacks support 
> for INSERTs.
> I would like to polish up the AccumuloStorageHandler (presently based on 
> 0.10), implement missing basic functionality and compare it to the 
> HBaseStorageHandler (to ensure that we follow the same general usage 
> patterns).
> I've also been in communication with [~bfem] (the initial author) who 
> expressed interest in working on this again. I hope to coordinate efforts 
> with him.





[jira] [Created] (HIVE-7773) Union all query finished with errors [Spark Branch]

2014-08-18 Thread Rui Li (JIRA)
Rui Li created HIVE-7773:


 Summary: Union all query finished with errors [Spark Branch]
 Key: HIVE-7773
 URL: https://issues.apache.org/jira/browse/HIVE-7773
 Project: Hive
  Issue Type: Bug
  Components: Spark
Reporter: Rui Li
Priority: Critical


When I ran a union all query, I found the following error in the Spark log (the 
query finished with correct results, though):
{noformat}
java.lang.RuntimeException: Map operator initialization failed
at 
org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.init(SparkMapRecordHandler.java:127)
at 
org.apache.hadoop.hive.ql.exec.spark.HiveMapFunction.call(HiveMapFunction.java:52)
at 
org.apache.hadoop.hive.ql.exec.spark.HiveMapFunction.call(HiveMapFunction.java:30)
at 
org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:164)
at 
org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:164)
at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596)
at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596)
at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:54)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:199)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Configuration and input path 
are inconsistent
at 
org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:404)
at 
org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.init(SparkMapRecordHandler.java:93)
... 16 more
{noformat}
Judging from the log, I think we don't properly handle the input paths when 
cloning the job conf, so this may also affect other queries with multiple map or 
reduce stages.
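A minimal sketch of the suspected fix — give each cloned conf its own input paths — using plain dicts in place of Hadoop's JobConf (all keys and names here are illustrative, not Hive's actual configuration properties):

```python
import copy

def clone_conf_for_map_work(base_conf, map_work_name, input_paths):
    """When one Spark job runs several map works (e.g. the two sides of
    a UNION ALL), each cloned conf must carry only its own input paths;
    otherwise MapOperator can see a path that is missing from its
    pathToPartitionInfo and raise 'Configuration and input path are
    inconsistent'. Illustrative sketch only."""
    conf = copy.deepcopy(base_conf)          # never mutate the shared conf
    conf["mapred.input.dir"] = ",".join(input_paths)
    conf["hive.map.work.name"] = map_work_name
    return conf
```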





[jira] [Commented] (HIVE-7683) Test TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx is still failing

2014-08-18 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101719#comment-14101719
 ] 

Ashutosh Chauhan commented on HIVE-7683:


+1

> Test TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx is still failing
> --
>
> Key: HIVE-7683
> URL: https://issues.apache.org/jira/browse/HIVE-7683
> Project: Hive
>  Issue Type: Bug
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-7683.1.patch.txt
>
>
> NO PRECOMMIT TESTS
> As commented in HIVE-7415, counter stat fails sometimes in the test (see 
> http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/257/testReport/org.apache.hadoop.hive.cli/TestMinimrCliDriver/testCliDriver_ql_rewrite_gbtoidx).
>  Let's try other stat collector and see the test result.





[jira] [Updated] (HIVE-7683) Test TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx is still failing

2014-08-18 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7683:


Status: Patch Available  (was: Open)

> Test TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx is still failing
> --
>
> Key: HIVE-7683
> URL: https://issues.apache.org/jira/browse/HIVE-7683
> Project: Hive
>  Issue Type: Bug
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-7683.1.patch.txt
>
>
> NO PRECOMMIT TESTS
> As commented in HIVE-7415, counter stat fails sometimes in the test (see 
> http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/257/testReport/org.apache.hadoop.hive.cli/TestMinimrCliDriver/testCliDriver_ql_rewrite_gbtoidx).
>  Let's try other stat collector and see the test result.





[jira] [Updated] (HIVE-7683) Test TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx is still failing

2014-08-18 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7683:


Description: 
NO PRECOMMIT TESTS

As commented in HIVE-7415, counter stat fails sometimes in the test (see 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/257/testReport/org.apache.hadoop.hive.cli/TestMinimrCliDriver/testCliDriver_ql_rewrite_gbtoidx).
 Let's try other stat collector and see the test result.


  was:
As commented in HIVE-7415, counter stat fails sometimes in the test (see 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/257/testReport/org.apache.hadoop.hive.cli/TestMinimrCliDriver/testCliDriver_ql_rewrite_gbtoidx).
 Let's try other stat collector and see the test result.



> Test TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx is still failing
> --
>
> Key: HIVE-7683
> URL: https://issues.apache.org/jira/browse/HIVE-7683
> Project: Hive
>  Issue Type: Bug
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-7683.1.patch.txt
>
>
> NO PRECOMMIT TESTS
> As commented in HIVE-7415, counter stat fails sometimes in the test (see 
> http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/257/testReport/org.apache.hadoop.hive.cli/TestMinimrCliDriver/testCliDriver_ql_rewrite_gbtoidx).
>  Let's try other stat collector and see the test result.





[jira] [Updated] (HIVE-7683) Test TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx is still failing

2014-08-18 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7683:


Attachment: HIVE-7683.1.patch.txt

> Test TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx is still failing
> --
>
> Key: HIVE-7683
> URL: https://issues.apache.org/jira/browse/HIVE-7683
> Project: Hive
>  Issue Type: Bug
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-7683.1.patch.txt
>
>
> NO PRECOMMIT TESTS
> As commented in HIVE-7415, counter stat fails sometimes in the test (see 
> http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/257/testReport/org.apache.hadoop.hive.cli/TestMinimrCliDriver/testCliDriver_ql_rewrite_gbtoidx).
>  Let's try other stat collector and see the test result.





[jira] [Updated] (HIVE-7715) CBO:Support Union All

2014-08-18 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-7715:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to branch. Thanks [~jpullokkaran]!

> CBO:Support Union All
> -
>
> Key: HIVE-7715
> URL: https://issues.apache.org/jira/browse/HIVE-7715
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Laljo John Pullokkaran
>Assignee: Laljo John Pullokkaran
> Attachments: HIVE-7715.patch
>
>






Re: Review Request 24770: Improve the columns stats update speed

2014-08-18 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24770/#review50948
---



metastore/if/hive_metastore.thrift


Do you really need dbName and tblName in this struct definition? Since 
ColumnStatistics already contains this (via ColumnStatisticsDesc), it is a 
repetition of info.
Further, since ColumnStatistics contains a list of ColStatsObj, I don't think 
you need a List either. Seems like:

struct SetPartitionsStatsRequest {
1: required ColumnStatistics
}

is all you need.

Further, it actually seems like the current API 
update_partition_column_statistics() already has this signature, so you 
probably don't need a new API at all.



metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java


Seems like no one is using this method.



metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java


Can you add some comments here explaining what this method is doing?



metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java


If fqr.get(0) == null, why is it OK to have csid = 1?



metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java


Use extractSqlLong() method for this.



metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java


We want to avoid doing O(# of columns) queries on the RDBMS, if possible. 
Reconsider this for loop.



metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java


Look in TxnHandler.java, where we do mass upsert queries without doing a 
delete followed by an insert. That logic looks cleaner to me.
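The update-then-insert pattern the comment points to can be sketched with stdlib sqlite3; the table and column names below are invented for illustration and are not the metastore schema:

```python
import sqlite3

def upsert_col_stats(conn, part_id, col_name, ndv):
    """Try the UPDATE first and only INSERT when no row matched,
    instead of an unconditional DELETE followed by INSERT."""
    cur = conn.execute(
        "UPDATE part_col_stats SET ndv = ? WHERE part_id = ? AND col_name = ?",
        (ndv, part_id, col_name))
    if cur.rowcount == 0:  # no existing row: fall back to insert
        conn.execute(
            "INSERT INTO part_col_stats (part_id, col_name, ndv) VALUES (?, ?, ?)",
            (part_id, col_name, ndv))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE part_col_stats (part_id INTEGER, col_name TEXT, ndv INTEGER)")
upsert_col_stats(conn, 1, "c1", 10)   # first call inserts
upsert_col_stats(conn, 1, "c1", 25)   # second call updates in place
```

This avoids both the delete/insert churn and the window where the row is briefly absent.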



ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java


Why only the first row? Each row corresponds to a partition; we want to 
persist stats for all partitions in one call, don't we? And the way it's written 
right now, it will actually ignore all but the first partition. I think this is 
the reason for the failing colstats_part_lvl test in the Hive QA run.


Can you add some .q tests for this, where we run the analyze statement first for 
a table with no stats and then with stats?

- Ashutosh Chauhan


On Aug. 16, 2014, 6:37 p.m., pengcheng xiong wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24770/
> ---
> 
> (Updated Aug. 16, 2014, 6:37 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> improve the columns stats update speed for all the partitions of a table
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 7f4afd9 
>   metastore/if/hive_metastore.thrift cb326f4 
>   metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 0e328dd 
>   metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp 23b5edf 
>   metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp 
> 4bcb2e6 
>   metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h 9f583a4 
>   metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp 4768128 
>   
> metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/AddPartitionsRequest.java
>  96caab6 
>   
> metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/AddPartitionsResult.java
>  ba65da6 
>   
> metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/DropPartitionsResult.java
>  87444d2 
>   
> metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Function.java
>  813d5f5 
>   
> metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/GetOpenTxnsInfoResponse.java
>  5d3bf75 
>   
> metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/GetOpenTxnsResponse.java
>  b938d7d 
>   
> metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/HeartbeatTxnRangeResponse.java
>  49f4e56 
>   
> metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/LockRequest.java
>  f860028 
>   
> metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/OpenTxnsResponse.java
>  1a99948 
>   
> metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/RequestPartsSpec.java
>  217a3c1 
>   
> metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/SetPartitionsStatsRequest.java
>  PRE-CREATION 
>   
> metastore/src/gen/thrift/

Re: Review Request 24834: HIVE-7771: ORC PPD fails for some decimal predicates

2014-08-18 Thread Gopal V

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24834/#review50950
---



ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java


Does this imply an immediate return?

Is there a reason we go through the type coercion codepaths after getting 
the value out of the object?


- Gopal V


On Aug. 19, 2014, 1:32 a.m., Prasanth_J wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24834/
> ---
> 
> (Updated Aug. 19, 2014, 1:32 a.m.)
> 
> 
> Review request for hive, Gopal V and Gunther Hagleitner.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Some queries like 
> {code}
> select * from table where dcol<=11.22BD;
> {code}
> fail when ORC predicate pushdown is enabled.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java f5023bb 
>   ql/src/test/queries/clientpositive/orc_ppd_decimal.q a93590e 
>   ql/src/test/results/clientpositive/orc_ppd_decimal.q.out 0c11ea8 
> 
> Diff: https://reviews.apache.org/r/24834/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Prasanth_J
> 
>



[jira] [Commented] (HIVE-4788) RCFile and bzip2 compression not working

2014-08-18 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101699#comment-14101699
 ] 

Navis commented on HIVE-4788:
-

This is a really simple patch. Could anyone review it?

> RCFile and bzip2 compression not working
> 
>
> Key: HIVE-4788
> URL: https://issues.apache.org/jira/browse/HIVE-4788
> Project: Hive
>  Issue Type: Bug
>  Components: Compression
>Affects Versions: 0.10.0
> Environment: CDH4.2
>Reporter: Johndee Burks
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-4788.1.patch.txt, HIVE-4788.2.patch.txt
>
>
> The issue is that Bzip2-compressed RCFile data encounters an error when 
> queried, even with the simplest query ("select *"). The issue is easily 
> reproducible using the following. 
> Create a table and load the sample data below. 
> DDL: create table source_data (a string, b string) row format delimited 
> fields terminated by ',';
> Sample data: 
> apple,sauce 
> Test: 
> Do the following and you should receive the error listed below for the rcfile 
> table with bz2 compression. 
> create table rc_nobz2 (a string, b string) stored as rcfile; 
> insert into table rc_nobz2 select * from source_txt; 
> SET io.seqfile.compression.type=BLOCK; 
> SET hive.exec.compress.output=true; 
> SET mapred.compress.map.output=true; 
> SET mapred.output.compress=true; 
> SET mapred.output.compression.codec=org.apache.hadoop.io.compress.BZip2Codec; 
> create table rc_bz2 (a string, b string) stored as rcfile; 
> insert into table rc_bz2 select * from source_txt; 
> hive> select * from rc_bz2; 
> Failed with exception java.io.IOException:java.io.IOException: Stream is not 
> BZip2 formatted: expected 'h' as first byte but got '�' 
> hive> select * from rc_nobz2; 
> apple sauce





[jira] [Updated] (HIVE-5799) session/operation timeout for hiveserver2

2014-08-18 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-5799:


Attachment: HIVE-5799.11.patch.txt

Fixed test fails

> session/operation timeout for hiveserver2
> -
>
> Key: HIVE-5799
> URL: https://issues.apache.org/jira/browse/HIVE-5799
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-5799.1.patch.txt, HIVE-5799.10.patch.txt, 
> HIVE-5799.11.patch.txt, HIVE-5799.2.patch.txt, HIVE-5799.3.patch.txt, 
> HIVE-5799.4.patch.txt, HIVE-5799.5.patch.txt, HIVE-5799.6.patch.txt, 
> HIVE-5799.7.patch.txt, HIVE-5799.8.patch.txt, HIVE-5799.9.patch.txt
>
>
> Need some timeout facility for preventing resource leaks from unstable or 
> misbehaving clients.





[jira] [Commented] (HIVE-7528) Support cluster by and distributed by [Spark Branch]

2014-08-18 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101687#comment-14101687
 ] 

Rui Li commented on HIVE-7528:
--

Thanks [~brocknoland] I've created HIVE-7772 for it.

> Support cluster by and distributed by [Spark Branch]
> 
>
> Key: HIVE-7528
> URL: https://issues.apache.org/jira/browse/HIVE-7528
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Rui Li
> Fix For: spark-branch
>
> Attachments: HIVE-7528.1-spark.patch, HIVE-7528.spark.patch
>
>
> cluster by = distribute by + sort by, so this is related to HIVE-7527. If 
> sort by is in place, the assumption is that we don't need to do anything 
> about distribute by or cluster by. Still, we need to confirm and verify.
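The claimed equivalence can be simulated in a few lines of Python, with hash partitioning standing in for Hive's shuffle (a sketch of the semantics, not Hive's implementation):

```python
def distribute_by(rows, key, n_reducers):
    """What DISTRIBUTE BY does: hash-partition rows by the key so all
    rows with the same key land in the same partition (reducer)."""
    parts = [[] for _ in range(n_reducers)]
    for row in rows:
        parts[hash(row[key]) % n_reducers].append(row)
    return parts

def cluster_by(rows, key, n_reducers):
    """CLUSTER BY key == DISTRIBUTE BY key + SORT BY key:
    partition by the key, then sort within each partition."""
    return [sorted(p, key=lambda r: r[key])
            for p in distribute_by(rows, key, n_reducers)]
```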





[jira] [Commented] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-08-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101684#comment-14101684
 ] 

Hive QA commented on HIVE-7405:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12662636/HIVE-7405.8.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 5819 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_coalesce
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_elt
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_casts
org.apache.hive.hcatalog.pig.TestOrcHCatLoader.testReadDataPrimitiveTypes
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/389/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/389/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-389/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12662636

> Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
> --
>
> Key: HIVE-7405
> URL: https://issues.apache.org/jira/browse/HIVE-7405
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Matt McCline
>Assignee: Matt McCline
> Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
> HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
> HIVE-7405.8.patch
>
>
> Vectorize the basic case that does not have any count distinct aggregation.
> Add a 4th processing mode in VectorGroupByOperator for reduce where each 
> input VectorizedRowBatch has only values for one key at a time.  Thus, the 
> values in the batch can be aggregated quickly.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-7772) Add tests for order/sort/distribute/cluster by query [Spark Branch]

2014-08-18 Thread Rui Li (JIRA)
Rui Li created HIVE-7772:


 Summary: Add tests for order/sort/distribute/cluster by query 
[Spark Branch]
 Key: HIVE-7772
 URL: https://issues.apache.org/jira/browse/HIVE-7772
 Project: Hive
  Issue Type: Test
  Components: Spark
Reporter: Rui Li
Assignee: Rui Li


Now that these queries are supported, we should have tests to catch any 
problems we may have.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7624) Reduce operator initialization failed when running multiple MR query on spark

2014-08-18 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101680#comment-14101680
 ] 

Rui Li commented on HIVE-7624:
--

[~brocknoland] Got it, thanks!

> Reduce operator initialization failed when running multiple MR query on spark
> -
>
> Key: HIVE-7624
> URL: https://issues.apache.org/jira/browse/HIVE-7624
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
> Fix For: spark-branch
>
> Attachments: HIVE-7624.2-spark.patch, HIVE-7624.3-spark.patch, 
> HIVE-7624.4-spark.patch, HIVE-7624.5-spark.patch, HIVE-7624.6-spark.patch, 
> HIVE-7624.7-spark.patch, HIVE-7624.patch
>
>
> The following error occurs when I try to run a query with multiple reduce 
> stages (M->R->R):
> {quote}
> 14/08/05 12:17:07 ERROR Executor: Exception in task 0.0 in stage 2.0 (TID 1)
> java.lang.RuntimeException: Reduce operator initialization failed
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:170)
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:53)
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:31)
> at 
> org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:164)
> at 
> org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:164)
> at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596)
> at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596)
> at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
> at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
> at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
> at org.apache.spark.scheduler.Task.run(Task.scala:54)
> at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:199)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.RuntimeException: cannot find field reducesinkkey0 from 
> [0:_col0]
> at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:415)
> at 
> org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:147)
> …
> {quote}
> I suspect we're applying the reduce function in the wrong order.
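For illustration, a query shape that produces multiple reduce stages (M->R->R); this is a hypothetical reproducer over the standard q-test table, not the exact query from the report:

```sql
-- The inner DISTRIBUTE BY/SORT BY forces one shuffle (first reduce stage),
-- and the outer GROUP BY forces a second shuffle (second reduce stage).
SELECT key, COUNT(*)
FROM (SELECT key FROM src DISTRIBUTE BY key SORT BY key) t
GROUP BY key;
```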



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 24834: HIVE-7771: ORC PPD fails for some decimal predicates

2014-08-18 Thread j . prasanth . j

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24834/
---

(Updated Aug. 19, 2014, 1:32 a.m.)


Review request for hive, Gopal V and Gunther Hagleitner.


Changes
---

+Gopal


Repository: hive-git


Description
---

Some queries like 
{code}
select * from table where dcol<=11.22BD;
{code}
fail when ORC predicate pushdown is enabled.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java f5023bb 
  ql/src/test/queries/clientpositive/orc_ppd_decimal.q a93590e 
  ql/src/test/results/clientpositive/orc_ppd_decimal.q.out 0c11ea8 

Diff: https://reviews.apache.org/r/24834/diff/


Testing
---


Thanks,

Prasanth_J



[jira] [Updated] (HIVE-7771) ORC PPD fails for some decimal predicates

2014-08-18 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-7771:
-

Attachment: HIVE-7771.1.patch

> ORC PPD fails for some decimal predicates
> -
>
> Key: HIVE-7771
> URL: https://issues.apache.org/jira/browse/HIVE-7771
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Prasanth J
>Assignee: Prasanth J
> Attachments: HIVE-7771.1.patch
>
>
> Some queries like 
> {code}
> select * from table where dcol<=11.22BD;
> {code}
> fail when ORC predicate pushdown is enabled.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Review Request 24834: HIVE-7771: ORC PPD fails for some decimal predicates

2014-08-18 Thread j . prasanth . j

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24834/
---

Review request for hive and Gunther Hagleitner.


Repository: hive-git


Description
---

Some queries like 
{code}
select * from table where dcol<=11.22BD;
{code}
fail when ORC predicate pushdown is enabled.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java f5023bb 
  ql/src/test/queries/clientpositive/orc_ppd_decimal.q a93590e 
  ql/src/test/results/clientpositive/orc_ppd_decimal.q.out 0c11ea8 

Diff: https://reviews.apache.org/r/24834/diff/


Testing
---


Thanks,

Prasanth_J



[jira] [Updated] (HIVE-7771) ORC PPD fails for some decimal predicates

2014-08-18 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-7771:
-

Status: Patch Available  (was: Open)

> ORC PPD fails for some decimal predicates
> -
>
> Key: HIVE-7771
> URL: https://issues.apache.org/jira/browse/HIVE-7771
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Prasanth J
>Assignee: Prasanth J
> Attachments: HIVE-7771.1.patch
>
>
> Some queries like 
> {code}
> select * from table where dcol<=11.22BD;
> {code}
> fail when ORC predicate pushdown is enabled.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-7771) ORC PPD fails for some decimal predicates

2014-08-18 Thread Prasanth J (JIRA)
Prasanth J created HIVE-7771:


 Summary: ORC PPD fails for some decimal predicates
 Key: HIVE-7771
 URL: https://issues.apache.org/jira/browse/HIVE-7771
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Prasanth J
Assignee: Prasanth J


Some queries like 
{code}
select * from table where dcol<=11.22BD;
{code}
fail when ORC predicate pushdown is enabled.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7770) Undo backward-incompatible behaviour change introduced by HIVE-7341

2014-08-18 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-7770:
---

Summary: Undo backward-incompatible behaviour change introduced by 
HIVE-7341  (was: Restore backward-incompatible behaviour introduced by 
HIVE-7341)

> Undo backward-incompatible behaviour change introduced by HIVE-7341
> ---
>
> Key: HIVE-7770
> URL: https://issues.apache.org/jira/browse/HIVE-7770
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 0.14.0
>Reporter: Sushanth Sowmyan
>Assignee: Mithun Radhakrishnan
>  Labels: regression
>
> HIVE-7341 introduced a backward-incompatible regression in the exception 
> signatures for HCatPartition.getColumns() that breaks compilation for 
> external tools like Falcon. This bug tracks a scrub for any other such issues 
> we discover, so we can restore the previous behaviour. It needs resolution in 
> the same release as HIVE-7341, and thus must be resolved in 0.14.0.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6329) Support column level encryption/decryption

2014-08-18 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-6329:


Attachment: HIVE-6329.10.patch.txt

Fixed test failures

> Support column level encryption/decryption
> --
>
> Key: HIVE-6329
> URL: https://issues.apache.org/jira/browse/HIVE-6329
> Project: Hive
>  Issue Type: New Feature
>  Components: Security, Serializers/Deserializers
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-6329.1.patch.txt, HIVE-6329.10.patch.txt, 
> HIVE-6329.2.patch.txt, HIVE-6329.3.patch.txt, HIVE-6329.4.patch.txt, 
> HIVE-6329.5.patch.txt, HIVE-6329.6.patch.txt, HIVE-6329.7.patch.txt, 
> HIVE-6329.8.patch.txt, HIVE-6329.9.patch.txt
>
>
> We have received some requirements for encryption recently, but Hive does not 
> support it. Before the full implementation via HIVE-5207, this might be useful 
> for some cases.
> {noformat}
> hive> create table encode_test(id int, name STRING, phone STRING, address 
> STRING) 
> > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
> > WITH SERDEPROPERTIES ('column.encode.columns'='phone,address', 
> 'column.encode.classname'='org.apache.hadoop.hive.serde2.Base64WriteOnly') 
> STORED AS TEXTFILE;
> OK
> Time taken: 0.584 seconds
> hive> insert into table encode_test select 
> 100,'navis','010--','Seoul, Seocho' from src tablesample (1 rows);
> ..
> OK
> Time taken: 5.121 seconds
> hive> select * from encode_test;
> OK
> 100   navis MDEwLTAwMDAtMDAwMA==  U2VvdWwsIFNlb2Nobw==
> Time taken: 0.078 seconds, Fetched: 1 row(s)
> hive> 
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Review Request 24833: qualified tablenames usage does not work with several alter-table commands

2014-08-18 Thread Navis Ryu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24833/
---

Review request for hive and Thejas Nair.


Bugs: HIVE-7681
https://issues.apache.org/jira/browse/HIVE-7681


Repository: hive-git


Description
---

Changes were made in HIVE-4064 to allow qualified table names in more types 
of queries, but several alter table commands still don't work with qualified 
table names:
- alter table default.tmpfoo set tblproperties ("bar" = "bar value")
- ALTER TABLE default.kv_rename_test CHANGE a a STRING
- add,drop partition
- alter index rebuild


Diffs
-

  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/cli/SemanticAnalysis/CreateTableHook.java
 ff0f210 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/cli/SemanticAnalysis/HCatSemanticAnalyzer.java
 4d338b5 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/cli/TestSemanticAnalysis.java
 1e25ed3 
  ql/src/java/org/apache/hadoop/hive/ql/hooks/UpdateInputAccessTimeHook.java 
ae89182 
  ql/src/java/org/apache/hadoop/hive/ql/index/IndexMetadataChangeTask.java 
1e01001 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java 
27e251c 
  ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 
e7434a3 
  ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 60d490f 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java f31a409 
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g a76cad7 
  ql/src/java/org/apache/hadoop/hive/ql/parse/IndexUpdater.java 8527239 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 7a71ec7 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java 
3dfce99 
  ql/src/java/org/apache/hadoop/hive/ql/plan/AlterTableDesc.java 20d863b 
  ql/src/java/org/apache/hadoop/hive/ql/plan/HiveOperation.java 67be666 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveOperationType.java
 29ae4a0 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/Operation2Privilege.java
 45404fe 
  ql/src/test/queries/clientpositive/add_part_exist.q d176661 
  ql/src/test/queries/clientpositive/alter1.q 312a017 
  ql/src/test/queries/clientpositive/alter_char1.q d391138 
  ql/src/test/queries/clientpositive/alter_index.q 2aa13da 
  ql/src/test/queries/clientpositive/alter_partition_coltype.q 115eaf9 
  ql/src/test/queries/clientpositive/alter_skewed_table.q 216bbb5 
  ql/src/test/queries/clientpositive/alter_varchar1.q 6f644a0 
  ql/src/test/queries/clientpositive/alter_view_as_select.q dcab3ca 
  ql/src/test/queries/clientpositive/alter_view_rename.q 68cf9d6 
  ql/src/test/queries/clientpositive/archive_multi.q 2c1a6d8 
  ql/src/test/queries/clientpositive/create_or_replace_view.q a8f59b7 
  ql/src/test/queries/clientpositive/drop_multi_partitions.q 14e2356 
  ql/src/test/queries/clientpositive/exchange_partition.q 4be6e3f 
  ql/src/test/queries/clientpositive/index_auto_empty.q 41f4a40 
  ql/src/test/queries/clientpositive/touch.q 8a661ef 
  ql/src/test/queries/clientpositive/unset_table_view_property.q f838cd1 
  ql/src/test/results/clientpositive/add_part_exist.q.out 4c22d6a 
  ql/src/test/results/clientpositive/alter1.q.out 1cfaf75 
  ql/src/test/results/clientpositive/alter_char1.q.out 017da60 
  ql/src/test/results/clientpositive/alter_index.q.out 2093e2f 
  ql/src/test/results/clientpositive/alter_partition_coltype.q.out 25eb48c 
  ql/src/test/results/clientpositive/alter_skewed_table.q.out e6bfc5a 
  ql/src/test/results/clientpositive/alter_varchar1.q.out e74a7ed 
  ql/src/test/results/clientpositive/alter_view_as_select.q.out 53a6b37 
  ql/src/test/results/clientpositive/alter_view_rename.q.out 0f3dd14 
  ql/src/test/results/clientpositive/archive_multi.q.out 7e84def 
  ql/src/test/results/clientpositive/create_or_replace_view.q.out 52ff417 
  ql/src/test/results/clientpositive/drop_multi_partitions.q.out 58a472c 
  ql/src/test/results/clientpositive/exchange_partition.q.out 381a9fd 
  ql/src/test/results/clientpositive/index_auto_empty.q.out 6a1a6c5 
  ql/src/test/results/clientpositive/touch.q.out 7ea3807 
  ql/src/test/results/clientpositive/unset_table_view_property.q.out 8cf6686 

Diff: https://reviews.apache.org/r/24833/diff/


Testing
---


Thanks,

Navis Ryu



[jira] [Commented] (HIVE-7681) qualified tablenames usage does not work with several alter-table commands

2014-08-18 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101661#comment-14101661
 ] 

Navis commented on HIVE-7681:
-

[~thejas] Could you review this? It goes stale frequently.

> qualified tablenames usage does not work with several alter-table commands
> --
>
> Key: HIVE-7681
> URL: https://issues.apache.org/jira/browse/HIVE-7681
> Project: Hive
>  Issue Type: Bug
>Reporter: Thejas M Nair
>Assignee: Navis
> Attachments: HIVE-7681.1.patch.txt, HIVE-7681.2.patch.txt, 
> HIVE-7681.3.patch.txt, HIVE-7681.4.patch.txt
>
>
> Changes were made in HIVE-4064 to allow qualified table names in more types 
> of queries, but several alter table commands still don't work with qualified 
> table names:
> - alter table default.tmpfoo set tblproperties ("bar" = "bar value")
> - ALTER TABLE default.kv_rename_test CHANGE a a STRING
> - add,drop partition
> - alter index rebuild



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-5718) Support direct fetch for lateral views, sub queries, etc.

2014-08-18 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-5718:


Attachment: HIVE-5718.10.patch.txt

Rebased on trunk

> Support direct fetch for lateral views, sub queries, etc.
> -
>
> Key: HIVE-5718
> URL: https://issues.apache.org/jira/browse/HIVE-5718
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: D13857.1.patch, D13857.2.patch, D13857.3.patch, 
> HIVE-5718.10.patch.txt, HIVE-5718.4.patch.txt, HIVE-5718.5.patch.txt, 
> HIVE-5718.6.patch.txt, HIVE-5718.7.patch.txt, HIVE-5718.8.patch.txt, 
> HIVE-5718.9.patch.txt
>
>
> Extend HIVE-2925 with LV and SubQ.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 24832: HIVE-7769: add --SORT_BEFORE_DIFF to union all .q tests

2014-08-18 Thread Na Yang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24832/
---

(Updated Aug. 19, 2014, 1:11 a.m.)


Review request for hive and Brock Noland.


Bugs: HIVE-7769
https://issues.apache.org/jira/browse/HIVE-7769


Repository: hive-git


Description
---

HIVE-7769: add --SORT_BEFORE_DIFF to union all .q tests


Diffs
-

  ql/src/test/queries/clientpositive/union.q 525eccb 
  ql/src/test/queries/clientpositive/union11.q 77dc2ef 
  ql/src/test/queries/clientpositive/union13.q 8bee1d7 
  ql/src/test/queries/clientpositive/union14.q 4437ad8 
  ql/src/test/queries/clientpositive/union15.q 3080b07 
  ql/src/test/queries/clientpositive/union16.q 1df68b0 
  ql/src/test/queries/clientpositive/union17.q 34b0e8c 
  ql/src/test/queries/clientpositive/union2.q 581cbeb 
  ql/src/test/queries/clientpositive/union20.q 267262e 
  ql/src/test/queries/clientpositive/union21.q 8185994 
  ql/src/test/queries/clientpositive/union27.q c039e9c 
  ql/src/test/queries/clientpositive/union3.q b26a2e2 
  ql/src/test/queries/clientpositive/union33.q 69e46f4 
  ql/src/test/queries/clientpositive/union5.q 9844127 
  ql/src/test/queries/clientpositive/union7.q d66d596 
  ql/src/test/queries/clientpositive/union8.q 6d5bf67 
  ql/src/test/queries/clientpositive/union9.q 7d4c11b 
  ql/src/test/queries/clientpositive/union_null.q 4368b8a 
  ql/src/test/results/clientpositive/union.q.out 6627ad7 
  ql/src/test/results/clientpositive/union11.q.out 738bd4e 
  ql/src/test/results/clientpositive/union13.q.out ae7b3aa 
  ql/src/test/results/clientpositive/union14.q.out f5c3948 
  ql/src/test/results/clientpositive/union15.q.out 0c2380b 
  ql/src/test/results/clientpositive/union16.q.out b2f1139 
  ql/src/test/results/clientpositive/union17.q.out 0e83c3c 
  ql/src/test/results/clientpositive/union2.q.out 37c76bf 
  ql/src/test/results/clientpositive/union20.q.out 663d128 
  ql/src/test/results/clientpositive/union21.q.out 13201a3 
  ql/src/test/results/clientpositive/union27.q.out 0c2a3d1 
  ql/src/test/results/clientpositive/union3.q.out 327fb07 
  ql/src/test/results/clientpositive/union33.q.out 2075b79 
  ql/src/test/results/clientpositive/union5.q.out 5763c11 
  ql/src/test/results/clientpositive/union7.q.out a5a42b4 
  ql/src/test/results/clientpositive/union8.q.out 7debe7a 
  ql/src/test/results/clientpositive/union9.q.out 51ec4e2 
  ql/src/test/results/clientpositive/union_null.q.out dad45ba 

Diff: https://reviews.apache.org/r/24832/diff/


Testing
---


Thanks,

Na Yang



Review Request 24832: HIVE-7769: add --SORT_BEFORE_DIFF to union all .q tests

2014-08-18 Thread Na Yang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24832/
---

Review request for hive and Brock Noland.


Bugs: HIVE-7769
https://issues.apache.org/jira/browse/HIVE-7769


Repository: hive-git


Description
---

HIVE-7769: add --SORT_BEFORE_DIFF to union all .q tests


Diffs
-

  ql/src/test/queries/clientpositive/union.q 525eccb 
  ql/src/test/queries/clientpositive/union11.q 77dc2ef 
  ql/src/test/queries/clientpositive/union13.q 8bee1d7 
  ql/src/test/queries/clientpositive/union14.q 4437ad8 
  ql/src/test/queries/clientpositive/union15.q 3080b07 
  ql/src/test/queries/clientpositive/union16.q 1df68b0 
  ql/src/test/queries/clientpositive/union17.q 34b0e8c 
  ql/src/test/queries/clientpositive/union2.q 581cbeb 
  ql/src/test/queries/clientpositive/union20.q 267262e 
  ql/src/test/queries/clientpositive/union21.q 8185994 
  ql/src/test/queries/clientpositive/union27.q c039e9c 
  ql/src/test/queries/clientpositive/union3.q b26a2e2 
  ql/src/test/queries/clientpositive/union33.q 69e46f4 
  ql/src/test/queries/clientpositive/union5.q 9844127 
  ql/src/test/queries/clientpositive/union7.q d66d596 
  ql/src/test/queries/clientpositive/union8.q 6d5bf67 
  ql/src/test/queries/clientpositive/union9.q 7d4c11b 
  ql/src/test/queries/clientpositive/union_null.q 4368b8a 
  ql/src/test/results/clientpositive/union.q.out 6627ad7 
  ql/src/test/results/clientpositive/union11.q.out 738bd4e 
  ql/src/test/results/clientpositive/union13.q.out ae7b3aa 
  ql/src/test/results/clientpositive/union14.q.out f5c3948 
  ql/src/test/results/clientpositive/union15.q.out 0c2380b 
  ql/src/test/results/clientpositive/union16.q.out b2f1139 
  ql/src/test/results/clientpositive/union17.q.out 0e83c3c 
  ql/src/test/results/clientpositive/union2.q.out 37c76bf 
  ql/src/test/results/clientpositive/union20.q.out 663d128 
  ql/src/test/results/clientpositive/union21.q.out 13201a3 
  ql/src/test/results/clientpositive/union27.q.out 0c2a3d1 
  ql/src/test/results/clientpositive/union3.q.out 327fb07 
  ql/src/test/results/clientpositive/union33.q.out 2075b79 
  ql/src/test/results/clientpositive/union5.q.out 5763c11 
  ql/src/test/results/clientpositive/union7.q.out a5a42b4 
  ql/src/test/results/clientpositive/union8.q.out 7debe7a 
  ql/src/test/results/clientpositive/union9.q.out 51ec4e2 
  ql/src/test/results/clientpositive/union_null.q.out dad45ba 

Diff: https://reviews.apache.org/r/24832/diff/


Testing
---


Thanks,

Na Yang



[jira] [Updated] (HIVE-7769) add --SORT_BEFORE_DIFF to union all .q tests

2014-08-18 Thread Na Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Na Yang updated HIVE-7769:
--

Status: Patch Available  (was: Open)

> add --SORT_BEFORE_DIFF to union all .q tests
> 
>
> Key: HIVE-7769
> URL: https://issues.apache.org/jira/browse/HIVE-7769
> Project: Hive
>  Issue Type: Bug
>Reporter: Na Yang
>Assignee: Na Yang
> Attachments: HIVE-7769.patch
>
>
> Some union all test cases do not generate deterministically ordered results. 
> We need to add --SORT_BEFORE_DIFF to those .q tests.
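The directive is a comment placed at the top of the .q file; the query below is a hypothetical example over the standard q-test tables, not one of the actual tests touched by the patch:

```sql
-- SORT_BEFORE_DIFF
-- With this directive, the test driver sorts query output before diffing it
-- against the golden file, so non-deterministic row order cannot fail the test.
SELECT key, value FROM src
UNION ALL
SELECT key, value FROM src1;
```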



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7769) add --SORT_BEFORE_DIFF to union all .q tests

2014-08-18 Thread Na Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Na Yang updated HIVE-7769:
--

Attachment: HIVE-7769.patch

> add --SORT_BEFORE_DIFF to union all .q tests
> 
>
> Key: HIVE-7769
> URL: https://issues.apache.org/jira/browse/HIVE-7769
> Project: Hive
>  Issue Type: Bug
>Reporter: Na Yang
>Assignee: Na Yang
> Attachments: HIVE-7769.patch
>
>
> Some union all test cases do not generate deterministically ordered results. 
> We need to add --SORT_BEFORE_DIFF to those .q tests.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7715) CBO:Support Union All

2014-08-18 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-7715:
-

Attachment: HIVE-7715.patch

> CBO:Support Union All
> -
>
> Key: HIVE-7715
> URL: https://issues.apache.org/jira/browse/HIVE-7715
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Laljo John Pullokkaran
>Assignee: Laljo John Pullokkaran
> Attachments: HIVE-7715.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7715) CBO:Support Union All

2014-08-18 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-7715:
-

Status: Patch Available  (was: Open)

> CBO:Support Union All
> -
>
> Key: HIVE-7715
> URL: https://issues.apache.org/jira/browse/HIVE-7715
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Laljo John Pullokkaran
>Assignee: Laljo John Pullokkaran
> Attachments: HIVE-7715.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 24498: A method to extrapolate the missing column status for the partitions.

2014-08-18 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24498/#review50938
---



metastore/src/java/org/apache/hadoop/hive/metastore/IExtrapolatePartStatus.java


Needs apache license header. Look at top of any other java file.



metastore/src/java/org/apache/hadoop/hive/metastore/IExtrapolatePartStatus.java


Better name: AggrType?



metastore/src/java/org/apache/hadoop/hive/metastore/LinearExtrapolatePartStatus.java


Apache header?



metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java


It would be good if this API also returned the count of partitions for 
which stats were found.



metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java


I think checking total == (# of cols) * (# of parts) is better.



metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java


We are doing 3 queries per column. We need to do better here: e.g., if there 
are 20 columns in a table, we will end up with 60 queries. Tables with a few 
hundred columns are not unheard of.

It seems the query for the column type can be avoided altogether: since column 
names are sent from the client, we can also send the type info from the client, 
which already has it.

For the other queries, we also need to make the count independent of the 
number of columns.



ql/src/test/queries/clientpositive/extrapolate_part_stats.q


Why do you have this flag set to false? Unless there is a reason, take this 
off.


- Ashutosh Chauhan


On Aug. 17, 2014, 4:23 a.m., pengcheng xiong wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24498/
> ---
> 
> (Updated Aug. 17, 2014, 4:23 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> We propose a method to extrapolate the missing column status for the 
> partitions.
> 
> 
> Diffs
> -
> 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/IExtrapolatePartStatus.java
>  PRE-CREATION 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/LinearExtrapolatePartStatus.java
>  PRE-CREATION 
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
> 767cffc 
>   ql/src/test/queries/clientpositive/extrapolate_part_stats.q PRE-CREATION 
>   ql/src/test/results/clientpositive/extrapolate_part_stats.q.out 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/24498/diff/
> 
> 
> Testing
> ---
> 
> 
> File Attachments
> 
> 
> HIVE-7654.0.patch
>   
> https://reviews.apache.org/media/uploaded/files/2014/08/12/77b155b0-a417-4225-b6b7-4c8c6ce2b97d__HIVE-7654.0.patch
> 
> 
> Thanks,
> 
> pengcheng xiong
> 
>



[jira] [Commented] (HIVE-7457) Minor HCatalog Pig Adapter test clean up

2014-08-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101615#comment-14101615
 ] 

Hive QA commented on HIVE-7457:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12662623/HIVE-7457.5.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5819 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
org.apache.hive.hcatalog.pig.TestOrcHCatLoader.testReadDataPrimitiveTypes
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/388/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/388/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-388/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12662623

> Minor HCatalog Pig Adapter test clean up
> 
>
> Key: HIVE-7457
> URL: https://issues.apache.org/jira/browse/HIVE-7457
> Project: Hive
>  Issue Type: Sub-task
>Reporter: David Chen
>Assignee: David Chen
>Priority: Minor
> Attachments: HIVE-7457.1.patch, HIVE-7457.2.patch, HIVE-7457.3.patch, 
> HIVE-7457.4.patch, HIVE-7457.5.patch
>
>
> Minor cleanup to the HCatalog Pig Adapter tests in preparation for HIVE-7420:
>  * Run through Hive Eclipse formatter.
>  * Convert JUnit 3-style tests to follow JUnit 4 conventions.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Review Request 24830: HIVE-7548: Precondition checks should not fail the merge task in case of automatic trigger

2014-08-18 Thread j . prasanth . j

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24830/
---

Review request for hive and Gunther Hagleitner.


Repository: hive-git


Description
---

ORC fast merge (HIVE-7509) fails the merge task if any of the precondition 
checks fail. Failing on a precondition check is appropriate for "ALTER TABLE .. 
CONCATENATE", but not when the merge task is triggered automatically by the 
conditional resolver. If a partition has ORC files that are incompatible for 
merging, the merge task should ignore them rather than fail.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1d6a93a 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeMapper.java beb4f7d 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFileMergeMapper.java b36152a 
  ql/src/test/queries/clientnegative/orc_merge1.q b2d42cd 
  ql/src/test/queries/clientnegative/orc_merge2.q 2f62ee7 
  ql/src/test/queries/clientnegative/orc_merge3.q 5158e2e 
  ql/src/test/queries/clientnegative/orc_merge4.q ad48572 
  ql/src/test/queries/clientnegative/orc_merge5.q e94a8cc 
  ql/src/test/queries/clientpositive/orc_merge_incompat1.q PRE-CREATION 
  ql/src/test/queries/clientpositive/orc_merge_incompat2.q PRE-CREATION 
  ql/src/test/results/clientpositive/orc_merge_incompat1.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/orc_merge_incompat2.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/24830/diff/


Testing
---


Thanks,

Prasanth_J



[jira] [Commented] (HIVE-7548) Precondition checks should not fail the merge task in case of automatic trigger

2014-08-18 Thread Prasanth J (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101607#comment-14101607
 ] 

Prasanth J commented on HIVE-7548:
--

HIVE-7704 should be updated with these changes as well.

> Precondition checks should not fail the merge task in case of automatic 
> trigger
> ---
>
> Key: HIVE-7548
> URL: https://issues.apache.org/jira/browse/HIVE-7548
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 0.14.0
>Reporter: Prasanth J
>Assignee: Prasanth J
> Attachments: HIVE-7548.1.patch
>
>
> ORC fast merge (HIVE-7509) will fail the merge task if any of the 
> precondition checks fail. Failing on a precondition check is fine for "ALTER 
> TABLE .. CONCATENATE" but not for an automatic trigger of the merge task from 
> the conditional resolver. If a partition has ORC files that are incompatible 
> for merging, the merge task should ignore them rather than fail.





[jira] [Updated] (HIVE-7548) Precondition checks should not fail the merge task in case of automatic trigger

2014-08-18 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-7548:
-

Attachment: HIVE-7548.1.patch

> Precondition checks should not fail the merge task in case of automatic 
> trigger
> ---
>
> Key: HIVE-7548
> URL: https://issues.apache.org/jira/browse/HIVE-7548
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 0.14.0
>Reporter: Prasanth J
>Assignee: Prasanth J
> Attachments: HIVE-7548.1.patch
>
>
> ORC fast merge (HIVE-7509) will fail the merge task if any of the 
> precondition checks fail. Failing on a precondition check is fine for "ALTER 
> TABLE .. CONCATENATE" but not for an automatic trigger of the merge task from 
> the conditional resolver. If a partition has ORC files that are incompatible 
> for merging, the merge task should ignore them rather than fail.





[jira] [Updated] (HIVE-7548) Precondition checks should not fail the merge task in case of automatic trigger

2014-08-18 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-7548:
-

Status: Patch Available  (was: Open)

> Precondition checks should not fail the merge task in case of automatic 
> trigger
> ---
>
> Key: HIVE-7548
> URL: https://issues.apache.org/jira/browse/HIVE-7548
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 0.14.0
>Reporter: Prasanth J
>Assignee: Prasanth J
> Attachments: HIVE-7548.1.patch
>
>
> ORC fast merge (HIVE-7509) will fail the merge task if any of the 
> precondition checks fail. Failing on a precondition check is fine for "ALTER 
> TABLE .. CONCATENATE" but not for an automatic trigger of the merge task from 
> the conditional resolver. If a partition has ORC files that are incompatible 
> for merging, the merge task should ignore them rather than fail.





[jira] [Created] (HIVE-7770) Restore backward-incompatible behaviour introduced by HIVE-7341

2014-08-18 Thread Sushanth Sowmyan (JIRA)
Sushanth Sowmyan created HIVE-7770:
--

 Summary: Restore backward-incompatible behaviour introduced by 
HIVE-7341
 Key: HIVE-7770
 URL: https://issues.apache.org/jira/browse/HIVE-7770
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.14.0
Reporter: Sushanth Sowmyan
Assignee: Mithun Radhakrishnan


HIVE-7341 introduced a backward-incompatibility regression in the exception 
signatures for HCatPartition.getColumns() that breaks compilation for external 
tools like Falcon. This bug tracks a scrub for any other such issues we 
discover, so we can restore the previous behaviour. It needs resolution in the 
same release as HIVE-7341, and thus must be resolved in 0.14.0.






[jira] [Commented] (HIVE-7341) Support for Table replication across HCatalog instances

2014-08-18 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101601#comment-14101601
 ] 

Sushanth Sowmyan commented on HIVE-7341:


Ugh, sorry for not catching it at review, but this patch did cause some 
backward incompatibility that broke compilation for Falcon, which depends on 
this API. Rather than reverting this patch, I figure that opening a new bug 
(*MUST* be resolved in 0.14) to restore the old exception behaviour is a less 
destructive approach.

HIVE-7770 tracks that change.

> Support for Table replication across HCatalog instances
> ---
>
> Key: HIVE-7341
> URL: https://issues.apache.org/jira/browse/HIVE-7341
> Project: Hive
>  Issue Type: New Feature
>  Components: HCatalog
>Affects Versions: 0.13.1
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Fix For: 0.14.0
>
> Attachments: HIVE-7341.1.patch, HIVE-7341.2.patch, HIVE-7341.3.patch, 
> HIVE-7341.4.patch, HIVE-7341.5.patch
>
>
> The HCatClient currently doesn't provide very much support for replicating 
> HCatTable definitions between 2 HCatalog Server (i.e. Hive metastore) 
> instances. 
> Systems similar to Apache Falcon might find the need to replicate partition 
> data between 2 clusters, and keep the HCatalog metadata in sync between the 
> two. This poses a couple of problems:
> # The definition of the source table might change (in column schema, I/O 
> formats, record-formats, serde-parameters, etc.) The system will need a way 
> to diff 2 tables and update the target-metastore with the changes. E.g. 
> {code}
> targetTable.resolve( sourceTable, targetTable.diff(sourceTable) );
> hcatClient.updateTableSchema(dbName, tableName, targetTable);
> {code}
> # The current {{HCatClient.addPartitions()}} API requires that the 
> partition's schema be derived from the table's schema, thereby requiring that 
> the table-schema be resolved *before* partitions with the new schema are 
> added to the table. This is problematic, because it introduces race 
> conditions when 2 partitions with differing column-schemas (e.g. right after 
> a schema change) are copied in parallel. This can be avoided if each 
> HCatAddPartitionDesc kept track of the partition's schema, in flight.
> # The source and target metastores might be running different/incompatible 
> versions of Hive. 
> The impending patch attempts to address these concerns (with some caveats).
> # {{HCatTable}} now has 
> ## a {{diff()}} method, to compare against another HCatTable instance
> ## a {{resolve(diff)}} method to copy over specified table-attributes from 
> another HCatTable
> ## a serialize/deserialize mechanism (via {{HCatClient.serializeTable()}} and 
> {{HCatClient.deserializeTable()}}), so that HCatTable instances constructed 
> in other class-loaders may be used for comparison
> # {{HCatPartition}} now provides finer-grained control over a Partition's 
> column-schema, StorageDescriptor settings, etc. This allows partitions to be 
> copied completely from source, with the ability to override specific 
> properties if required (e.g. location).
> # {{HCatClient.updateTableSchema()}} can now update the entire 
> table-definition, not just the column schema.
> # I've cleaned up and removed most of the redundancy between the HCatTable, 
> HCatCreateTableDesc and HCatCreateTableDesc.Builder. The prior API failed to 
> separate the table-attributes from the add-table-operation's attributes. By 
> providing fluent-interfaces in HCatTable, and composing an HCatTable instance 
> in HCatCreateTableDesc, the interfaces are cleaner(ish). The old setters are 
> deprecated, in favour of those in HCatTable. Likewise, HCatPartition and 
> HCatAddPartitionDesc.
> I'll post a patch for trunk shortly.
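As a toy illustration of the diff/resolve pattern described above (plain Java stand-ins for illustration only; TableDef is not the real HCatTable/HCatClient API):

```java
import java.util.*;

// Toy model of the diff/resolve pattern from the description above.
// TableDef stands in for HCatTable; none of this is the real HCat API.
public class ReplicationSketch {
    static class TableDef {
        Map<String, String> attrs = new HashMap<>(); // e.g. "inputFormat" -> "OrcInputFormat"

        // diff(): names of attributes whose values differ from `other`.
        Set<String> diff(TableDef other) {
            Set<String> changed = new HashSet<>();
            for (Map.Entry<String, String> e : other.attrs.entrySet()) {
                if (!Objects.equals(attrs.get(e.getKey()), e.getValue())) {
                    changed.add(e.getKey());
                }
            }
            return changed;
        }

        // resolve(): copy the differing attributes over from `source`.
        void resolve(TableDef source, Set<String> diff) {
            for (String key : diff) {
                attrs.put(key, source.attrs.get(key));
            }
        }
    }

    public static void main(String[] args) {
        TableDef source = new TableDef();
        source.attrs.put("inputFormat", "OrcInputFormat");
        TableDef target = new TableDef();
        target.attrs.put("inputFormat", "TextInputFormat");

        // Mirrors: targetTable.resolve(sourceTable, targetTable.diff(sourceTable));
        target.resolve(source, target.diff(source));
        System.out.println(target.attrs.get("inputFormat")); // OrcInputFormat
    }
}
```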





[jira] [Updated] (HIVE-7530) Go thru the common code to find references to HIVE_EXECUCTION_ENGINE to make sure conditions works with Spark [Spark Branch]

2014-08-18 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-7530:
---

   Resolution: Fixed
Fix Version/s: spark-branch
   Status: Resolved  (was: Patch Available)

Thank you for the contribution!! I have committed this to spark!

> Go thru the common code to find references to HIVE_EXECUCTION_ENGINE to make 
> sure conditions works with Spark [Spark Branch]
> 
>
> Key: HIVE-7530
> URL: https://issues.apache.org/jira/browse/HIVE-7530
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Xuefu Zhang
>Assignee: Na Yang
> Fix For: spark-branch
>
> Attachments: HIVE-7530-spark.patch
>
>
> In common code, such as Utilities.java, I found a lot of references to this 
> conf variable and special handling for a specific engine, such as the following:
> {code}
>   if (!HiveConf.getVar(job, 
> ConfVars.HIVE_EXECUTION_ENGINE).equals("tez")
>   && isEmptyPath(job, path, ctx)) {
> path = createDummyFileForEmptyPartition(path, job, work,
>  hiveScratchDir, alias, sequenceNumber++);
>   }
> {code}
> We need to make sure the condition still holds after a new execution engine 
> such as "spark" is introduced.
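One way to make such conditions robust, sketched with hypothetical names (not the actual Hive code; whether Spark handles empty partitions the same way Tez does is exactly what this JIRA asks to verify, so the allow-list below is an assumption for illustration):

```java
// Hypothetical sketch: replacing a single !equals("tez") test with an
// explicit allow-list, so a newly added engine such as "spark" is not
// silently routed down the MR-only path.
public class EngineCheck {
    // Stand-in for HiveConf.getVar(job, ConfVars.HIVE_EXECUTION_ENGINE),
    // taking the engine name as a plain string.
    static boolean needsDummyFileForEmptyPartition(String engine) {
        switch (engine) {
            case "tez":
            case "spark":
                return false; // assumed to handle empty partitions themselves
            default:
                return true;  // classic MR path still needs the dummy file
        }
    }

    public static void main(String[] args) {
        System.out.println(needsDummyFileForEmptyPartition("mr"));    // true
        System.out.println(needsDummyFileForEmptyPartition("spark")); // false
    }
}
```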





[jira] [Updated] (HIVE-7530) Go thru the common code to find references to HIVE_EXECUCTION_ENGINE to make sure conditions works with Spar

2014-08-18 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-7530:
---

Summary: Go thru the common code to find references to 
HIVE_EXECUCTION_ENGINE to make sure conditions works with Spar  (was: Go thru 
the common code to find references to HIVE_EXECUCTION_ENGINE to make sure 
conditions works with Spark)

> Go thru the common code to find references to HIVE_EXECUCTION_ENGINE to make 
> sure conditions works with Spar
> 
>
> Key: HIVE-7530
> URL: https://issues.apache.org/jira/browse/HIVE-7530
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Xuefu Zhang
>Assignee: Na Yang
> Attachments: HIVE-7530-spark.patch
>
>
> In common code, such as Utilities.java, I found a lot of references to this 
> conf variable and special handling for a specific engine, such as the following:
> {code}
>   if (!HiveConf.getVar(job, 
> ConfVars.HIVE_EXECUTION_ENGINE).equals("tez")
>   && isEmptyPath(job, path, ctx)) {
> path = createDummyFileForEmptyPartition(path, job, work,
>  hiveScratchDir, alias, sequenceNumber++);
>   }
> {code}
> We need to make sure the condition still holds after a new execution engine 
> such as "spark" is introduced.





[jira] [Updated] (HIVE-7408) HCatPartition needs getPartCols method

2014-08-18 Thread JongWon Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

JongWon Park updated HIVE-7408:
---

Assignee: Navis  (was: JongWon Park)

> HCatPartition needs getPartCols method
> --
>
> Key: HIVE-7408
> URL: https://issues.apache.org/jira/browse/HIVE-7408
> Project: Hive
>  Issue Type: Improvement
>  Components: HCatalog
>Affects Versions: 0.13.0
>Reporter: JongWon Park
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-7408.1.patch.txt
>
>
> org.apache.hive.hcatalog.api.HCatPartition has a getColumns method. However, it 
> does not return the partition columns. HCatPartition needs a getPartCols method.





[jira] [Commented] (HIVE-7530) Go thru the common code to find references to HIVE_EXECUCTION_ENGINE to make sure conditions works with Spark

2014-08-18 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101596#comment-14101596
 ] 

Brock Noland commented on HIVE-7530:


+1


> Go thru the common code to find references to HIVE_EXECUCTION_ENGINE to make 
> sure conditions works with Spark
> -
>
> Key: HIVE-7530
> URL: https://issues.apache.org/jira/browse/HIVE-7530
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Xuefu Zhang
>Assignee: Na Yang
> Attachments: HIVE-7530-spark.patch
>
>
> In common code, such as Utilities.java, I found a lot of references to this 
> conf variable and special handling for a specific engine, such as the following:
> {code}
>   if (!HiveConf.getVar(job, 
> ConfVars.HIVE_EXECUTION_ENGINE).equals("tez")
>   && isEmptyPath(job, path, ctx)) {
> path = createDummyFileForEmptyPartition(path, job, work,
>  hiveScratchDir, alias, sequenceNumber++);
>   }
> {code}
> We need to make sure the condition still holds after a new execution engine 
> such as "spark" is introduced.





[jira] [Assigned] (HIVE-7408) HCatPartition needs getPartCols method

2014-08-18 Thread JongWon Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

JongWon Park reassigned HIVE-7408:
--

Assignee: JongWon Park  (was: Navis)

> HCatPartition needs getPartCols method
> --
>
> Key: HIVE-7408
> URL: https://issues.apache.org/jira/browse/HIVE-7408
> Project: Hive
>  Issue Type: Improvement
>  Components: HCatalog
>Affects Versions: 0.13.0
>Reporter: JongWon Park
>Assignee: JongWon Park
>Priority: Minor
> Attachments: HIVE-7408.1.patch.txt
>
>
> org.apache.hive.hcatalog.api.HCatPartition has a getColumns method. However, it 
> does not return the partition columns. HCatPartition needs a getPartCols method.





[jira] [Updated] (HIVE-7530) Go thru the common code to find references to HIVE_EXECUCTION_ENGINE to make sure conditions works with Spark [Spark Branch]

2014-08-18 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-7530:
---

Summary: Go thru the common code to find references to 
HIVE_EXECUCTION_ENGINE to make sure conditions works with Spark [Spark Branch]  
(was: Go thru the common code to find references to HIVE_EXECUCTION_ENGINE to 
make sure conditions works with Spar)

> Go thru the common code to find references to HIVE_EXECUCTION_ENGINE to make 
> sure conditions works with Spark [Spark Branch]
> 
>
> Key: HIVE-7530
> URL: https://issues.apache.org/jira/browse/HIVE-7530
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Xuefu Zhang
>Assignee: Na Yang
> Attachments: HIVE-7530-spark.patch
>
>
> In common code, such as Utilities.java, I found a lot of references to this 
> conf variable and special handling for a specific engine, such as the following:
> {code}
>   if (!HiveConf.getVar(job, 
> ConfVars.HIVE_EXECUTION_ENGINE).equals("tez")
>   && isEmptyPath(job, path, ctx)) {
> path = createDummyFileForEmptyPartition(path, job, work,
>  hiveScratchDir, alias, sequenceNumber++);
>   }
> {code}
> We need to make sure the condition still holds after a new execution engine 
> such as "spark" is introduced.





[jira] [Created] (HIVE-7769) add --SORT_BEFORE_DIFF to union all .q tests

2014-08-18 Thread Na Yang (JIRA)
Na Yang created HIVE-7769:
-

 Summary: add --SORT_BEFORE_DIFF to union all .q tests
 Key: HIVE-7769
 URL: https://issues.apache.org/jira/browse/HIVE-7769
 Project: Hive
  Issue Type: Bug
Reporter: Na Yang
Assignee: Na Yang


Some union all test cases do not generate deterministically ordered results. We 
need to add --SORT_BEFORE_DIFF to those .q tests.





[jira] [Updated] (HIVE-7734) Join stats annotation rule is not updating columns statistics correctly

2014-08-18 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-7734:
-

Attachment: HIVE-7734.2.patch

Added Brock's review comments.

> Join stats annotation rule is not updating columns statistics correctly
> ---
>
> Key: HIVE-7734
> URL: https://issues.apache.org/jira/browse/HIVE-7734
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor, Statistics
>Affects Versions: 0.14.0
>Reporter: Prasanth J
>Assignee: Prasanth J
> Fix For: 0.13.0
>
> Attachments: HIVE-7734.1.patch, HIVE-7734.2.patch
>
>
> HIVE-7679 is not doing the correct thing. The scale down/up factor for updating 
> column stats was wrong, as ratio = newRowCount/oldRowCount is always infinite 
> (oldRowCount = 0). The old row count should be retrieved from the parent 
> corresponding to the current column whose statistics are being updated.
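A sketch of the scaling hazard described above, with hypothetical names (not the actual Hive statistics code): the ratio must be computed against the row count of the parent the column came from, and a zero old row count must be guarded, or the scaled statistic becomes infinite.

```java
// Hypothetical sketch: scaling a column's distinct-value count by a
// row-count ratio, guarding against the oldRowCount = 0 case that
// makes ratio = newRowCount/oldRowCount infinite.
public class StatsScale {
    static long scaleDistinctCount(long oldNdv, long oldRowCount, long newRowCount) {
        if (oldRowCount <= 0) {
            return oldNdv; // cannot scale against an unknown/zero baseline
        }
        double ratio = (double) newRowCount / oldRowCount;
        // A column cannot have more distinct values than rows.
        return Math.min(newRowCount, (long) Math.ceil(oldNdv * ratio));
    }

    public static void main(String[] args) {
        System.out.println(scaleDistinctCount(100, 1000, 500)); // 50
        System.out.println(scaleDistinctCount(100, 0, 500));    // 100 (guarded)
    }
}
```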





[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-08-18 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7405:
---

Status: Patch Available  (was: In Progress)

> Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
> --
>
> Key: HIVE-7405
> URL: https://issues.apache.org/jira/browse/HIVE-7405
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Matt McCline
>Assignee: Matt McCline
> Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
> HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
> HIVE-7405.8.patch
>
>
> Vectorize the basic case that does not have any count distinct aggregation.
> Add a 4th processing mode in VectorGroupByOperator for reduce where each 
> input VectorizedRowBatch has only values for one key at a time.  Thus, the 
> values in the batch can be aggregated quickly.





[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-08-18 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7405:
---

Status: In Progress  (was: Patch Available)

> Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
> --
>
> Key: HIVE-7405
> URL: https://issues.apache.org/jira/browse/HIVE-7405
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Matt McCline
>Assignee: Matt McCline
> Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
> HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
> HIVE-7405.8.patch
>
>
> Vectorize the basic case that does not have any count distinct aggregation.
> Add a 4th processing mode in VectorGroupByOperator for reduce where each 
> input VectorizedRowBatch has only values for one key at a time.  Thus, the 
> values in the batch can be aggregated quickly.





[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-08-18 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7405:
---

Attachment: HIVE-7405.8.patch

> Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
> --
>
> Key: HIVE-7405
> URL: https://issues.apache.org/jira/browse/HIVE-7405
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Matt McCline
>Assignee: Matt McCline
> Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
> HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
> HIVE-7405.8.patch
>
>
> Vectorize the basic case that does not have any count distinct aggregation.
> Add a 4th processing mode in VectorGroupByOperator for reduce where each 
> input VectorizedRowBatch has only values for one key at a time.  Thus, the 
> values in the batch can be aggregated quickly.





[jira] [Commented] (HIVE-7717) Add .q tests coverage for "union all" [Spark Branch]

2014-08-18 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101535#comment-14101535
 ] 

Brock Noland commented on HIVE-7717:


Ok sounds good, please let me know when the patch is up on a separate JIRA for 
trunk!

FYI, I noticed that Tez tests union2-9, most of which already have a sort:

https://github.com/apache/hive/blob/trunk/itests/src/test/resources/testconfiguration.properties#L112

> Add .q tests coverage for "union all" [Spark Branch]
> 
>
> Key: HIVE-7717
> URL: https://issues.apache.org/jira/browse/HIVE-7717
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Na Yang
>Assignee: Na Yang
> Attachments: HIVE-7717.1-spark.patch, HIVE-7717.2-spark.patch
>
>
> Add automation test coverage for "union all", by searching through the 
> q-tests in "ql/src/test/queries/clientpositive/" for union tests (like 
> union*.q) and verifying/enabling them on spark.
> Steps to do:
> 1.  Enable a qtest <test>.q in 
> itests/src/test/resources/testconfiguration.properties by adding the .q test 
> files to spark.query.files.
> 2.  Run mvn test -Dtest=TestSparkCliDriver -Dqfile=<test>.q 
> -Dtest.output.overwrite=true -Phadoop-2 to generate the output (located in 
> ql/src/test/results/clientpositive/spark).  The file will be called 
> <test>.q.out.
> 3.  Check that the generated output is good by verifying the results.  For 
> comparison, check the MR version in 
> ql/src/test/results/clientpositive/<test>.q.out.  The reason it's 
> separate is that the explain plan outputs are different for Spark/MR.
> 4.  Check in the modification to testconfiguration.properties, and the 
> generated q.out file as well.  You only have to generate the output once.
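The quoted steps can be sketched concretely as follows (a sketch only; the union2.q file name is just an example, and the property name spark.query.files comes from the description above):

```properties
# itests/src/test/resources/testconfiguration.properties (sketch)
# Step 1: add the q-test to the Spark test list.
spark.query.files=union2.q

# Step 2: generate the output, e.g.
#   mvn test -Dtest=TestSparkCliDriver -Dqfile=union2.q \
#       -Dtest.output.overwrite=true -Phadoop-2
# Steps 3-4: diff ql/src/test/results/clientpositive/spark/union2.q.out
#   against ql/src/test/results/clientpositive/union2.q.out, then check
#   in both the properties change and the generated q.out file.
```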





[jira] [Commented] (HIVE-7673) Authorization api: missing privilege objects in create table/view

2014-08-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101525#comment-14101525
 ] 

Hive QA commented on HIVE-7673:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12662574/HIVE-7673.3.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5823 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_table
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/387/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/387/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-387/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12662574

> Authorization api: missing privilege objects in create table/view
> -
>
> Key: HIVE-7673
> URL: https://issues.apache.org/jira/browse/HIVE-7673
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization, SQLStandardAuthorization
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-7673.1.patch, HIVE-7673.2.patch, HIVE-7673.3.patch
>
>
> Issues being addressed:
> - In case of a create-table-as-select query, the database the table belongs to 
> is not among the objects to be authorized.
> - For create table, the objectName field of the table entry carries the database 
> prefix (like testdb.testtable) instead of just the table name.
> - checkPrivileges(CREATEVIEW) does not include the name of the view being 
> created in outputHObjs.





[jira] [Commented] (HIVE-7717) Add .q tests coverage for "union all" [Spark Branch]

2014-08-18 Thread Na Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101520#comment-14101520
 ] 

Na Yang commented on HIVE-7717:
---

Hi Brock,

By looking at those test cases, most of them do not have "order by" following 
the "union all" operator. The test cases that pass this time might break in 
another run. I think we can hold this patch until the .q files are updated 
in trunk and merged to the spark branch. Then I will regenerate the output 
files from the new .q files with sort enabled. What do you think?

Thanks,
Na  

> Add .q tests coverage for "union all" [Spark Branch]
> 
>
> Key: HIVE-7717
> URL: https://issues.apache.org/jira/browse/HIVE-7717
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Na Yang
>Assignee: Na Yang
> Attachments: HIVE-7717.1-spark.patch, HIVE-7717.2-spark.patch
>
>
> Add automation test coverage for "union all", by searching through the 
> q-tests in "ql/src/test/queries/clientpositive/" for union tests (like 
> union*.q) and verifying/enabling them on spark.
> Steps to do:
> 1.  Enable a qtest <test>.q in 
> itests/src/test/resources/testconfiguration.properties by adding the .q test 
> files to spark.query.files.
> 2.  Run mvn test -Dtest=TestSparkCliDriver -Dqfile=<test>.q 
> -Dtest.output.overwrite=true -Phadoop-2 to generate the output (located in 
> ql/src/test/results/clientpositive/spark).  The file will be called 
> <test>.q.out.
> 3.  Check that the generated output is good by verifying the results.  For 
> comparison, check the MR version in 
> ql/src/test/results/clientpositive/<test>.q.out.  The reason it's 
> separate is that the explain plan outputs are different for Spark/MR.
> 4.  Check in the modification to testconfiguration.properties, and the 
> generated q.out file as well.  You only have to generate the output once.





[jira] [Commented] (HIVE-7696) small changes to mapjoin hashtable

2014-08-18 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101510#comment-14101510
 ] 

Sergey Shelukhin commented on HIVE-7696:


[~gopalv] this jira

> small changes to mapjoin hashtable
> --
>
> Key: HIVE-7696
> URL: https://issues.apache.org/jira/browse/HIVE-7696
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-7696.01.patch, HIVE-7696.01.patch, 
> HIVE-7696.02.patch, HIVE-7696.patch
>
>
> Parts of HIVE-7617 patch that are not related to the core issue, based on 
> some profiling by [~mmokhtar]





[jira] [Commented] (HIVE-7457) Minor HCatalog Pig Adapter test clean up

2014-08-18 Thread David Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101508#comment-14101508
 ] 

David Chen commented on HIVE-7457:
--

Thanks for reviewing this, [~cwsteinbach]. I renamed a file in my patch, which 
may be the source of the RB confusion. TestOrcHCatPigStorer is now 
TestOrcHCatStorerMulti since it inherits from TestHCatStorerMulti.

The rest of the changes should be removing trailing whitespace and converting 
from JUnit 3 to JUnit 4 conventions.

I will look into the TestHCatLoader failure; however, I recall it was 
reproducible without this patch.

> Minor HCatalog Pig Adapter test clean up
> 
>
> Key: HIVE-7457
> URL: https://issues.apache.org/jira/browse/HIVE-7457
> Project: Hive
>  Issue Type: Sub-task
>Reporter: David Chen
>Assignee: David Chen
>Priority: Minor
> Attachments: HIVE-7457.1.patch, HIVE-7457.2.patch, HIVE-7457.3.patch, 
> HIVE-7457.4.patch, HIVE-7457.5.patch
>
>
> Minor cleanup to the HCatalog Pig Adapter tests in preparation for HIVE-7420:
>  * Run through Hive Eclipse formatter.
>  * Convert JUnit 3-style tests to follow JUnit 4 conventions.





[jira] [Updated] (HIVE-7457) Minor HCatalog Pig Adapter test clean up

2014-08-18 Thread David Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Chen updated HIVE-7457:
-

Attachment: HIVE-7457.5.patch

Adding new patch rebased on trunk to run pre-commit tests again.

> Minor HCatalog Pig Adapter test clean up
> 
>
> Key: HIVE-7457
> URL: https://issues.apache.org/jira/browse/HIVE-7457
> Project: Hive
>  Issue Type: Sub-task
>Reporter: David Chen
>Assignee: David Chen
>Priority: Minor
> Attachments: HIVE-7457.1.patch, HIVE-7457.2.patch, HIVE-7457.3.patch, 
> HIVE-7457.4.patch, HIVE-7457.5.patch
>
>
> Minor cleanup to the HCatalog Pig Adapter tests in preparation for HIVE-7420:
>  * Run through Hive Eclipse formatter.
>  * Convert JUnit 3-style tests to follow JUnit 4 conventions.





[jira] [Updated] (HIVE-7457) Minor HCatalog Pig Adapter test clean up

2014-08-18 Thread David Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Chen updated HIVE-7457:
-

Status: Patch Available  (was: Open)

> Minor HCatalog Pig Adapter test clean up
> 
>
> Key: HIVE-7457
> URL: https://issues.apache.org/jira/browse/HIVE-7457
> Project: Hive
>  Issue Type: Sub-task
>Reporter: David Chen
>Assignee: David Chen
>Priority: Minor
> Attachments: HIVE-7457.1.patch, HIVE-7457.2.patch, HIVE-7457.3.patch, 
> HIVE-7457.4.patch, HIVE-7457.5.patch
>
>
> Minor cleanup to the HCatalog Pig Adapter tests in preparation for HIVE-7420:
>  * Run through Hive Eclipse formatter.
>  * Convert JUnit 3-style tests to follow JUnit 4 conventions.





[jira] [Updated] (HIVE-7717) Add .q tests coverage for "union all" [Spark Branch]

2014-08-18 Thread Na Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Na Yang updated HIVE-7717:
--

Attachment: HIVE-7717.3-spark.patch

> Add .q tests coverage for "union all" [Spark Branch]
> 
>
> Key: HIVE-7717
> URL: https://issues.apache.org/jira/browse/HIVE-7717
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Na Yang
>Assignee: Na Yang
> Attachments: HIVE-7717.1-spark.patch, HIVE-7717.2-spark.patch
>
>
> Add automation test coverage for "union all", by searching through the 
> q-tests in "ql/src/test/queries/clientpositive/" for union tests (like 
> union*.q) and verifying/enabling them on spark.
> Steps to do:
> 1.  Enable a qtest .q in 
> itests/src/test/resources/testconfiguration.properties by adding the .q test 
> files to spark.query.files.
> 2.  Run mvn test -Dtest=TestSparkCliDriver -Dqfile=.q 
> -Dtest.output.overwrite=true -Phadoop-2 to generate the output (located in 
> ql/src/test/results/clientpositive/spark).  File will be called 
> .q.out.
> 3.  Check that the generated output is good by verifying the results.  For 
> comparison, check the MR version in 
> ql/src/test/results/clientpositive/.q.out.  The reason it's 
> separate is that the explain plan outputs are different for Spark/MR.
> 4.  Check in the modification to testconfiguration.properties, and the 
> generated q.out file as well.  You only have to generate the output once.
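The steps above can be sketched as a shell session (the q file name used here, union_remove_1.q, is only a placeholder for whichever test is being enabled):

```shell
# 1. Add the test to spark.query.files in
#    itests/src/test/resources/testconfiguration.properties, e.g.:
#    spark.query.files=...,union_remove_1.q

# 2. Generate the Spark output (written to
#    ql/src/test/results/clientpositive/spark/union_remove_1.q.out):
mvn test -Dtest=TestSparkCliDriver -Dqfile=union_remove_1.q \
    -Dtest.output.overwrite=true -Phadoop-2

# 3. Compare the generated output against the MR version:
diff ql/src/test/results/clientpositive/spark/union_remove_1.q.out \
     ql/src/test/results/clientpositive/union_remove_1.q.out

# 4. Commit testconfiguration.properties plus the new .q.out file.
```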





[jira] [Updated] (HIVE-7717) Add .q tests coverage for "union all" [Spark Branch]

2014-08-18 Thread Na Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Na Yang updated HIVE-7717:
--

Attachment: (was: HIVE-7717.3-spark.patch)

> Add .q tests coverage for "union all" [Spark Branch]
> 
>
> Key: HIVE-7717
> URL: https://issues.apache.org/jira/browse/HIVE-7717
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Na Yang
>Assignee: Na Yang
> Attachments: HIVE-7717.1-spark.patch, HIVE-7717.2-spark.patch
>
>
> Add automation test coverage for "union all", by searching through the 
> q-tests in "ql/src/test/queries/clientpositive/" for union tests (like 
> union*.q) and verifying/enabling them on spark.
> Steps to do:
> 1.  Enable a qtest .q in 
> itests/src/test/resources/testconfiguration.properties by adding the .q test 
> files to spark.query.files.
> 2.  Run mvn test -Dtest=TestSparkCliDriver -Dqfile=.q 
> -Dtest.output.overwrite=true -Phadoop-2 to generate the output (located in 
> ql/src/test/results/clientpositive/spark).  File will be called 
> .q.out.
> 3.  Check that the generated output is good by verifying the results.  For 
> comparison, check the MR version in 
> ql/src/test/results/clientpositive/.q.out.  The reason it's 
> separate is that the explain plan outputs are different for Spark/MR.
> 4.  Check in the modification to testconfiguration.properties, and the 
> generated q.out file as well.  You only have to generate the output once.





[jira] [Commented] (HIVE-7717) Add .q tests coverage for "union all" [Spark Branch]

2014-08-18 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101502#comment-14101502
 ] 

Brock Noland commented on HIVE-7717:


Gotcha,

Based on your runs...if the order is not deterministic we can either add an 
ORDER BY to the query or add the following to the top of the q file:

{noformat}
-- SORT_BEFORE_DIFF
{noformat}

In either case, we'd probably want to make the change on trunk and then merge it to 
our branch, since the MR outputs would also need to be updated. We'd probably 
want to remove those tests from this change and add them in a follow-up.
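A rough sketch of what `-- SORT_BEFORE_DIFF` buys us (illustrative code, not the actual ptest harness): both outputs are sorted line-by-line before comparison, so a nondeterministic row order no longer produces a spurious diff.

```java
import java.util.Arrays;

public class SortBeforeDiffSketch {
    // Sort both outputs line-by-line before comparing, so that two runs
    // returning the same rows in a different order still compare equal.
    static boolean outputsMatch(String expected, String actual) {
        String[] e = expected.split("\n");
        String[] a = actual.split("\n");
        Arrays.sort(e);
        Arrays.sort(a);
        return Arrays.equals(e, a);
    }

    public static void main(String[] args) {
        // Same rows, different order: matches.
        System.out.println(outputsMatch("1\tA\n2\tB", "2\tB\n1\tA")); // true
        // Different rows: still a real diff.
        System.out.println(outputsMatch("1\tA", "3\tC"));             // false
    }
}
```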

> Add .q tests coverage for "union all" [Spark Branch]
> 
>
> Key: HIVE-7717
> URL: https://issues.apache.org/jira/browse/HIVE-7717
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Na Yang
>Assignee: Na Yang
> Attachments: HIVE-7717.1-spark.patch, HIVE-7717.2-spark.patch
>
>
> Add automation test coverage for "union all", by searching through the 
> q-tests in "ql/src/test/queries/clientpositive/" for union tests (like 
> union*.q) and verifying/enabling them on spark.
> Steps to do:
> 1.  Enable a qtest .q in 
> itests/src/test/resources/testconfiguration.properties by adding the .q test 
> files to spark.query.files.
> 2.  Run mvn test -Dtest=TestSparkCliDriver -Dqfile=.q 
> -Dtest.output.overwrite=true -Phadoop-2 to generate the output (located in 
> ql/src/test/results/clientpositive/spark).  File will be called 
> .q.out.
> 3.  Check that the generated output is good by verifying the results.  For 
> comparison, check the MR version in 
> ql/src/test/results/clientpositive/.q.out.  The reason it's 
> separate is that the explain plan outputs are different for Spark/MR.
> 4.  Check in the modification to testconfiguration.properties, and the 
> generated q.out file as well.  You only have to generate the output once.





[jira] [Commented] (HIVE-5538) Turn on vectorization by default.

2014-08-18 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101503#comment-14101503
 ] 

Sergey Shelukhin commented on HIVE-5538:


You can consider using TestCompareCliDriver.
See ./ql/src/test/queries/clientcompare/vectorized_math_funcs.q and 
corresponding .qv files in the same directory.
This CLI driver runs the q file once per header and compares the results, 
failing if they differ (there's no .out file).
If a focused set of such tests can be created, it would help ensure the 
vectorized and non-vectorized code paths produce matching results.
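The idea can be sketched as follows (illustrative code, not the actual TestCompareCliDriver implementation): each configuration header produces a result set, and the driver fails if any run disagrees with the first, with no golden file involved.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CompareDriverSketch {
    // Compare the result of each header's run against the first one;
    // any mismatch fails the test -- no .out golden file is consulted.
    static void compareRuns(Map<String, List<String>> resultsByHeader) {
        Map.Entry<String, List<String>> base = null;
        for (Map.Entry<String, List<String>> run : resultsByHeader.entrySet()) {
            if (base == null) {
                base = run;
            } else if (!run.getValue().equals(base.getValue())) {
                throw new AssertionError(
                    run.getKey() + " differs from " + base.getKey());
            }
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> runs = new LinkedHashMap<>();
        runs.put("vectorized", List.of("1\tA", "2\tB"));
        runs.put("row-mode", List.of("1\tA", "2\tB"));
        compareRuns(runs); // identical results: passes silently
    }
}
```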

> Turn on vectorization by default.
> -
>
> Key: HIVE-5538
> URL: https://issues.apache.org/jira/browse/HIVE-5538
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Jitendra Nath Pandey
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-5538.1.patch, HIVE-5538.2.patch, HIVE-5538.3.patch, 
> HIVE-5538.4.patch, HIVE-5538.5.patch, HIVE-5538.5.patch, HIVE-5538.6.patch
>
>
>   Vectorization should be turned on by default, so that users don't have to 
> specifically enable vectorization. 
>   Vectorization code validates each query and ensures that it falls back to 
> row mode if it is not supported on the vectorized code path. 
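The fall-back behavior described above can be sketched like this (illustrative names, not Hive's actual validation code): the vectorized path is chosen only when the plan validates for it, which is what makes enabling the flag by default safe.

```java
public class VectorizationFallbackSketch {
    // With vectorization on by default, an unsupported plan silently
    // falls back to row mode instead of failing the query.
    static String chooseExecutionMode(boolean vectorizationEnabled,
                                      boolean planSupportsVectorization) {
        if (vectorizationEnabled && planSupportsVectorization) {
            return "vectorized";
        }
        return "row";
    }

    public static void main(String[] args) {
        System.out.println(chooseExecutionMode(true, true));  // vectorized
        System.out.println(chooseExecutionMode(true, false)); // row (fallback)
    }
}
```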





[jira] [Commented] (HIVE-7530) Go thru the common code to find references to HIVE_EXECUCTION_ENGINE to make sure conditions works with Spark

2014-08-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101499#comment-14101499
 ] 

Hive QA commented on HIVE-7530:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12662579/HIVE-7530-spark.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 5917 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_fs_default_name2
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/59/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/59/console
Test logs: 
http://ec2-54-176-176-199.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-59/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12662579

> Go thru the common code to find references to HIVE_EXECUCTION_ENGINE to make 
> sure conditions works with Spark
> -
>
> Key: HIVE-7530
> URL: https://issues.apache.org/jira/browse/HIVE-7530
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Xuefu Zhang
>Assignee: Na Yang
> Attachments: HIVE-7530-spark.patch
>
>
> In common code, such as Utilities.java, I found a lot of references to this 
> conf variable and special handling for a specific engine, such as the following:
> {code}
>   if (!HiveConf.getVar(job, 
> ConfVars.HIVE_EXECUTION_ENGINE).equals("tez")
>   && isEmptyPath(job, path, ctx)) {
> path = createDummyFileForEmptyPartition(path, job, work,
>  hiveScratchDir, alias, sequenceNumber++);
>   }
> {code}
> We need to make sure the condition still holds after a new execution engine 
> such as "spark" is introduced.





[jira] [Commented] (HIVE-7717) Add .q tests coverage for "union all" [Spark Branch]

2014-08-18 Thread Na Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101485#comment-14101485
 ] 

Na Yang commented on HIVE-7717:
---

Hi Brock,

Those tests passed locally. I re-generated some of the test results and found 
that the explain plans are the same between test runs, but the result rows 
come back in a different order, although the same number of rows is returned. 
Let me re-run those tests and upload a new patch.

Thanks,
Na

> Add .q tests coverage for "union all" [Spark Branch]
> 
>
> Key: HIVE-7717
> URL: https://issues.apache.org/jira/browse/HIVE-7717
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Na Yang
>Assignee: Na Yang
> Attachments: HIVE-7717.1-spark.patch, HIVE-7717.2-spark.patch
>
>
> Add automation test coverage for "union all", by searching through the 
> q-tests in "ql/src/test/queries/clientpositive/" for union tests (like 
> union*.q) and verifying/enabling them on spark.
> Steps to do:
> 1.  Enable a qtest .q in 
> itests/src/test/resources/testconfiguration.properties by adding the .q test 
> files to spark.query.files.
> 2.  Run mvn test -Dtest=TestSparkCliDriver -Dqfile=.q 
> -Dtest.output.overwrite=true -Phadoop-2 to generate the output (located in 
> ql/src/test/results/clientpositive/spark).  File will be called 
> .q.out.
> 3.  Check that the generated output is good by verifying the results.  For 
> comparison, check the MR version in 
> ql/src/test/results/clientpositive/.q.out.  The reason it's 
> separate is that the explain plan outputs are different for Spark/MR.
> 4.  Check in the modification to testconfiguration.properties, and the 
> generated q.out file as well.  You only have to generate the output once.




