Re: hive transaction strange behaviour

2015-11-17 Thread Sanjeev Verma
Any help will be much appreciated. Thanks

On Tue, Nov 17, 2015 at 2:39 PM, Sanjeev Verma wrote:

> Thanks Elliot, Eugene.
> I am able to see the base file created in one of the partitions; it seems the
> compactor kicked in and created it, but it has not created base files in the
> rest of the partitions, where delta files still exist. Why has the compactor
> not picked the other partitions, and when and how will these partitions be
> picked up for compaction?
>
> Thanks
>
> On Sat, Nov 14, 2015 at 11:01 PM, Eugene Koifman wrote:
>
>> When the compaction process runs, it will create the base directory.
>>
>> https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-Configuration
>>
>>
>> at a minimum you need hive.compactor.initiator.on=true
>> and hive.compactor.worker.threads > 0
>>
>> Also, see
>> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AlterTable/PartitionCompact
>> on how to trigger compaction manually.
>>
>> *Eugene*
>>
>> From: Sanjeev Verma 
>> Reply-To: "u...@hive.apache.org" 
>> Date: Thursday, November 12, 2015 at 11:41 PM
>> To: "u...@hive.apache.org" , "dev@hive.apache.org"
>> 
>> Subject: hive transaction strange behaviour
>>
>> I have enabled Hive transactions and am able to see the delta files
>> created for some of the partitions, but I do not see any base file created
>> yet. It seems strange to me to see so many delta files without any base
>> file.
>> Could somebody let me know when the base file is created?
>>
>> Thanks
>>
>
>
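As a minimal sketch of the configuration and manual trigger Eugene describes, in HiveQL; the table and partition names here are hypothetical:

```sql
-- Server-side settings (normally set in hive-site.xml on the metastore host);
-- listed here only to document the keys and values:
--   hive.compactor.initiator.on = true
--   hive.compactor.worker.threads = 1   -- any value > 0

-- Manually request a compaction for one partition of a hypothetical table:
ALTER TABLE acid_table PARTITION (ds='2015-11-17') COMPACT 'major';

-- Check the queue of requested/running/finished compactions:
SHOW COMPACTIONS;
```

With the initiator on, partitions are picked up automatically once they cross the delta-count/size thresholds in the Hive Transactions configuration; the `ALTER TABLE ... COMPACT` statement queues one partition explicitly without waiting for those thresholds.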


[jira] [Created] (HIVE-12453) Check SessionState status before performing cleanup

2015-11-17 Thread Wei Zheng (JIRA)
Wei Zheng created HIVE-12453:


 Summary: Check SessionState status before performing cleanup
 Key: HIVE-12453
 URL: https://issues.apache.org/jira/browse/HIVE-12453
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.3.0, 2.0.0
Reporter: Wei Zheng
Assignee: Wei Zheng






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12452) orc_merge9.q hangs when writing orc metadata section

2015-11-17 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-12452:


 Summary: orc_merge9.q hangs when writing orc metadata section
 Key: HIVE-12452
 URL: https://issues.apache.org/jira/browse/HIVE-12452
 Project: Hive
  Issue Type: Bug
  Components: ORC
Affects Versions: 2.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
 Attachments: orc-writer-hang.png

When running tests for HIVE-12450, orc_merge9.q hung without completing.
See the attached screenshot for the thread that hung.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 38663: HIVE-11878: ClassNotFoundException can possibly occur if multiple jars are registered one at a time in Hive

2015-11-17 Thread Ratandeep Ratti

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38663/
---

(Updated Nov. 18, 2015, 5:21 a.m.)


Review request for hive.


Changes
---

Using the system classloader as the parent of the per-session classloader


Summary (updated)
-

HIVE-11878: ClassNotFoundException can possibly occur if multiple jars are 
registered one at a time in Hive


Bugs: HIVE-11878
https://issues.apache.org/jira/browse/HIVE-11878


Repository: hive-git


Description (updated)
---

HIVE-11878: ClassNotFoundException can possibly occur if multiple jars are 
registered one at a time in Hive


Diffs (updated)
-

  conf/ivysettings.xml bda842a89bb07710fdcd7180a00833a7388ada8f 
  itests/custom-udfs/pom.xml PRE-CREATION 
  itests/custom-udfs/udf-classloader-udf1/pom.xml PRE-CREATION 
  
itests/custom-udfs/udf-classloader-udf1/src/main/java/hive/it/custom/udfs/UDF1.java
 PRE-CREATION 
  itests/custom-udfs/udf-classloader-udf2/pom.xml PRE-CREATION 
  
itests/custom-udfs/udf-classloader-udf2/src/main/java/hive/it/custom/udfs/UDF2.java
 PRE-CREATION 
  itests/custom-udfs/udf-classloader-util/pom.xml PRE-CREATION 
  
itests/custom-udfs/udf-classloader-util/src/main/java/hive/it/custom/udfs/Util.java
 PRE-CREATION 
  itests/pom.xml 0686f1fd58c2be26b2ee645c4e244159aec565e5 
  itests/qtest/pom.xml 8db6fb04d0a5d4600bc23543a0215d31c1cd0648 
  ql/src/java/org/apache/hadoop/hive/ql/exec/UDFClassLoader.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 
de2eb984159526048e8dacf71d3ff8b0647394a3 
  ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 
ff875df98e1dd64a8af3ad22f4b38dbc1d6a1923 
  ql/src/test/queries/clientpositive/udf_classloader.q PRE-CREATION 
  
ql/src/test/queries/clientpositive/udf_classloader_dynamic_dependency_resolution.q
 PRE-CREATION 
  ql/src/test/results/clientpositive/udf_classloader.q.out PRE-CREATION 
  
ql/src/test/results/clientpositive/udf_classloader_dynamic_dependency_resolution.q.out
 PRE-CREATION 

Diff: https://reviews.apache.org/r/38663/diff/


Testing
---


Thanks,

Ratandeep Ratti



[GitHub] hive pull request: HIVE-12054 Create vectorized ORC writer

2015-11-17 Thread omalley
Github user omalley closed the pull request at:

https://github.com/apache/hive/pull/55


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] hive pull request: HIVE-12055 Stub out the row by row ORC writer.

2015-11-17 Thread omalley
GitHub user omalley opened a pull request:

https://github.com/apache/hive/pull/56

HIVE-12055 Stub out the row by row ORC writer.

Remove the native row by row ORC writer and replace it with stubs that use 
the vectorized writer.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/omalley/hive hive-12055

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/56.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #56


commit d5bf38f1e9b1021e8be55a18432035407f17460e
Author: Owen O'Malley 
Date:   2015-11-11T23:24:03Z

HIVE-12054. Create vectorized ORC write method.

commit 9bfa63a478cca68cd111b32366226a338e3cc86d
Author: Owen O'Malley 
Date:   2015-11-02T17:00:33Z

HIVE-11890. Create ORC submodule.

commit 605a980045ecb4cfce2a8c915c7ad2c8b591bfb2
Author: Owen O'Malley 
Date:   2015-11-13T23:43:59Z

HIVE-12055. Move WriterImpl over to orc module.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Review Request 40423: HIVE-10937 LLAP plan cache

2015-11-17 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40423/
---

Review request for hive, Gopal V and Gunther Hagleitner.


Repository: hive-git


Description
---

see JIRA


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 953e52c 
  common/src/java/org/apache/hive/common/util/FixedSizedObjectPool.java 600c443 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ObjectCache.java 440e0a1 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ObjectCacheFactory.java 3d9771a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ObjectCacheWrapper.java 9768efa 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ObjectCache.java 008f8a4 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/LlapMultiObjectCache.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/LlapObjectCache.java 0141230 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MapRecordProcessor.java 
1d645a0 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MergeFileRecordProcessor.java 
bb56e1c 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ObjectCache.java 06dca00 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/RecordProcessor.java 2f08529 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ReduceRecordProcessor.java 
8768847 

Diff: https://reviews.apache.org/r/40423/diff/


Testing
---


Thanks,

Sergey Shelukhin



[jira] [Created] (HIVE-12451) Orc fast file merging/concatenation should be disabled for ACID tables

2015-11-17 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-12451:


 Summary: Orc fast file merging/concatenation should be disabled 
for ACID tables
 Key: HIVE-12451
 URL: https://issues.apache.org/jira/browse/HIVE-12451
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.3.0, 2.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


For ACID tables, merging of small files should happen only through compaction. 
We should disable "alter table .. concatenate" for ACID tables. We should also 
disable ConditionalMergeFileTask if the destination is an ACID table.
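For reference, the fast-merge path in question is triggered by the concatenate DDL; a sketch with hypothetical table and partition names:

```sql
-- The fast file merge that HIVE-12451 proposes to reject for ACID tables,
-- since it would bypass the transactional compaction machinery:
ALTER TABLE orc_table PARTITION (ds='2015-11-17') CONCATENATE;
```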



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12450) OrcFileMergeOperator does not use correct compression buffer size

2015-11-17 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-12450:


 Summary: OrcFileMergeOperator does not use correct compression 
buffer size
 Key: HIVE-12450
 URL: https://issues.apache.org/jira/browse/HIVE-12450
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.1, 1.2.0, 1.3.0, 2.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


OrcFileMergeOperator checks for compatibility before merging ORC files. This 
compatibility check includes checking the compression buffer size. But the output 
file that is created does not honor the compression buffer size and always 
defaults to 256KB. This will not be a problem when reading the ORC file, but it 
can create unwanted memory pressure because of wasted space within the 
compression buffer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12449) Report progress information from the Tez processor

2015-11-17 Thread Siddharth Seth (JIRA)
Siddharth Seth created HIVE-12449:
-

 Summary: Report progress information from the Tez processor
 Key: HIVE-12449
 URL: https://issues.apache.org/jira/browse/HIVE-12449
 Project: Hive
  Issue Type: Improvement
  Components: Tez
Reporter: Siddharth Seth
Assignee: Vikram Dixit K


After TEZ-808, Tez tracks processor progress and can kill the tasks if they 
don't make progress fast enough (disabled by default). Hive needs to start 
reporting progress while processing records.

Also, progress will eventually help with better speculation decisions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 40415: HIVE-11675 make use of file footer PPD API in ETL strategy or separate strategy

2015-11-17 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40415/
---

Review request for hive, Gopal V, Prasanth_J, and Vikram Dixit Kumaraswamy.


Repository: hive-git


Description
---

see jira


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 953e52c 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 
c5e7a5f 
  metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 
aa96f77 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 46862da 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 488d923 
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInputOutputFormat.java 
ec90481 
  storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/PredicateLeaf.java 
dc71db4 
  storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgument.java 
d70b3b0 
  
storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java 
eeff131 

Diff: https://reviews.apache.org/r/40415/diff/


Testing
---


Thanks,

Sergey Shelukhin



[jira] [Created] (HIVE-12448) Change to tracking of dag status via dagIdentifier instead of dag name

2015-11-17 Thread Siddharth Seth (JIRA)
Siddharth Seth created HIVE-12448:
-

 Summary: Change to tracking of dag status via dagIdentifier 
instead of dag name
 Key: HIVE-12448
 URL: https://issues.apache.org/jira/browse/HIVE-12448
 Project: Hive
  Issue Type: Sub-task
  Components: llap
Affects Versions: 2.0.0
Reporter: Siddharth Seth
Assignee: Siddharth Seth






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12446) Tracking jira for changes required for move to Tez 0.8.2

2015-11-17 Thread Siddharth Seth (JIRA)
Siddharth Seth created HIVE-12446:
-

 Summary: Tracking jira for changes required for move to Tez 0.8.2
 Key: HIVE-12446
 URL: https://issues.apache.org/jira/browse/HIVE-12446
 Project: Hive
  Issue Type: Task
Reporter: Siddharth Seth






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12447) Fix LlapTaskReporter post TEZ-808 changes

2015-11-17 Thread Siddharth Seth (JIRA)
Siddharth Seth created HIVE-12447:
-

 Summary: Fix LlapTaskReporter post TEZ-808 changes
 Key: HIVE-12447
 URL: https://issues.apache.org/jira/browse/HIVE-12447
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 2.0.0
Reporter: Siddharth Seth
Assignee: Siddharth Seth






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12445) Tracking of completed dags is a slow memory leak

2015-11-17 Thread Siddharth Seth (JIRA)
Siddharth Seth created HIVE-12445:
-

 Summary: Tracking of completed dags is a slow memory leak
 Key: HIVE-12445
 URL: https://issues.apache.org/jira/browse/HIVE-12445
 Project: Hive
  Issue Type: Bug
Affects Versions: 2.0.0
Reporter: Siddharth Seth


LLAP daemons track completed DAGs, but never clean up these structures. This is 
primarily to disallow out-of-order executions. Evaluate whether that can be 
avoided; otherwise this structure needs to be cleaned up with a delay.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12444) Queries against ACID table without base directory may throw exception

2015-11-17 Thread Wei Zheng (JIRA)
Wei Zheng created HIVE-12444:


 Summary: Queries against ACID table without base directory may 
throw exception
 Key: HIVE-12444
 URL: https://issues.apache.org/jira/browse/HIVE-12444
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.2.1
Reporter: Wei Zheng
Assignee: Wei Zheng


Steps to reproduce:
{code}
set hive.fetch.task.conversion=minimal;
set hive.limit.optimize.enable=true;

create table acidtest1(
 c_custkey int,
 c_name string,
 c_nationkey int,
 c_acctbal double)
clustered by (c_nationkey) into 3 buckets
stored as orc
tblproperties("transactional"="true");

insert into table acidtest1
select c_custkey, c_name, c_nationkey, c_acctbal from tpch_text_10.customer;

select cast (c_nationkey as string) from acidtest.acidtest1 limit 10;
{code}
{code}
DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, 
vertexId=vertex_1447362491939_0020_1_00, diagnostics=[Vertex 
vertex_1447362491939_0020_1_00 [Map 1] killed/failed due 
to:ROOT_INPUT_INIT_FAILURE, Vertex Input: acidtest1 initializer failed, 
vertex=vertex_1447362491939_0020_1_00 [Map 1], java.lang.RuntimeException: 
serious problem
at 
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1035)
at 
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1062)
at 
org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:308)
at 
org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:410)
at 
org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:155)
at 
org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:246)
at 
org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:240)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at 
org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:240)
at 
org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:227)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: 
java.lang.IllegalArgumentException: delta_017_017 does not start with 
base_
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at 
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1012)
... 15 more
Caused by: java.lang.IllegalArgumentException: delta_017_017 does not 
start with base_
at org.apache.hadoop.hive.ql.io.AcidUtils.parseBase(AcidUtils.java:144)
at 
org.apache.hadoop.hive.ql.io.AcidUtils.parseBaseBucketFilename(AcidUtils.java:172)
at 
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$FileGenerator.call(OrcInputFormat.java:667)
at 
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$FileGenerator.call(OrcInputFormat.java:625)
... 4 more
]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12443) Hive Streaming should expose encoding and serdes for testing

2015-11-17 Thread Alan Gates (JIRA)
Alan Gates created HIVE-12443:
-

 Summary: Hive Streaming should expose encoding and serdes for 
testing
 Key: HIVE-12443
 URL: https://issues.apache.org/jira/browse/HIVE-12443
 Project: Hive
  Issue Type: Improvement
  Components: Testing Infrastructure, Transactions
Affects Versions: 2.0.0
Reporter: Alan Gates
Assignee: Alan Gates


Currently, the way records passed into the Hive Streaming RecordWriter are 
converted from the inbound format to the Hive format is opaque.  The encoding and 
writing are done in a single call to RecordWriter.write().  This is problematic 
for test tools that want to intercept the record stream and write it to a 
benchmark in addition to Hive.

All existing RecordWriters have encode and getSerDe methods.  I propose to 
expose these by making them public in AbstractRecordWriter, and making 
AbstractRecordWriter a public class (it is currently package private).  This 
keeps the RecordWriter interface clean (stream writers will not need to 
directly call these methods) and avoids any backwards incompatible changes.  
Having AbstractRecordWriter public is also desirable for anyone who wants to 
write their own RecordWriter.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12442) Refactor/repackage HiveServer2's Thrift code so that it can be used in the tasks

2015-11-17 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-12442:
---

 Summary: Refactor/repackage HiveServer2's Thrift code so that it 
can be used in the tasks
 Key: HIVE-12442
 URL: https://issues.apache.org/jira/browse/HIVE-12442
 Project: Hive
  Issue Type: Improvement
Reporter: Vaibhav Gumashta
Assignee: Rohit Dholakia


For implementing HIVE-12427, the tasks will need to have knowledge of Thrift 
types from HS2's Thrift API. This JIRA will look at the least invasive way to 
do that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12441) Driver.acquireLocksAndOpenTxn() should only call recordValidTxns() when needed

2015-11-17 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-12441:
-

 Summary: Driver.acquireLocksAndOpenTxn() should only call 
recordValidTxns() when needed
 Key: HIVE-12441
 URL: https://issues.apache.org/jira/browse/HIVE-12441
 Project: Hive
  Issue Type: Bug
  Components: CLI, Transactions
Affects Versions: 1.0.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman


recordValidTxns() is only needed if ACID tables are part of the query.  
Otherwise it's just overhead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12440) expose TxnHandler.abortTxns(Connection dbConn, List txnids) as metastore opertaion

2015-11-17 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-12440:
-

 Summary: expose TxnHandler.abortTxns(Connection dbConn, List 
txnids) as metastore opertaion
 Key: HIVE-12440
 URL: https://issues.apache.org/jira/browse/HIVE-12440
 Project: Hive
  Issue Type: Bug
  Components: HCatalog, Metastore, Thrift API, Transactions
Affects Versions: 1.0.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman


This is useful for the Streaming ingest API, where a txn batch is closed before 
all txns have been used up.

see TransactionBatch.close()/HIVE-12307

Requires Thrift change



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 40055: HIVE-12017

2015-11-17 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40055/#review106922
---


It seems we can still leverage a few more optimizations which don't need stats. 
The current patch is too aggressive and disables those when stats are not there. 
I think we should run these optimizations before attempting the join-ordering 
algorithm, since those rules will in most cases produce a compact, optimal plan.


ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java (lines 883 - 
889)


None of these rules need stats; it would be better to move them out of the try 
block so that we still leverage them even when stats are not available.


- Ashutosh Chauhan


On Nov. 16, 2015, 6:16 p.m., Jesús Camacho Rodríguez wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/40055/
> ---
> 
> (Updated Nov. 16, 2015, 6:16 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and John Pullokkaran.
> 
> 
> Bugs: HIVE-12017
> https://issues.apache.org/jira/browse/HIVE-12017
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Do not disable CBO by default when number of joins in a query is equal or 
> less than 1
> 
> 
> Diffs
> -
> 
>   hbase-handler/src/test/results/positive/hbase_queries.q.out 
> d044c7ed3874acaf521d83bdddfa02276bf71cb3 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelOptUtil.java 
> b4e7d47134357bc1e25af8642373ffb9babc015b 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveAggregateProjectMergeRule.java
>  53f04ee72d8a614a602ada688f89d1febd467689 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/SqlFunctionConverter.java
>  a17fb9498557fc95f273240c1484d69f514fcad0 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 
> de67b54a2c6cfd9bc4413ebf7f715e54c61b966f 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
> 1ca113cba30b26a38abf4910aafd0bec2bbd9a51 
>   ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out 
> 623c2e85a84919b41735913c3da32514f5d3ff22 
>   ql/src/test/results/clientnegative/join_nonexistent_part.q.out 
> 391dd0592611d7af8484c52efde3a50fb7dfa44d 
>   ql/src/test/results/clientpositive/annotate_stats_join_pkfk.q.out 
> aa380b20efee11a0a3a4c7acaeb9482444c1d3ce 
>   ql/src/test/results/clientpositive/archive_excludeHadoop20.q.out 
> c2b98727d21f4990ae7496a0a8fa9ac16598f4c0 
>   ql/src/test/results/clientpositive/archive_multi.q.out 
> 0ad29d122153bd4adf4d19064188b0c4f94e05ab 
>   ql/src/test/results/clientpositive/auto_join1.q.out 
> 48ad641788a6adfad5f7e4fcdfef3d67eac70a4e 
>   ql/src/test/results/clientpositive/auto_join10.q.out 
> fa6f62d18abbf517c4e49ac3fa9da190c23a119f 
>   ql/src/test/results/clientpositive/auto_join11.q.out 
> 851920b9dce7d9fb8d105ef81404f3f67166ad15 
>   ql/src/test/results/clientpositive/auto_join14.q.out 
> 47e1724ab18ac322a83f687fab37ea44c4fdf78a 
>   ql/src/test/results/clientpositive/auto_join24.q.out 
> 5b573033d317e3e7dbf70f9b6ef253b35ac7c140 
>   ql/src/test/results/clientpositive/auto_join26.q.out 
> 94ab76f750a2ce51a645012dcd5beb43b560445a 
>   ql/src/test/results/clientpositive/auto_join32.q.out 
> 161ab6b377a644e62a94d69aa9d3bba02b8045e6 
>   ql/src/test/results/clientpositive/auto_join_filters.q.out 
> a6720d908f4c5a354cb4f3234f8c288249d35d2d 
>   ql/src/test/results/clientpositive/auto_join_nulls.q.out 
> 4416f3e921a3590223658eb6b0e15c317733a7e2 
>   ql/src/test/results/clientpositive/auto_join_reordering_values.q.out 
> f42b45e635ca5e271d48ea6bc48c8b0e45ac67d1 
>   ql/src/test/results/clientpositive/auto_join_stats.q.out 
> d75d6c42eba366905afb4e6e171402c50581ba05 
>   ql/src/test/results/clientpositive/auto_join_stats2.q.out 
> a0aefa3de8aa07ab7f4a634fcc22b29ba621a6c5 
>   ql/src/test/results/clientpositive/auto_smb_mapjoin_14.q.out 
> 1dc9cd07cddb5bce3b2369c1776b690bb239e050 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out 
> f1aadef724d6f10ca4a710a3d11382e2f01ca1e5 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_10.q.out 
> fb1e6568de332e930e7836e09aef142f7f66eb17 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out 
> 5dad0fb366d4e1fc21a9a7ba034d60c942e8664e 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out 
> b1ba1483e1ab83c3f7ea71fddf5247bfc5dbde0b 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_14.q.out 
> 33c56fdc6d6f01377dd78e77b99c229ff437d802 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_15.q.out 
> 460e5b1b0f60c213f3a14172482a9a8f8e85454d 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out 
> a7a5faa8f8cc29ff53328e6db598cc0acf4cb68e 
>   ql/src/

[jira] [Created] (HIVE-12439) CompactionTxnHandler.markCleaned() add safety guards

2015-11-17 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-12439:
-

 Summary: CompactionTxnHandler.markCleaned() add safety guards
 Key: HIVE-12439
 URL: https://issues.apache.org/jira/browse/HIVE-12439
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Transactions
Affects Versions: 1.0.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman


# add distinct to s = "select txn_id from TXNS, TXN_COMPONENTS where txn_id = 
tc_txnid and txn_state = '" +
   TXN_ABORTED + "' and tc_database = '" + info.dbname + "' and 
tc_table = '" +

# add a safeguard to make sure IN clause is not too large; break up by txn id 
to delete from TXN_COMPONENTS where tc_txnid in ...
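A sketch of the cleanup query with the proposed DISTINCT; the state constant and the db/table literals below are placeholders for values bound at runtime ('a' is assumed for TXN_ABORTED):

```sql
-- Proposed form with DISTINCT, so a txn appearing in several components
-- is returned once; literal values are placeholders:
SELECT DISTINCT txn_id
FROM TXNS, TXN_COMPONENTS
WHERE txn_id = tc_txnid
  AND txn_state = 'a'
  AND tc_database = 'mydb'
  AND tc_table = 'mytable';
```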



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12438) Separate LLAP client side and server side config parameters

2015-11-17 Thread Siddharth Seth (JIRA)
Siddharth Seth created HIVE-12438:
-

 Summary: Separate LLAP client side and server side config 
parameters
 Key: HIVE-12438
 URL: https://issues.apache.org/jira/browse/HIVE-12438
 Project: Hive
  Issue Type: Improvement
Affects Versions: 2.0.0
Reporter: Siddharth Seth


Potentially separate out the files used as well. llap-daemon-site vs 
llap-client-site.

Most LLAP parameters are server-side only. For the ones which are required in 
clients / the AM, add an equivalent client-side parameter.

Also, the parameters which enable the LLAP cache could be renamed.

cc [~sershe]




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12437) SMB join in tez fails when one of the tables is empty

2015-11-17 Thread Vikram Dixit K (JIRA)
Vikram Dixit K created HIVE-12437:
-

 Summary: SMB join in tez fails when one of the tables is empty
 Key: HIVE-12437
 URL: https://issues.apache.org/jira/browse/HIVE-12437
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: 1.2.1, 1.0.1
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
Priority: Critical


It looks like a better check for empty tables is to depend on the existence of 
the record reader for the input from Tez.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12436) Default hive.metastore.schema.verification to true

2015-11-17 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-12436:
---

 Summary: Default hive.metastore.schema.verification to true
 Key: HIVE-12436
 URL: https://issues.apache.org/jira/browse/HIVE-12436
 Project: Hive
  Issue Type: Task
  Components: Metastore
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan


It enforces metastore schema version consistency.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12435) SELECT COUNT(CASE WHEN...) GROUPBY returns wrong results in a case of ORC

2015-11-17 Thread Takahiko Saito (JIRA)
Takahiko Saito created HIVE-12435:
-

 Summary: SELECT COUNT(CASE WHEN...) GROUPBY returns wrong results 
in a case of ORC
 Key: HIVE-12435
 URL: https://issues.apache.org/jira/browse/HIVE-12435
 Project: Hive
  Issue Type: Bug
  Components: ORC
Affects Versions: 2.0.0
Reporter: Takahiko Saito


Run the following query:
{noformat}
create table count_case_groupby (key string, bool boolean) STORED AS orc;
insert into table count_case_groupby values ('key1', true),('key2', 
false),('key3', NULL),('key4', false),('key5',NULL);
{noformat}
The table contains the following:
{noformat}
key1    true
key2    false
key3    NULL
key4    false
key5    NULL
{noformat}
The below query returns:
{noformat}
SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) AS 
cnt_bool0_ok FROM count_case_groupby GROUP BY key;
key1    1
key2    1
key3    1
key4    1
key5    1
{noformat}

while it expects the following results:
{noformat}
key1    1
key2    1
key3    0
key4    1
key5    0
{noformat}

The query works with Hive 1.2. It also works when the table is not in ORC format.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12434) Merge spark into master 11/17/2015

2015-11-17 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created HIVE-12434:
--

 Summary: Merge spark into master 11/17/2015
 Key: HIVE-12434
 URL: https://issues.apache.org/jira/browse/HIVE-12434
 Project: Hive
  Issue Type: Task
  Components: Spark
Affects Versions: 2.0.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


There are still a few patches that are in the Spark branch only. We need to 
merge them to master.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12433) Merge trunk into spark 11/17/2015 [Spark Branch]

2015-11-17 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created HIVE-12433:
--

 Summary: Merge trunk into spark 11/17/2015 [Spark Branch]
 Key: HIVE-12433
 URL: https://issues.apache.org/jira/browse/HIVE-12433
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Brock Noland
Assignee: Brock Noland
 Fix For: 1.1.0






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 40344: HIVE-6113 Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

2015-11-17 Thread Oleksiy Sayankin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40344/
---

(Updated Nov. 17, 2015, 11:22 a.m.)


Review request for hive.


Changes
---

Code change summary. 
Renamed:
datanucleus.validateTables ---> datanucleus.schema.validateTables
datanucleus.validateColumns ---> datanucleus.schema.validateColumns
datanucleus.validateConstraints ---> datanucleus.schema.validateConstraints
datanucleus.autoCreateSchema ---> datanucleus.schema.autoCreateAll

Deleted:
datanucleus.fixedDatastore


Repository: hive-git


Description
---

Bugs in DataNucleus and MySQL Connector! Any Hive user will potentially hit the 
problem.

My SELECT query got incorrect results because JDOQLQuery.compileQueryFull 
swallowed the fatal
datastore exception when it called RDBMSQueryUtils.getStatementForCandidates. 
This is a bug
in RDBMSQueryUtils.getStatementForCandidates or its caller(s).


Diffs (updated)
-

  beeline/src/test/org/apache/hive/beeline/cli/TestHiveCli.java 21ba690 
  beeline/src/test/resources/hive-site.xml b2347c7 
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 425c7d9 
  data/conf/hive-site.xml d15cc17 
  data/conf/llap/hive-site.xml becb5b2 
  data/conf/spark/standalone/hive-site.xml 38d0832 
  data/conf/spark/yarn-client/hive-site.xml ada3f3b 
  data/conf/tez/hive-site.xml d008ad1 
  hcatalog/src/test/e2e/templeton/deployers/config/hive/hive-site.mssql.xml 
8473d99 
  hcatalog/src/test/e2e/templeton/deployers/config/hive/hive-site.mysql.xml 
b6f1ab7 
  metastore/scripts/upgrade/mssql/README 8e5a33e 
  pom.xml 4a90cef 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java de2eb98 

Diff: https://reviews.apache.org/r/40344/diff/


Testing
---


Thanks,

Oleksiy Sayankin



[jira] [Created] (HIVE-12432) Hive on Spark Counter "RECORDS_OUT" always be zero

2015-11-17 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-12432:


 Summary: Hive on Spark Counter "RECORDS_OUT" always be zero
 Key: HIVE-12432
 URL: https://issues.apache.org/jira/browse/HIVE-12432
 Project: Hive
  Issue Type: Bug
  Components: Spark, Statistics
Affects Versions: 1.2.1
Reporter: Nemon Lou
Assignee: Nemon Lou


A simple way to reproduce:
set hive.execution.engine=spark;
CREATE TABLE test(id INT);
insert into test values (1),(2);



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: hive transaction strange behaviour

2015-11-17 Thread Sanjeev Verma
Thanks Elliot, Eugene.
I am able to see the base file created in one of the partitions; it seems the
compactor kicked in and created it, but it has not created base files in the
rest of the partitions, where delta files still exist. Why has the compactor
not picked the other partitions, and when and how will these partitions be
picked up for compaction?

Thanks

On Sat, Nov 14, 2015 at 11:01 PM, Eugene Koifman wrote:

> When the compaction process runs, it will create the base directory.
>
> https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-Configuration
>
>
> at a minimum you need hive.compactor.initiator.on=true
> and hive.compactor.worker.threads > 0
>
> Also, see
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AlterTable/PartitionCompact
> on how to trigger compaction manually.
>
> *Eugene*
>
> From: Sanjeev Verma 
> Reply-To: "u...@hive.apache.org" 
> Date: Thursday, November 12, 2015 at 11:41 PM
> To: "u...@hive.apache.org" , "dev@hive.apache.org" <
> dev@hive.apache.org>
> Subject: hive transaction strange behaviour
>
> I have enabled Hive transactions and am able to see the delta files created
> for some of the partitions, but I do not see any base file created yet. It
> seems strange to me to see so many delta files without any base file.
> Could somebody let me know when the base file is created?
>
> Thanks
>