[jira] [Created] (HIVE-12170) normalize HBase metastore connection configuration

2015-10-13 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-12170:
---

 Summary: normalize HBase metastore connection configuration
 Key: HIVE-12170
 URL: https://issues.apache.org/jira/browse/HIVE-12170
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Priority: Blocker


Right now there are two ways to get an HBaseReadWrite instance in the metastore.
Both return a thread-local instance (is there a good reason for that?).
1) One takes no conf and only works if someone has already called (2) before,
from any thread.
2) The other blindly sets a static conf and then gets an instance with that
conf; or, if someone already happened to call (1) or (2) from this thread, it
returns the existing instance with whatever conf was set before (but still
resets the current conf to the new conf).

This doesn't make sense even in the single-threaded case and can easily lead to
bugs as described; the config propagation logic is poor (example: HIVE-12167),
as some calls just reset the config blindly, so there's no point in setting
staticConf other than for callers who have no conf and would rely on the
static (which is bad design).
Reliably having connections with different configs is not possible, and
multi-threaded cases would also break: you could set a conf, have it reset,
and get an instance with somebody else's conf.

The static should definitely be removed, and maybe the thread-local too
(HConnection is thread-safe).
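A minimal sketch of the direction the last paragraph suggests: drop the static
and thread-local state and key instances on an explicit config identity, so
every caller passes its conf and differently-configured instances can coexist
across threads. All names here (ConnectionCache, string conf keys, Connection)
are illustrative stand-ins, not Hive classes:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ConnectionCache {
    // One instance per distinct conf, shared safely across threads; callers
    // must always pass their conf key, so nothing depends on static state.
    private final Map<String, Connection> cache = new ConcurrentHashMap<>();

    Connection get(String confKey) {
        // computeIfAbsent is atomic, so two threads asking for the same conf
        // get the same instance without external locking
        return cache.computeIfAbsent(confKey, Connection::new);
    }

    static final class Connection {
        final String confKey;
        Connection(String confKey) { this.confKey = confKey; }
    }

    public static void main(String[] args) {
        ConnectionCache cc = new ConnectionCache();
        System.out.println(cc.get("confA") == cc.get("confA")); // true: same conf, same instance
        System.out.println(cc.get("confA") == cc.get("confB")); // false: different confs coexist
    }
}
```

This relies on HConnection being thread-safe, as the report notes, so no
per-thread copy is needed.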



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12171) LLAP: BuddyAllocator failures when querying uncompressed data

2015-10-13 Thread Gopal V (JIRA)
Gopal V created HIVE-12171:
--

 Summary: LLAP: BuddyAllocator failures when querying uncompressed 
data
 Key: HIVE-12171
 URL: https://issues.apache.org/jira/browse/HIVE-12171
 Project: Hive
  Issue Type: Bug
Affects Versions: 2.0.0
Reporter: Gopal V
Assignee: Sergey Shelukhin


{code}
hive> select sum(l_extendedprice * l_discount) as revenue from testing.lineitem 
where l_shipdate >= '1993-01-01' and l_shipdate < '1994-01-01' ;

Caused by: 
org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
Failed to allocate 492; at 0 out of 1
at 
org.apache.hadoop.hive.llap.cache.BuddyAllocator.allocateMultiple(BuddyAllocator.java:176)
at 
org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.preReadUncompressedStream(EncodedReaderImpl.java:882)
at 
org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:319)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:413)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:194)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:191)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:191)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:74)
at 
org.apache.hadoop.hive.common.CallableWithNdc.call(CallableWithNdc.java:37)
... 4 more
{code}
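For background on how such a failure can arise: a buddy allocator hands out
only power-of-two block sizes at or above a minimum allocation unit, so many
small uncompressed-stream allocations can exhaust an arena faster than their
raw byte count suggests. A rounding sketch under that assumption (the numbers
and names are illustrative, not the actual BuddyAllocator code):

```java
class BuddyRounding {
    // A buddy allocator serves power-of-two blocks no smaller than its
    // minimum allocation unit, so a small request still consumes a full block.
    static int roundedSize(int request, int minAlloc) {
        int size = Math.max(request, minAlloc);
        // round up to the next power of two (size >= 1 here)
        return size <= 1 ? 1 : Integer.highestOneBit(size - 1) << 1;
    }

    public static void main(String[] args) {
        // the 492-byte request from the stack trace, with an illustrative
        // 256-byte minimum unit, would occupy a 512-byte buddy block
        System.out.println(roundedSize(492, 256)); // 512
    }
}
```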





[jira] [Created] (HIVE-12169) PCR: Should not fetch all the Partitions to Client always

2015-10-13 Thread Gopal V (JIRA)
Gopal V created HIVE-12169:
--

 Summary: PCR: Should not fetch all the Partitions to Client always
 Key: HIVE-12169
 URL: https://issues.apache.org/jira/browse/HIVE-12169
 Project: Hive
  Issue Type: Bug
  Components: Physical Optimizer
Affects Versions: 1.3.0, 2.0.0
Reporter: Gopal V
Assignee: Gopal V


Even for simple queries that only have a column filter, PCR does not check
whether PPR has already run, and therefore pulls all the partitions down to the
client for queries that do not have a partition filter at all.

{code}
org.apache.hadoop.hive.metastore.MetaStoreDirectSql.ensureList(Object) 
MetaStoreDirectSql.java:1656
org.apache.hadoop.hive.metastore.MetaStoreDirectSql.loopJoinOrderedResult(TreeMap,
 String, int, MetaStoreDirectSql$ApplyFunc) MetaStoreDirectSql.java:896
org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsFromPartitionIds(String,
 String, Boolean, List) MetaStoreDirectSql.java:644
org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilterInternal(String,
 String, Boolean, String, List, List, Integer) MetaStoreDirectSql.java:511
org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilter(String,
 String, List) MetaStoreDirectSql.java:376
org.apache.hadoop.hive.metastore.ObjectStore$5.getSqlResult(ObjectStore$GetHelper)
 ObjectStore.java:2159
org.apache.hadoop.hive.metastore.ObjectStore$5.getSqlResult(ObjectStore$GetHelper)
 ObjectStore.java:2146
org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(boolean) 
ObjectStore.java:2392
org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExprInternal(String,
 String, byte[], String, short, List, boolean, boolean) ObjectStore.java:2146
org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExpr(String, 
String, byte[], String, short, List) ObjectStore.java:2136
sun.reflect.GeneratedMethodAccessor72.invoke(Object, Object[])
sun.reflect.DelegatingMethodAccessorImpl.invoke(Object, Object[]) 
DelegatingMethodAccessorImpl.java:43
java.lang.reflect.Method.invoke(Object, Object[]) Method.java:497
org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(Object, Method, Object[]) 
RawStoreProxy.java:117
com.sun.proxy.$Proxy28.getPartitionsByExpr(String, String, byte[], String, 
short, List)
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_expr(PartitionsByExprRequest)
 HiveMetaStore.java:4545
sun.reflect.GeneratedMethodAccessor71.invoke(Object, Object[])
sun.reflect.DelegatingMethodAccessorImpl.invoke(Object, Object[]) 
DelegatingMethodAccessorImpl.java:43
java.lang.reflect.Method.invoke(Object, Object[]) Method.java:497
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(Object, 
Method, Object[]) RetryingHMSHandler.java:138
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(Object, Method, 
Object[]) RetryingHMSHandler.java:99
com.sun.proxy.$Proxy30.get_partitions_by_expr(PartitionsByExprRequest)
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsByExpr(String,
 String, byte[], String, short, List) HiveMetaStoreClient.java:1160
sun.reflect.GeneratedMethodAccessor70.invoke(Object, Object[])
sun.reflect.DelegatingMethodAccessorImpl.invoke(Object, Object[]) 
DelegatingMethodAccessorImpl.java:43
java.lang.reflect.Method.invoke(Object, Object[]) Method.java:497
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(Object, Method, 
Object[]) RetryingMetaStoreClient.java:156
com.sun.proxy.$Proxy31.listPartitionsByExpr(String, String, byte[], String, 
short, List)
org.apache.hadoop.hive.ql.metadata.Hive.getPartitionsByExpr(Table, 
ExprNodeGenericFuncDesc, HiveConf, List) Hive.java:2361
org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.getPartitionsFromServer(Table,
 ExprNodeGenericFuncDesc, HiveConf, String, Set, boolean) 
PartitionPruner.java:420
org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.prune(Table, 
ExprNodeDesc, HiveConf, String, Map) PartitionPruner.java:221
org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.prune(TableScanOperator,
 ParseContext, String) PartitionPruner.java:144
org.apache.hadoop.hive.ql.parse.ParseContext.getPrunedPartitions(String, 
TableScanOperator) ParseContext.java:460
org.apache.hadoop.hive.ql.optimizer.pcr.PcrOpProcFactory$FilterPCR.process(Node,
 Stack, NodeProcessorCtx, Object[]) PcrOpProcFactory.java:110
org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(Node, Stack, 
Object[]) DefaultRuleDispatcher.java:90
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(Node, Stack) 
DefaultGraphWalker.java:105
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(Node, Stack) 
DefaultGraphWalker.java:89
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(Node) 
DefaultGraphWalker.java:158
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(Collection, 
HashMap) DefaultGraphWalker.java:120
{code}
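A sketch of the guard the report implies: decide from the filter's column set
whether a metastore partition fetch is needed at all. The method and variable
names are hypothetical, not Hive APIs:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class PcrGuard {
    // Fetch partitions only when the filter references at least one partition
    // column; a pure column filter never needs the metastore round trip.
    static boolean needsPartitionFetch(Set<String> filterColumns, Set<String> partitionColumns) {
        for (String col : filterColumns) {
            if (partitionColumns.contains(col)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> partCols = new HashSet<>(Arrays.asList("ds"));
        // filter only touches data columns -> no partition fetch needed
        System.out.println(needsPartitionFetch(new HashSet<>(Arrays.asList("l_discount")), partCols)); // false
    }
}
```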

[jira] [Created] (HIVE-12156) expanding view doesn't quote reserved keyword

2015-10-13 Thread Jay Lee (JIRA)
Jay Lee created HIVE-12156:
--

 Summary: expanding view doesn't quote reserved keyword
 Key: HIVE-12156
 URL: https://issues.apache.org/jira/browse/HIVE-12156
 Project: Hive
  Issue Type: Bug
  Components: Parser
Affects Versions: 1.2.1
 Environment: hadoop 2.7
hive 1.2.1
Reporter: Jay Lee


hive> create table testreserved (data struct<`end`:string, id: string>);
OK
Time taken: 0.274 seconds
hive> create view testreservedview as select data.`end` as data_end, data.id as 
data_id from testreserved;
OK
Time taken: 0.769 seconds
hive> select data.`end` from testreserved;
OK
Time taken: 1.852 seconds
hive> select data_id from testreservedview;
NoViableAltException(98@[])
at 
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:10858)
at 
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceFieldExpression(HiveParser_IdentifiersParser.java:6438)
at 
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceUnaryPrefixExpression(HiveParser_IdentifiersParser.java:6768)
at 
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceUnarySuffixExpression(HiveParser_IdentifiersParser.java:6828)
at 
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceBitwiseXorExpression(HiveParser_IdentifiersParser.java:7012)
at 
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceStarExpression(HiveParser_IdentifiersParser.java:7172)
at 
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedencePlusExpression(HiveParser_IdentifiersParser.java:7332)
at 
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceAmpersandExpression(HiveParser_IdentifiersParser.java:7483)
at 
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceBitwiseOrExpression(HiveParser_IdentifiersParser.java:7634)
at 
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceEqualExpression(HiveParser_IdentifiersParser.java:8164)
at 
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceNotExpression(HiveParser_IdentifiersParser.java:9177)
at 
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceAndExpression(HiveParser_IdentifiersParser.java:9296)
at 
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceOrExpression(HiveParser_IdentifiersParser.java:9455)
at 
org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.expression(HiveParser_IdentifiersParser.java:6105)
at 
org.apache.hadoop.hive.ql.parse.HiveParser.expression(HiveParser.java:45840)
at 
org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectItem(HiveParser_SelectClauseParser.java:2907)
at 
org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectList(HiveParser_SelectClauseParser.java:1373)
at 
org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectClause(HiveParser_SelectClauseParser.java:1128)
at 
org.apache.hadoop.hive.ql.parse.HiveParser.selectClause(HiveParser.java:45827)
at 
org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:41495)
at 
org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:41402)
at 
org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:40413)
at 
org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:40283)
at 
org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1590)
at 
org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1109)
at 
org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:202)
...

When the view is expanded, the field should be quoted with backquotes.
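The fix direction in the last sentence can be sketched as a small quoting
helper; assuming backquotes escape themselves by doubling, this is illustrative
rather than the actual Hive unparse code:

```java
class IdentifierQuoting {
    // Wrap an identifier in backquotes, doubling embedded backquotes, so a
    // reserved word like `end` stays parseable after view expansion.
    static String quote(String name) {
        return "`" + name.replace("`", "``") + "`";
    }

    public static void main(String[] args) {
        // the expanded view text would then read: select data.`end` as data_end ...
        System.out.println("select data." + quote("end") + " as data_end from testreserved");
    }
}
```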





[jira] [Created] (HIVE-12157) select-clause doesn't support unicode alias

2015-10-13 Thread richard du (JIRA)
richard du created HIVE-12157:
-

 Summary: select-clause doesn't support unicode alias
 Key: HIVE-12157
 URL: https://issues.apache.org/jira/browse/HIVE-12157
 Project: Hive
  Issue Type: Improvement
  Components: hpl/sql
Affects Versions: 1.2.1
Reporter: richard du
Priority: Trivial


The parser throws an exception when I use a unicode alias for the selected column:
hive> desc test;
OK
a   int 
b   string  
Time taken: 0.135 seconds, Fetched: 2 row(s)
hive> select a as 行1 from test limit 10;
NoViableAltException(302@[134:7: ( ( ( KW_AS )? identifier ) | ( KW_AS LPAREN 
identifier ( COMMA identifier )* RPAREN ) )?])
at org.antlr.runtime.DFA.noViableAlt(DFA.java:158)
at org.antlr.runtime.DFA.predict(DFA.java:116)
at 
org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectItem(HiveParser_SelectClauseParser.java:2915)
at 
org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectList(HiveParser_SelectClauseParser.java:1373)
at 
org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectClause(HiveParser_SelectClauseParser.java:1128)
at 
org.apache.hadoop.hive.ql.parse.HiveParser.selectClause(HiveParser.java:45827)
at 
org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:41495)
at 
org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:41402)
at 
org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:40413)
at 
org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:40283)
at 
org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1590)
at 
org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1109)
at 
org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:202)
at 
org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:396)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
at 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
at 
org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
FAILED: ParseException line 1:13 cannot recognize input near 'as' '1' 'from' in 
selection target





[jira] [Created] (HIVE-12158) Add methods to HCatClient for partition synchronization

2015-10-13 Thread David Maughan (JIRA)
David Maughan created HIVE-12158:


 Summary: Add methods to HCatClient for partition synchronization
 Key: HIVE-12158
 URL: https://issues.apache.org/jira/browse/HIVE-12158
 Project: Hive
  Issue Type: Improvement
  Components: HCatalog
Reporter: David Maughan
Priority: Minor


We have a use case where we have a list of partitions that are created as a 
result of a batch job (new or updated) outside of Hive and would like to 
synchronize them with the Hive MetaStore. We would like to use the HCatalog 
{{HCatClient}} but it currently does not seem to support this. However it is 
possible with the {{HiveMetaStoreClient}} directly. I am proposing to add the 
following methods to {{HCatClient}} and {{HCatClientHMSImpl}}:

1. A method for altering partitions. The implementation would delegate to 
{{HiveMetaStoreClient#alter_partitions}}. I've used "update" instead of "alter" 
in the name so it's consistent with the {{HCatClient#updateTableSchema}} method.

{code}
public void updatePartitions(List partitions) throws 
HCatException
{code}

2. A method for altering or adding partitions depending on whether they already 
exist or not. The implementation would split the given list into a list of 
existing partitions (using {{HiveMetaStoreClient#getPartitionsByNames}} and 
{{Warehouse#makePartName}} to determine existence), and a list of new 
partitions. Then the appropriate add/update calls would be issued:

{code}
public void addOrUpdatePartitions(List partitions) throws 
HCatException
{code}

Are these acceptable? Are there any standards that should be followed here?
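The split described in (2) can be sketched with plain collections; here
existingNames stands in for the result of
HiveMetaStoreClient#getPartitionsByNames plus Warehouse#makePartName, and the
method names are illustrative, not proposed API:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class PartitionSync {
    // Partition names already known to the metastore go to the "update" list,
    // the rest to "add"; the caller then issues the two bulk calls.
    static Map<String, List<String>> split(List<String> partNames, Set<String> existingNames) {
        List<String> toUpdate = new ArrayList<>();
        List<String> toAdd = new ArrayList<>();
        for (String name : partNames) {
            (existingNames.contains(name) ? toUpdate : toAdd).add(name);
        }
        Map<String, List<String>> result = new HashMap<>();
        result.put("update", toUpdate);
        result.put("add", toAdd);
        return result;
    }

    public static void main(String[] args) {
        Set<String> existing = new HashSet<>(Arrays.asList("dt=2015-10-12"));
        System.out.println(split(Arrays.asList("dt=2015-10-12", "dt=2015-10-13"), existing));
    }
}
```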





Build failed in Jenkins: HIVE-TRUNK-JAVA8 #122

2015-10-13 Thread hiveqa
See 


Changes:

[sershe] Commit some initial prototype code

[sershe] Fix build

[sershe] delete wrongly resolved conflicts

[sershe] Refactor prototype with regard to updated design, remove chunk format 
(currently discarded). Builds, but doesn't work in any form; cache is a no-op 
too (layering for cache will be different)

[prasanthj] experimental version of orc metadata cache

[sershe] Preliminary patch for low-level cache, needs few more touches and LRFU 
policy would not be thread-safe

[sershe] Finish reworking LRFU policy for low-level cache (not clear if it's a 
good pick due to concurrency); tests; some pipeline adjustments

[sershe] Separated allocator and cache; unit tests for allocator, fixed a bunch 
of bugs

[sershe] Add tests for cache, clean up allocator debug logging left over

[sershe] Re-add thread pool (in arbitrary place), input format wrapping; add 
pause w/o implementation

[sershe] Fix a pom, some minor changes to visibility

[sershe] Actually fix pom

[sershe] Disable LLAP IO for now

[sershe] HIVE-9418p1 : ORC using low-level cache

[sershe] Follow-up to previous commit - logging and comment

[sershe] Rename some config settings, fix issue

[sershe] Disable LLAP IO again

[gunther] HIVE-9460: LLAP: Fix some static vars in the operator pipeline 
(Gunther Hagleitner)

[gunther] HIVE-9461: LLAP: Enable local mode tests on tez to facilitate llap 
testing (Gunther Hagleitner)

[gunther] HIVE-9506: LLAP: Add an execution daemon (Siddharth Seth via Gunther 
Hagleitner)

[sershe] HIVE-9418p2 : Part of the encoded data production pipeline 
(incomplete, only to allow parallel work)

[sershe] HIVE-9418p3 : Part of the encoded data production pipeline - missing 
index and bugfixing

[sershe] HIVE-9418p4 : Index reading and more bugfixing (mostly from trunk port)

[sershe] HIVE-9418p5 : Yet more bugfixes

[sershe] HIVE-9418p6 : Yet more bugfixes, error handling improvement (another 
pointless commit for parallel work enablement)

[sershe] HIVE-9418p7 : Final bugfixes, at least some queries work thru lower 
level now

[sershe] HIVE-9418p7 : Additional logging; also fix build break from some 
previous commit

[prasanthj] HIVE-9419: LLAP: ORC decoding of row-groups (Partial patch)

[sershe] HIVE-9419p1: LLAP: ORC decoding of row-groups - add stream kind for 
ease of decoding

[sershe] HIVE-9418p8 : Refactor code out of RecordReader

[sershe] Remove VectorReader; roll it into InputFormat, since it's no longer an 
interface to LLAP

[prasanthj] HIVE-9419p2: LLAP: ORC decoding of row-groups (Partial patch2)

[prasanthj] HIVE-9419p3: LLAP: ORC decoding of row-groups (Partial patch3)

[sershe] Bugfixes, comments and the test (doesn't work yet)

[sershe] Amend previous commit

[sershe] Fix one more issue

[sershe] Yet more fixes; right now buffers are permanently locked in cache 
after reading, which I will fix shortly (better than assert)

[gunther] HIVE-9635: LLAP: I'm the decider (Gunther Hagleitner)

[prasanthj] HIVE-9419p4: LLAP: ORC decoding of row-groups (Partial patch4)

[gunther] HIVE-9694: LLAP: add check for udfs/udafs to llapdecider (Gunther 
Hagleitner)

[prasanthj] HIVE-9419p5: LLAP: ORC decoding of row-groups (Partial patch5)

[prasanthj] HIVE-9419p6: LLAP: ORC decoding of row-groups (Partial patch6)

[sershe] HIVE-9728 : LLAP: add heap mode to allocator (for q files, YARN w/o 
direct buffer accounting support)

[prasanthj] HIVE-9419p7: LLAP: ORC decoding of row-groups (Partial patch7)

[sershe] HIVE-9730p1 : LLAP: make sure logging is never called when not needed 
(in LLAP proper)

[prasanthj] HIVE-9419: LLAP: ORC decoding of row-groups (final commit)

[sershe] Change the way DiskRange-s are managed, and fix decref for cache, some 
more bugfixes

[prasanthj] HIVE-9751: LLAP: Fix issue with reading last row group of string 
column (Prasanth Jayachandran)

[sershe] Fix a bug caused by compressed boundary estimates for repeated queries

[sershe] Fix another bug and update the q files

[prasanthj] Adding llap client and server to packaging pom

[gunther] HIVE-9654: LLAP: initialize IO during service startup, with service 
configuration (Gunther Hagleitner)

[sershe] Remove FS cache, fix potential read issue

[sershe] Fix split generation NPE

[sershe] HIVE-9653 : LLAP: create a reasonable q file test for ORC IO

[sershe] HIVE-9759 : LLAP: Update launcher, scheduler to work with Tez changes 
(Siddharth Seth)

[sershe] Another bunch of checks for HDFS

[gunther] HIVE-9782: LLAP: hook up decider + dag utils (Gunther Hagleitner)

[gunther] HIVE-9761: LLAP: Misc fixes to launch scripts, startup error handling 
(Siddharth Seth via Gunther Hagleitner)

[gunther] HIVE-9765: LLAP: uber mode where applicable (Gunther Hagleitner)

[gopalv] HIVE-9764: Update tez dependency in branch & fix build (Siddharth Seth 
via Gopal V)

[gunther] HIVE-9776: LLAP: add simple way to determine whether you're running in 

[jira] [Created] (HIVE-12159) Create vectorized readers for the complex types

2015-10-13 Thread Owen O'Malley (JIRA)
Owen O'Malley created HIVE-12159:


 Summary: Create vectorized readers for the complex types
 Key: HIVE-12159
 URL: https://issues.apache.org/jira/browse/HIVE-12159
 Project: Hive
  Issue Type: Sub-task
Reporter: Owen O'Malley
Assignee: Owen O'Malley


We need vectorized readers for the complex types.





[jira] [Created] (HIVE-12160) Hbase table query execution fails in secured cluster when hive.exec.mode.local.auto is set to true

2015-10-13 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-12160:
---

 Summary: Hbase table query execution fails in secured cluster when 
hive.exec.mode.local.auto is set to true
 Key: HIVE-12160
 URL: https://issues.apache.org/jira/browse/HIVE-12160
 Project: Hive
  Issue Type: Bug
  Components: Security
Affects Versions: 1.1.0
Reporter: Aihua Xu


In a secured cluster with kerberos, a simple query like {{select count(*) from 
hbase_table; }} will fail with the following exception when 
hive.exec.mode.local.auto is set to true.

{noformat}
Error: Error while processing statement: FAILED: Execution Error, return code 
134 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=134)
{noformat}







Re: Review Request 39248: Fix Greatest/Least UDF to SQL standard (HIVE-12070 and HIVE-12082)

2015-10-13 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39248/#review102474
---



ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseNwayCompare.java
 (line 74)


Should we limit the types to numeric? From here I don't know how mixed
types are compared, especially when there is no common type.



ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseNwayCompare.java
 (line 88)


Instead of two code paths (two methods), can we just always do conversions?
When there is no need for conversion, the converter will be an identity
converter, which doesn't incur any performance penalty. This way, the code will
be simpler and cleaner, in my opinion.



ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLeast.java (line 30)


Should this class inherit from the base instead of from "greatest", from which
I don't see it getting anything?


- Xuefu Zhang
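The single-code-path, NULL-propagating semantics the patch describes can be
sketched as follows; converter handling is elided and the names are
illustrative, not the actual GenericUDFBaseNwayCompare code:

```java
import java.util.Arrays;
import java.util.List;

class NwayCompare {
    // One comparison path for all arguments; any NULL argument makes the
    // whole result NULL, per the SQL standard.
    static <T extends Comparable<T>> T greatest(List<T> args) {
        T best = null;
        for (T a : args) {
            if (a == null) {
                return null; // NULL in, NULL out
            }
            if (best == null || a.compareTo(best) > 0) {
                best = a;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(greatest(Arrays.asList(1, 3, 2)));    // 3
        System.out.println(greatest(Arrays.asList(1, null, 2))); // null
    }
}
```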


On Oct. 13, 2015, 12:18 a.m., Szehon Ho wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/39248/
> ---
> 
> (Updated Oct. 13, 2015, 12:18 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-12082
> https://issues.apache.org/jira/browse/HIVE-12082
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> See HIVE-12070 and HIVE-12082 for discussions and findings.  Refactored the 
> Greatest/Least UDF's to be inline with the SQL-standard spec of comparison 
> operators, and the mysql implementation of greatest/least functional UDF's.
> 
> The main functional changes is that:
> 1.  Different types can be compared now, but used to throw an exception.  
> Comparison uses the same logic as binary comparison operators, ie greaterThan.
> 2.  If any argument is NULL, the result is null.  NULLs used to be ignored in 
> the comparison in favor of non-null values, which violates the SQL-standard.
> 
> Code changes:
> Common logics is captured in the new class 'GenericUDFBaseNWayCompare', which 
> does a linear comparison in the two-cases where arguments are of same type 
> and different type, in the latter it uses Converters.  The class that it uses 
> is ObjectInspectorUtils.compare(), which is the same as the binary comparison 
> operators.
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseNwayCompare.java
>  PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFGreatest.java 
> e1eab89 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLeast.java 
> 64a1b47 
>   
> ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFGreatest.java 
> 55d7d5d 
>   ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFLeast.java 
> 47e4801 
>   ql/src/test/queries/clientnegative/udf_greatest_error_2.q b270a1a 
>   ql/src/test/queries/clientnegative/udf_greatest_error_3.q ba21748 
>   ql/src/test/queries/clientnegative/udf_greatest_error_4.q ae6d928 
>   ql/src/test/queries/clientpositive/udf_greatest.q 02c7d3c 
>   ql/src/test/queries/clientpositive/udf_least.q a754ef0 
>   ql/src/test/results/clientnegative/udf_greatest_error_2.q.out 9a6348c 
>   ql/src/test/results/clientnegative/udf_greatest_error_3.q.out 3fb3499 
>   ql/src/test/results/clientnegative/udf_greatest_error_4.q.out 58b4c44 
>   ql/src/test/results/clientpositive/udf_greatest.q.out 10f1c2d 
>   ql/src/test/results/clientpositive/udf_least.q.out 6983137 
> 
> Diff: https://reviews.apache.org/r/39248/diff/
> 
> 
> Testing
> ---
> 
> Added more unit tests, and q tests.
> 
> 
> Thanks,
> 
> Szehon Ho
> 
>



Re: Review Request 39199: HIVE-12084 : Hive queries with ORDER BY and large LIMIT fails with OutOfMemoryError Java heap space

2015-10-13 Thread Hari Sankar Sivarama Subramaniyan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39199/
---

(Updated Oct. 13, 2015, 7:37 p.m.)


Review request for hive, Ashutosh Chauhan and John Pullokkaran.


Repository: hive-git


Description
---

Please look at https://issues.apache.org/jira/browse/HIVE-12084


Diffs (updated)
-

  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/PhysicalOptimizer.java 
49706b1 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/VerifyTopNMemoryUsage.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java a60527b 
  ql/src/test/queries/clientpositive/topn.q PRE-CREATION 
  ql/src/test/results/clientpositive/topn.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/39199/diff/


Testing
---


Thanks,

Hari Sankar Sivarama Subramaniyan



Re: Review Request 39199: HIVE-12084 : Hive queries with ORDER BY and large LIMIT fails with OutOfMemoryError Java heap space

2015-10-13 Thread John Pullokkaran

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39199/#review102527
---



ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/VerifyTopNMemoryUsage.java
 (line 51)


1. The memory requirement for the hash reducer should live in TopNHash and be
overridden by subclasses if needed.
2. We should use the average tuple size and remove the dependency on key size.
3. The runtime checks in TopNHash should use the memory requirement from #1.


- John Pullokkaran
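The sizing rule in points 1-2 can be sketched as a simple estimate check; the
heap fraction and every name here are illustrative assumptions, not actual
TopNHash fields:

```java
class TopNMemoryCheck {
    // Keep the in-memory top-N hash only when limit * avgTupleSize fits in
    // the allowed slice of the heap; otherwise fall back to a plain sort.
    static boolean fitsInMemory(long topN, long avgTupleSize, long maxHeapBytes, double allowedFraction) {
        long estimated = topN * avgTupleSize;
        return estimated <= (long) (maxHeapBytes * allowedFraction);
    }

    public static void main(String[] args) {
        // a LIMIT of 100M rows at ~100 bytes each (~10 GB) must not be hashed
        // in a 1 GB heap with a 25% budget
        System.out.println(fitsInMemory(100_000_000L, 100, 1L << 30, 0.25)); // false
    }
}
```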


On Oct. 13, 2015, 7:37 p.m., Hari Sankar Sivarama Subramaniyan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/39199/
> ---
> 
> (Updated Oct. 13, 2015, 7:37 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and John Pullokkaran.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Please look at https://issues.apache.org/jira/browse/HIVE-12084
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/PhysicalOptimizer.java
>  49706b1 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/VerifyTopNMemoryUsage.java
>  PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java a60527b 
>   ql/src/test/queries/clientpositive/topn.q PRE-CREATION 
>   ql/src/test/results/clientpositive/topn.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/39199/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Hari Sankar Sivarama Subramaniyan
> 
>



[jira] [Created] (HIVE-12161) MiniTez test is very slow since LLAP branch merge

2015-10-13 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-12161:
---

 Summary: MiniTez test is very slow since LLAP branch merge
 Key: HIVE-12161
 URL: https://issues.apache.org/jira/browse/HIVE-12161
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin


Before the merge, the tests took ~4 hrs (total parallelized time, not
wall-clock time); after the merge they take 12-15 hrs. First such build:
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5622/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/

The session reuse patch, which used to make them super fast, now makes them run
in 2 hrs:
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5628/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/
which is still a lot.

Need to investigate why tests are slow regardless of AM reuse.





[jira] [Created] (HIVE-12162) MiniTez tests take forever to shut down

2015-10-13 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-12162:
---

 Summary: MiniTez tests take forever to shut down
 Key: HIVE-12162
 URL: https://issues.apache.org/jira/browse/HIVE-12162
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin


Even before LLAP branch merge and with AM reuse, there's this:
{noformat}
testCliDriver_shutdown  1 min 8 sec Passed
testCliDriver_shutdown  1 min 7 sec Passed
testCliDriver_shutdown  1 min 7 sec Passed
testCliDriver_shutdown  1 min 6 sec Passed
testCliDriver_shutdown  1 min 6 sec Passed
testCliDriver_shutdown  1 min 5 sec Passed
testCliDriver_shutdown  1 min 5 sec Passed
testCliDriver_shutdown  1 min 5 sec Passed
testCliDriver_shutdown  1 min 4 sec Passed
testCliDriver_shutdown  1 min 4 sec Passed
testCliDriver_shutdown  1 min 4 sec Passed
testCliDriver_shutdown  1 min 4 sec Passed
testCliDriver_shutdown  1 min 4 sec Passed
testCliDriver_shutdown  1 min 4 sec Passed
testCliDriver_shutdown  1 min 3 sec Passed
testCliDriver_shutdown  1 min 3 sec Passed
testCliDriver_shutdown  1 min 3 sec Passed
testCliDriver_shutdown  1 min 3 sec Passed
testCliDriver_shutdown  1 min 3 sec Passed
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: proxy errors from HiveQA Jenkins?

2015-10-13 Thread Szehon Ho
Yea, I also noticed it's overloaded sometimes.

I think in this case it is a very large http response. I can try bumping the
timeout, but other than that I don't have many ideas.

Thanks
Szehon

On Tue, Oct 13, 2015 at 11:58 AM, Sergey Shelukhin 
wrote:

> I am trying to look at slow tests on HiveQA and tests’ history.
> I keep getting 502 proxy error - often from history of any kind, rarely
> from test results, sometimes just from random pages.
> E.g.
>
> Proxy ErrorThe proxy server received an invalid
> response from an upstream server.
> The proxy server could not handle the request GET
> /jenkins/job/PreCommit-HIVE-TRUNK-Build/5623/testReport/org.apache.hadoop.h
> ive.cli/TestMiniTezCliDriver/history/
> <
> http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HI
> VE-TRUNK-Build/5623/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDri
> ver/history/>.
> Reason: Error reading from remote server
>
>
> Is HiveQA jenkins ok? :)
>
>


Re: proxy errors from HiveQA Jenkins?

2015-10-13 Thread Sergey Shelukhin
Sure… how did you find out that it was a large http response? It seems like a
history page with one chart would not be much larger than any other page.

On 15/10/13, 12:07, "Szehon Ho"  wrote:

>Yea I also noticed its overloaded sometimes.
>
>I think in this case, it is very large http response.. I can try bump the
>timeout, but other than that I dont have that many ideas.
>
>Thanks
>Szehon
>
>On Tue, Oct 13, 2015 at 11:58 AM, Sergey Shelukhin
>
>wrote:
>
>> I am trying to look at slow tests on HiveQA and tests’ history.
>> I keep getting 502 proxy error - often from history of any kind, rarely
>> from test results, sometimes just from random pages.
>> E.g.
>>
>> Proxy ErrorThe proxy server received an invalid
>> response from an upstream server.
>> The proxy server could not handle the request GET
>> 
>>/jenkins/job/PreCommit-HIVE-TRUNK-Build/5623/testReport/org.apache.hadoop
>>.h
>> ive.cli/TestMiniTezCliDriver/history/
>> <
>> 
>>http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-H
>>I
>> 
>>VE-TRUNK-Build/5623/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliD
>>ri
>> ver/history/>.
>> Reason: Error reading from remote server
>>
>>
>> Is HiveQA jenkins ok? :)
>>
>>



proxy errors from HiveQA Jenkins?

2015-10-13 Thread Sergey Shelukhin
I am trying to look at slow tests on HiveQA and tests’ history.
I keep getting 502 proxy error - often from history of any kind, rarely
from test results, sometimes just from random pages.
E.g.

Proxy ErrorThe proxy server received an invalid
response from an upstream server.
The proxy server could not handle the request GET
/jenkins/job/PreCommit-HIVE-TRUNK-Build/5623/testReport/org.apache.hadoop.h
ive.cli/TestMiniTezCliDriver/history/
.
Reason: Error reading from remote server


Is HiveQA jenkins ok? :)



Re: proxy errors from HiveQA Jenkins?

2015-10-13 Thread Sergey Shelukhin
Actually, is it possible to get a database dump, or something similar, of the
test stats from wherever they are stored?

On 15/10/13, 12:14, "Sergey Shelukhin"  wrote:

>Sure… how did you find out about large http response? It seems like a
>history page with one chart would not be much larger than any other page.
>
>On 15/10/13, 12:07, "Szehon Ho"  wrote:
>
>>Yea I also noticed its overloaded sometimes.
>>
>>I think in this case, it is very large http response.. I can try bump the
>>timeout, but other than that I dont have that many ideas.
>>
>>Thanks
>>Szehon
>>
>>On Tue, Oct 13, 2015 at 11:58 AM, Sergey Shelukhin
>>
>>wrote:
>>
>>> I am trying to look at slow tests on HiveQA and tests’ history.
>>> I keep getting 502 proxy error - often from history of any kind, rarely
>>> from test results, sometimes just from random pages.
>>> E.g.
>>>
>>> Proxy ErrorThe proxy server received an invalid
>>> response from an upstream server.
>>> The proxy server could not handle the request GET
>>> 
>>>/jenkins/job/PreCommit-HIVE-TRUNK-Build/5623/testReport/org.apache.hadoo
>>>p
>>>.h
>>> ive.cli/TestMiniTezCliDriver/history/
>>> <
>>> 
>>>http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-
>>>H
>>>I
>>> 
>>>VE-TRUNK-Build/5623/testReport/org.apache.hadoop.hive.cli/TestMiniTezCli
>>>D
>>>ri
>>> ver/history/>.
>>> Reason: Error reading from remote server
>>>
>>>
>>> Is HiveQA jenkins ok? :)
>>>
>>>
>



[jira] [Created] (HIVE-12165) wrong result when hive.optimize.sampling.orderby=true with some aggregate functions

2015-10-13 Thread ErwanMAS (JIRA)
ErwanMAS created HIVE-12165:
---

 Summary: wrong result when hive.optimize.sampling.orderby=true 
with some aggregate functions
 Key: HIVE-12165
 URL: https://issues.apache.org/jira/browse/HIVE-12165
 Project: Hive
  Issue Type: Bug
 Environment: hortonworks  2.3

Reporter: ErwanMAS
Priority: Critical


This simple query gives a wrong result when I use parallel order-by.

{noformat}
select count(*) , count(distinct dummyint ) , min(dummyint),max(dummyint) from 
foobar_1M ;
{noformat}

Current wrong result :

{noformat}
c0  c1  c2  c3
32740   32740   0   163695
113172  113172  163700  729555
54088   54088   729560  95
{noformat}

Right result :
{noformat}
c0  c1  c2  c3
100 100 0   99
{noformat}

The SQL script for my test:
{noformat}
drop table foobar_1M ;
create table foobar_1M ( dummyint bigint  , dummystr string ) ;

insert overwrite table foobar_1M
   select val_int  , concat('dummy ',val_int) from
 ( select (((((d_1*10)+d_2)*10+d_3)*10+d_4)*10+d_5)*10+d_6 as 
val_int from foobar_1
 lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) 
tbl_1 as d_1
 lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) 
tbl_2 as d_2
 lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) 
tbl_3 as d_3
 lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) 
tbl_4 as d_4
 lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) 
tbl_5 as d_5
 lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) 
tbl_6 as d_6  ) as f ;


set hive.optimize.sampling.orderby.number=1;
set hive.optimize.sampling.orderby.percent=0.1f;
set mapreduce.job.reduces=3 ;

set hive.optimize.sampling.orderby=false;

select count(*) , count(distinct dummyint ) , min(dummyint),max(dummyint) from 
foobar_1M ;

set hive.optimize.sampling.orderby=true;

select count(*) , count(distinct dummyint ) , min(dummyint),max(dummyint) from 
foobar_1M ;
{noformat}
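A hedged illustration of the failure mode (my reading of the symptom, not Hive internals): the sampled order-by range-partitions rows across the 3 reducers, and if the global aggregate is then finalized per reducer without a final merge stage, the query returns one partial row per reducer, which matches the shape of the 3-row wrong result above. Sketch with illustrative names:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.LongStream;

// Illustrative sketch (not Hive internals): with
// hive.optimize.sampling.orderby=true, rows are range-partitioned across
// reducers. Finalizing a global aggregate per reducer, with no merge stage,
// emits one partial row per reducer (mapreduce.job.reduces=3 above).
public class PartialAggregates {
    // Aggregate one partition: {count, min, max}.
    static long[] aggregate(List<Long> part) {
        long count = part.size();
        long min = part.stream().mapToLong(Long::longValue).min().orElse(0);
        long max = part.stream().mapToLong(Long::longValue).max().orElse(0);
        return new long[]{count, min, max};
    }

    // Range-partition [0, n) into `reducers` contiguous chunks, as a
    // sampled order-by would.
    static List<List<Long>> rangePartition(long n, int reducers) {
        List<List<Long>> parts = new ArrayList<>();
        long chunk = n / reducers;
        for (int r = 0; r < reducers; r++) {
            long lo = r * chunk;
            long hi = (r == reducers - 1) ? n : lo + chunk;
            parts.add(LongStream.range(lo, hi).boxed().collect(Collectors.toList()));
        }
        return parts;
    }

    public static void main(String[] args) {
        // Buggy plan: finalize per reducer -> 3 rows of partial aggregates.
        for (List<Long> part : rangePartition(100, 3)) {
            long[] a = aggregate(part);
            System.out.println(a[0] + "\t" + a[1] + "\t" + a[2]);
        }
        // Correct plan: a single merge stage -> one global row.
        long[] all = aggregate(
            LongStream.range(0, 100).boxed().collect(Collectors.toList()));
        System.out.println(all[0] + "\t" + all[1] + "\t" + all[2]); // 100  0  99
    }
}
```

Note how the partial rows tile the key space: the counts sum to the global count, and the min/max of adjacent rows are contiguous, exactly as in the wrong output above.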



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 38292: HIVE-11768 java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances

2015-10-13 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38292/#review102537
---



ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java (line 309)


I think this method could be re-used in other parts of hive as well. How 
about adding this as a FileUtils.deleteTmpFile and a new FileUtils.createTmpFile 
method?
Can you also add a unit test for those methods?



ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java (line 871)


same as above comment . A method in FileUtils for use throughout hive would 
be useful.


- Thejas Nair


On Oct. 13, 2015, 1:19 a.m., Navis Ryu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38292/
> ---
> 
> (Updated Oct. 13, 2015, 1:19 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> More than 490,000 paths was added to java.io.DeleteOnExitHook on one of our 
> long running HiveServer2 instances,taken up more than 100MB on heap.
>   Most of the paths contains a suffix of ".pipeout".
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hive/common/util/ShutdownHookManager.java 
> fd2f20a 
>   ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 41b4bb1 
>   
> service/src/java/org/apache/hive/service/cli/operation/HiveCommandOperation.java
>  1d1e995 
>   service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 
> 175348b 
> 
> Diff: https://reviews.apache.org/r/38292/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Navis Ryu
> 
>
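For reference, a minimal sketch of the kind of removable delete-on-exit registry that a FileUtils.createTmpFile/deleteTmpFile pair could wrap (names and shape are illustrative, not the actual patch). Unlike File.deleteOnExit(), which appends to java.io.DeleteOnExitHook and never releases entries, an explicit registry lets a path be dropped as soon as it is cleaned up early:

```java
import java.io.File;
import java.io.IOException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: a removable delete-on-exit registry. File.deleteOnExit() leaks
// one entry per ".pipeout" file for the life of the JVM; this registry
// forgets a path once the file is deleted explicitly.
public class TmpFileRegistry {
    private static final Set<String> PATHS = ConcurrentHashMap.newKeySet();

    static {
        // One shutdown hook cleans up whatever is still registered.
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            for (String p : PATHS) {
                new File(p).delete();
            }
        }));
    }

    public static File createTmpFile(String prefix, String suffix) throws IOException {
        File f = File.createTempFile(prefix, suffix);
        PATHS.add(f.getAbsolutePath()); // remember it for shutdown cleanup
        return f;
    }

    public static void deleteTmpFile(File f) {
        f.delete();
        PATHS.remove(f.getAbsolutePath()); // entry released: no heap growth
    }

    public static void main(String[] args) throws IOException {
        File f = createTmpFile("session", ".pipeout");
        deleteTmpFile(f);
        System.out.println(PATHS.isEmpty()); // true
    }
}
```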



[jira] [Created] (HIVE-12163) LLAP: Tez counters for LLAP 2

2015-10-13 Thread Siddharth Seth (JIRA)
Siddharth Seth created HIVE-12163:
-

 Summary: LLAP: Tez counters for LLAP 2
 Key: HIVE-12163
 URL: https://issues.apache.org/jira/browse/HIVE-12163
 Project: Hive
  Issue Type: Sub-task
Reporter: Siddharth Seth


1) Tez counters for LLAP are incorrect.
2) Some counters, such as cache hit ratio for a fragment, are not propagated.

We need to make sure that Tez counters for LLAP are usable. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: proxy errors from HiveQA Jenkins?

2015-10-13 Thread Szehon Ho
Hm yea, that's true, I don't know why the http response is so large.

The ...TestMiniTezCliDriver link is pretty fast, linking to Jenkins test
reports that are probably statically stored.  I guess
...TestMiniTezCliDriver/history might be dynamically generated, and Jenkins
is just slow to calculate it.

All this data is stored by Jenkins; I don't know exactly how to dump it.

On Tue, Oct 13, 2015 at 1:56 PM, Sergey Shelukhin 
wrote:

> Actually, it is possible to get database dump, or something similar, of
> test stats from wherever they are stored?
>
> On 15/10/13, 12:14, "Sergey Shelukhin"  wrote:
>
> >Sure… how did you find out about large http response? It seems like a
> >history page with one chart would not be much larger than any other page.
> >
> >On 15/10/13, 12:07, "Szehon Ho"  wrote:
> >
> >>Yea I also noticed its overloaded sometimes.
> >>
> >>I think in this case, it is very large http response.. I can try bump the
> >>timeout, but other than that I dont have that many ideas.
> >>
> >>Thanks
> >>Szehon
> >>
> >>On Tue, Oct 13, 2015 at 11:58 AM, Sergey Shelukhin
> >>
> >>wrote:
> >>
> >>> I am trying to look at slow tests on HiveQA and tests’ history.
> >>> I keep getting 502 proxy error - often from history of any kind, rarely
> >>> from test results, sometimes just from random pages.
> >>> E.g.
> >>>
> >>> Proxy ErrorThe proxy server received an invalid
> >>> response from an upstream server.
> >>> The proxy server could not handle the request GET
> >>>
> >>>/jenkins/job/PreCommit-HIVE-TRUNK-Build/5623/testReport/org.apache.hadoo
> >>>p
> >>>.h
> >>> ive.cli/TestMiniTezCliDriver/history/
> >>> <
> >>>
> >>>
> http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-
> >>>H
> >>>I
> >>>
> >>>VE-TRUNK-Build/5623/testReport/org.apache.hadoop.hive.cli/TestMiniTezCli
> >>>D
> >>>ri
> >>> ver/history/>.
> >>> Reason: Error reading from remote server
> >>>
> >>>
> >>> Is HiveQA jenkins ok? :)
> >>>
> >>>
> >
>
>


[jira] [Created] (HIVE-12164) Remove jdbc stats collection mechanism

2015-10-13 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-12164:
---

 Summary: Remove jdbc stats collection mechanism
 Key: HIVE-12164
 URL: https://issues.apache.org/jira/browse/HIVE-12164
 Project: Hive
  Issue Type: Task
  Components: Statistics
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan


Though some deployments use it, it is usually painful to set up, since a valid 
hive-site.xml (containing connection details) is needed on all task nodes, and 
large jobs (with thousands of tasks) run into a scalability issue with all of 
them hammering the DB at nearly the same time.
Because of these pain points, alternative stats collection mechanisms were added; 
the FS-based stats system has been the default for some time.
We should remove the JDBC stats collection mechanism, as it needlessly adds 
complexity in the TS and FS operators w.r.t. key handling.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 38292: HIVE-11768 java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances

2015-10-13 Thread Navis Ryu


> On Oct. 13, 2015, 9:35 p.m., Thejas Nair wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java, line 309
> > 
> >
> > I think this method could be re-used in other parts of hive as well. 
> > How about adding this a FileUtils.deleteTmpFile and a new 
> > FileUtils.createTmpFile method ?
> > Can you also add a unit test for those methods ?

Sure


> On Oct. 13, 2015, 9:35 p.m., Thejas Nair wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java, line 871
> > 
> >
> > same as above comment . A method in FileUtils for use throughout hive 
> > would be useful.

Consider it done


- Navis


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38292/#review102537
---


On Oct. 13, 2015, 1:19 a.m., Navis Ryu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38292/
> ---
> 
> (Updated Oct. 13, 2015, 1:19 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> More than 490,000 paths was added to java.io.DeleteOnExitHook on one of our 
> long running HiveServer2 instances,taken up more than 100MB on heap.
>   Most of the paths contains a suffix of ".pipeout".
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hive/common/util/ShutdownHookManager.java 
> fd2f20a 
>   ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 41b4bb1 
>   
> service/src/java/org/apache/hive/service/cli/operation/HiveCommandOperation.java
>  1d1e995 
>   service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 
> 175348b 
> 
> Diff: https://reviews.apache.org/r/38292/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Navis Ryu
> 
>



Re: Review Request 38292: HIVE-11768 java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances

2015-10-13 Thread Navis Ryu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38292/
---

(Updated Oct. 14, 2015, 12:49 a.m.)


Review request for hive.


Changes
---

Addressed comments & fixed test fails


Repository: hive-git


Description
---

More than 490,000 paths were added to java.io.DeleteOnExitHook on one of our 
long-running HiveServer2 instances, taking up more than 100MB of heap.
  Most of the paths have a ".pipeout" suffix.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/common/FileUtils.java 7e4f386 
  common/src/java/org/apache/hive/common/util/ShutdownHookManager.java fd2f20a 
  common/src/test/org/apache/hive/common/util/TestShutdownHookManager.java 
fa30f15 
  ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 41b4bb1 
  
service/src/java/org/apache/hive/service/cli/operation/HiveCommandOperation.java
 1d1e995 
  service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 
175348b 

Diff: https://reviews.apache.org/r/38292/diff/


Testing
---


Thanks,

Navis Ryu



parallelizing a test in HiveQA

2015-10-13 Thread Sergey Shelukhin
Hi. After the branch merge, we’d like the newly added CliDriver test to be
parallelized like other such tests.
What do we need to provide for this to be configured? The test name is
TestMiniLlapCliDriver; it currently uses the same variable as some other
test in testconfiguration.properties, but I was going to change that.



[jira] [Created] (HIVE-12166) LLAP: Cache read error at 1000 Gb scale tests

2015-10-13 Thread Gopal V (JIRA)
Gopal V created HIVE-12166:
--

 Summary: LLAP: Cache read error at 1000 Gb scale tests
 Key: HIVE-12166
 URL: https://issues.apache.org/jira/browse/HIVE-12166
 Project: Hive
  Issue Type: Bug
Reporter: Gopal V


{code}
hive> select sum(l_extendedprice * l_discount) as revenue from 
tpch_flat_orc_1000.lineitem  ;

Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
at java.util.ArrayList.elementData(ArrayList.java:400)
at java.util.ArrayList.get(ArrayList.java:413)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.determineRgsToRead(OrcEncodedDataReader.java:687)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:264)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:194)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:191)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:191)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:74)
at 
org.apache.hadoop.hive.common.CallableWithNdc.call(CallableWithNdc.java:37)
... 4 more
{code}

Disabling the cache allows this to run through without error.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 35792: HIVE-10438 - Architecture for ResultSet Compression via external plugin

2015-10-13 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35792/#review102547
---


Besides the comments in the code, I have the following questions:
1. Do we have to support multiple compressors, or is the functionality 
incomplete otherwise? Or can we support one and enhance it later if demand 
arises?
2. As to the client telling the server its capability regarding supported 
compressors, what's the practice in traditional databases? Is a JSON string 
the typical practice? I feel that JSON is a little unwieldy.


service/if/TCLIService.thrift (line 406)


What's the use of type and size? Are these not available by looking at 
result set schema?



service/src/java/org/apache/hive/service/cli/compression/ColumnCompressorService.java
 (line 34)


It seems that calling this a service is a little confusing. Maybe we can 
just call it a "factory"?



service/src/java/org/apache/hive/service/cli/compression/EncodedColumnBasedSet.java
 (line 92)


If hiveConf is mandatory, should we enforce it by requiring it in the 
constructor rather than a set method?



service/src/java/org/apache/hive/service/cli/compression/EncodedColumnBasedSet.java
 (line 110)


Can we do this once per user session instead of per result set?


- Xuefu Zhang


On Aug. 17, 2015, 10:37 p.m., Rohit Dholakia wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35792/
> ---
> 
> (Updated Aug. 17, 2015, 10:37 p.m.)
> 
> 
> Review request for hive, Vaibhav Gumashta, Xiaojian Wang, Xiao Meng, and 
> Xuefu Zhang.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> This patch enables ResultSet compression for Hive using external plugins. The 
> patch proposes a plugin architecture that enables using external plugins to 
> compress ResultSets on-the-fly.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 730f5be 
>   jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java bb2b695 
>   service/if/TCLIService.thrift baf583f 
>   service/src/gen/thrift/gen-cpp/TCLIService.h 29a9f4a 
>   service/src/gen/thrift/gen-cpp/TCLIService_types.h 4536b41 
>   service/src/gen/thrift/gen-cpp/TCLIService_types.cpp 742cfdc 
>   
> service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TEnColumn.java
>  PRE-CREATION 
>   
> service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TExecuteStatementReq.java
>  feaed34 
>   
> service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TGetTablesReq.java
>  805e69f 
>   
> service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TOpenSessionReq.java
>  657f868 
>   
> service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TOpenSessionResp.java
>  48f4b45 
>   
> service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TProtocolVersion.java
>  6e714c6 
>   
> service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TRowSet.java
>  cc1a148 
>   
> service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TStatus.java
>  1cd7980 
>   service/src/gen/thrift/gen-py/TCLIService/ttypes.py efee8ef 
>   service/src/gen/thrift/gen-rb/t_c_l_i_service_types.rb bfb2b69 
>   service/src/java/org/apache/hive/service/cli/Column.java 2e21f18 
>   service/src/java/org/apache/hive/service/cli/ColumnBasedSet.java 47a582e 
>   service/src/java/org/apache/hive/service/cli/RowSetFactory.java e8f68ea 
>   
> service/src/java/org/apache/hive/service/cli/compression/ColumnCompressor.java
>  PRE-CREATION 
>   
> service/src/java/org/apache/hive/service/cli/compression/ColumnCompressorService.java
>  PRE-CREATION 
>   
> service/src/java/org/apache/hive/service/cli/compression/EncodedColumnBasedSet.java
>  PRE-CREATION 
>   service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 
> 67bc778 
>   
> service/src/test/org/apache/hive/service/cli/compression/SnappyIntCompressor.java
>  PRE-CREATION 
>   
> service/src/test/org/apache/hive/service/cli/compression/TestEncodedColumnBasedSet.java
>  PRE-CREATION 
>   
> service/src/test/resources/META-INF/services/org.apache.hive.service.cli.compression.ColumnCompressor
>  PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/35792/diff/
> 
> 
> Testing
> ---
> 
> Testing has been done using a docker container-based query submitter that has 
> an integer decompressor as part of it. Using the integer compressor (also 
> provided) and the decompressor, the end-to-end functionality can be observed.
> 
> 
> 
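The META-INF/services/org.apache.hive.service.cli.compression.ColumnCompressor entry in the diff implies ServiceLoader-based plugin discovery. A minimal sketch of that pattern, framed as the "factory" Xuefu suggests (the interface methods here are illustrative, not the patch's actual API):

```java
import java.util.ServiceLoader;

// Sketch of ServiceLoader-based compressor discovery. Plugins register by
// listing their implementation class in
// META-INF/services/<fully.qualified.InterfaceName> on the classpath.
public class CompressorFactory {
    // Illustrative plugin interface, not the actual Hive API.
    public interface ColumnCompressor {
        boolean isApplicable(String columnType);
        byte[] compress(byte[] column);
    }

    // Return the first registered compressor that handles the column type,
    // or null, in which case the server falls back to the uncompressed
    // ColumnBasedSet path.
    public static ColumnCompressor forType(String columnType) {
        for (ColumnCompressor c : ServiceLoader.load(ColumnCompressor.class)) {
            if (c.isApplicable(columnType)) {
                return c;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // No plugins on the classpath in this sketch, so lookup returns null.
        System.out.println(forType("int")); // null
    }
}
```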

Re: parallelizing a test in HiveQA

2015-10-13 Thread Szehon Ho
I think there was another thread with Prasanth about this topic; as I
mentioned there, the list of tests could be in the current variable or
another one. Maybe you can change it first and then I'll make the
change on the build machine.

Thanks
Szehon

On Tue, Oct 13, 2015 at 3:46 PM, Sergey Shelukhin 
wrote:

> Hi. After the branch merge, we’d like the newly added CliDriver test to be
> parallelized like other such tests.
> What do we need to provide for this to be configured? The test name is
> TestMiniLlapCliDriver; it currently uses the same variable as some other
> test in testconfiguration.properties, but I was going to change that.
>
>


[jira] [Created] (HIVE-12167) HBase metastore causes massive number of ZK exceptions in MiniTez tests

2015-10-13 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-12167:
---

 Summary: HBase metastore causes massive number of ZK exceptions in 
MiniTez tests
 Key: HIVE-12167
 URL: https://issues.apache.org/jira/browse/HIVE-12167
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Daniel Dai


I ran some random test (vectorization_10) with HBase metastore, and I see large 
number of exceptions in hive.log
{noformat}
$ grep -c "ConnectionLoss" hive.log
52
$ grep -c "Connection refused" hive.log
1014
{noformat}
The count of these log lines has increased by ~33% since merging the LLAP branch, 
but it was already high before that (39/~700 for the same test). These lines are 
not present if I disable the HBase metastore.
The exceptions are:
{noformat}
2015-10-13T17:51:06,232 WARN  [Thread-359-SendThread(localhost:2181)]: 
zookeeper.ClientCnxn (ClientCnxn.java:run(1102)) - Session 0x0 for server null, 
unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) 
~[?:1.8.0_45]
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) 
~[?:1.8.0_45]
at 
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
 ~[zookeeper-3.4.6.jar:3.4.6-1569965]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081) 
[zookeeper-3.4.6.jar:3.4.6-1569965]
{noformat}
that is retried for some seconds and then
{noformat}
2015-10-13T17:51:22,867 WARN  [Thread-359]: zookeeper.ZKUtil 
(ZKUtil.java:checkExists(544)) - hconnection-0x1da6ef180x0, 
quorum=localhost:2181, baseZNode=/hbase Unable to set watcher on znode 
(/hbase/hbaseid)
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = 
ConnectionLoss for /hbase/hbaseid
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) 
~[zookeeper-3.4.6.jar:3.4.6-1569965]
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) 
~[zookeeper-3.4.6.jar:3.4.6-1569965]
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045) 
~[zookeeper-3.4.6.jar:3.4.6-1569965]
at 
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:222)
 ~[hbase-client-1.1.1.jar:1.1.1]
at 
org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:541) 
[hbase-client-1.1.1.jar:1.1.1]
at 
org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
 [hbase-client-1.1.1.jar:1.1.1]
at 
org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:105)
 [hbase-client-1.1.1.jar:1.1.1]
at 
org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.retrieveClusterId(ConnectionManager.java:879)
 [hbase-client-1.1.1.jar:1.1.1]
at 
org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.(ConnectionManager.java:635)
 [hbase-client-1.1.1.jar:1.1.1]
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
Method) ~[?:1.8.0_45]
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
 [?:1.8.0_45]
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
 [?:1.8.0_45]
at java.lang.reflect.Constructor.newInstance(Constructor.java:422) 
[?:1.8.0_45]
at 
org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
 [hbase-client-1.1.1.jar:1.1.1]
at 
org.apache.hadoop.hbase.client.ConnectionManager.createConnection(ConnectionManager.java:420)
 [hbase-client-1.1.1.jar:1.1.1]
at 
org.apache.hadoop.hbase.client.ConnectionManager.createConnectionInternal(ConnectionManager.java:329)
 [hbase-client-1.1.1.jar:1.1.1]
at 
org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:144)
 [hbase-client-1.1.1.jar:1.1.1]
at 
org.apache.hadoop.hive.metastore.hbase.VanillaHBaseConnection.connect(VanillaHBaseConnection.java:56)
 [hive-metastore-2.0.0-SNAPSHOT.jar:?]
at 
org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite.(HBaseReadWrite.java:227)
 [hive-metastore-2.0.0-SNAPSHOT.jar:?]
at 
org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite.(HBaseReadWrite.java:83)
 [hive-metastore-2.0.0-SNAPSHOT.jar:?]
at 
org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite$1.initialValue(HBaseReadWrite.java:157)
 [hive-metastore-2.0.0-SNAPSHOT.jar:?]
at 
org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite$1.initialValue(HBaseReadWrite.java:151)
 [hive-metastore-2.0.0-SNAPSHOT.jar:?]
at java.lang.ThreadLocal.setInitialValue(ThreadLocal.java:180) 
[?:1.8.0_45]
at java.lang.ThreadLocal.get(ThreadLocal.java:170) [?:1.8.0_45]
at 

Re: Review Request 39248: Fix Greatest/Least UDF to SQL standard (HIVE-12070 and HIVE-12082)

2015-10-13 Thread Szehon Ho


> On Oct. 13, 2015, 5:24 p.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseNwayCompare.java,
> >  line 74
> > 
> >
> > should we limit the types to be numeric? From here I don't know how 
> > mixed types are compared, especially when there is no common type.

I actually thought about doing that, but after some research I believe that the 
behavior of these UDFs should be consistent with the longstanding comparison 
operators (greater than, less than, greater than or equal, less than or equal, 
etc.), because the greatest UDF is defined as just running the greater-than 
comparison on n items, and similarly for the least UDF.

This change makes the class consistent with those classes (see 
GenericUDFBaseCompare, FunctionRegistry.getCommonClassForComparison).


> On Oct. 13, 2015, 5:24 p.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseNwayCompare.java,
> >  line 88
> > 
> >
> > instead of two code paths (two method), can we just always do 
> > conversions? when there is no need for conversion, the converter will be an 
> > identity converter, which doesn't incur any performance penalty. This way, 
> > the code will be simpler and cleaner in my opinion.

There's a slight performance impact (an extra function call), but I fixed it for 
code cleanliness as you suggested.


- Szehon


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39248/#review102474
---


On Oct. 14, 2015, 1:02 a.m., Szehon Ho wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/39248/
> ---
> 
> (Updated Oct. 14, 2015, 1:02 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-12082
> https://issues.apache.org/jira/browse/HIVE-12082
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> See HIVE-12070 and HIVE-12082 for discussions and findings.  Refactored the 
> Greatest/Least UDF's to be inline with the SQL-standard spec of comparison 
> operators, and the mysql implementation of greatest/least functional UDF's.
> 
> The main functional changes is that:
> 1.  Different types can be compared now, but used to throw an exception.  
> Comparison uses the same logic as binary comparison operators, ie greaterThan.
> 2.  If any argument is NULL, the result is null.  NULLs used to be ignored in 
> the comparison in favor of non-null values, which violates the SQL-standard.
> 
> Code changes:
> Common logics is captured in the new class 'GenericUDFBaseNWayCompare', which 
> does a linear comparison in the two-cases where arguments are of same type 
> and different type, in the latter it uses Converters.  The class that it uses 
> is ObjectInspectorUtils.compare(), which is the same as the binary comparison 
> operators.
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseNwayCompare.java
>  PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFGreatest.java 
> e1eab89 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLeast.java 
> 64a1b47 
>   
> ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFGreatest.java 
> 55d7d5d 
>   ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFLeast.java 
> 47e4801 
>   ql/src/test/queries/clientnegative/udf_greatest_error_2.q b270a1a 
>   ql/src/test/queries/clientnegative/udf_greatest_error_3.q ba21748 
>   ql/src/test/queries/clientnegative/udf_greatest_error_4.q ae6d928 
>   ql/src/test/queries/clientpositive/udf_greatest.q 02c7d3c 
>   ql/src/test/queries/clientpositive/udf_least.q a754ef0 
>   ql/src/test/results/clientnegative/udf_greatest_error_2.q.out 9a6348c 
>   ql/src/test/results/clientnegative/udf_greatest_error_3.q.out 3fb3499 
>   ql/src/test/results/clientnegative/udf_greatest_error_4.q.out 58b4c44 
>   ql/src/test/results/clientpositive/udf_greatest.q.out 10f1c2d 
>   ql/src/test/results/clientpositive/udf_least.q.out 6983137 
> 
> Diff: https://reviews.apache.org/r/39248/diff/
> 
> 
> Testing
> ---
> 
> Added more unit tests, and q tests.
> 
> 
> Thanks,
> 
> Szehon Ho
> 
>
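A minimal sketch of the NULL-propagation and n-way comparison semantics described above (illustrative names, not the actual Hive GenericUDF API):

```java
import java.util.Comparator;

// Sketch of the SQL-standard semantics: greatest() returns NULL if ANY
// argument is NULL, instead of ignoring NULLs, and compares n items with
// the same ordering the binary greater-than operator would use.
public class NwayCompare {
    @SafeVarargs
    public static <T> T greatest(Comparator<T> cmp, T... args) {
        T best = null;
        for (T a : args) {
            if (a == null) {
                return null;        // SQL standard: any NULL poisons the result
            }
            if (best == null || cmp.compare(a, best) > 0) {
                best = a;           // linear n-way greater-than comparison
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Comparator<Integer> cmp = Comparator.naturalOrder();
        System.out.println(greatest(cmp, 1, 7, 3));    // 7
        System.out.println(greatest(cmp, 1, null, 3)); // null
    }
}
```

Mixed-type arguments would additionally go through a converter to a common comparison type first, which is where the identity converter mentioned above comes in for the same-type case.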



Re: Review Request 39248: Fix Greatest/Least UDF to SQL standard (HIVE-12070 and HIVE-12082)

2015-10-13 Thread Szehon Ho

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39248/
---

(Updated Oct. 14, 2015, 1:02 a.m.)


Review request for hive.


Changes
---

Address review comments.


Bugs: HIVE-12082
https://issues.apache.org/jira/browse/HIVE-12082


Repository: hive-git


Description
---

See HIVE-12070 and HIVE-12082 for discussions and findings.  Refactored the 
Greatest/Least UDFs to be in line with the SQL-standard spec for comparison 
operators and the MySQL implementation of the greatest/least functions.

The main functional changes are:
1.  Different types can now be compared; previously this threw an exception.  
Comparison uses the same logic as the binary comparison operators, i.e. 
greaterThan.
2.  If any argument is NULL, the result is NULL.  Previously NULLs were 
ignored in the comparison in favor of non-null values, which violates the 
SQL standard.

Code changes:
Common logic is captured in the new class 'GenericUDFBaseNwayCompare', which 
does a linear comparison in the two cases where the arguments are of the same 
type or of different types; in the latter case it uses Converters.  The method 
it delegates to is ObjectInspectorUtils.compare(), the same one used by the 
binary comparison operators.
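The NULL-propagation and common-type comparison described above can be sketched roughly as follows. This is a minimal, hypothetical illustration, not Hive's actual GenericUDFBaseNwayCompare: the class and method names are made up, and a plain conversion to double stands in for what the real code does with Converters and ObjectInspectorUtils.compare().

```java
// Hypothetical sketch of the SQL-standard GREATEST/LEAST semantics:
// any NULL argument yields NULL, and mixed numeric types are compared
// after conversion to a common type (here simplified to double).
public class NwayCompareSketch {

    // Returns the greatest argument as a Double, or null if any
    // argument is null (SQL-standard NULL propagation).
    public static Double greatest(Number... args) {
        Double best = null;
        for (Number n : args) {
            if (n == null) {
                return null; // any NULL argument makes the result NULL
            }
            double v = n.doubleValue(); // convert to a common type
            if (best == null || v > best) {
                best = v;
            }
        }
        return best;
    }

    // Same linear scan with the comparison flipped.
    public static Double least(Number... args) {
        Double best = null;
        for (Number n : args) {
            if (n == null) {
                return null;
            }
            double v = n.doubleValue();
            if (best == null || v < best) {
                best = v;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(greatest(1, 2.5, 3));   // 3.0
        System.out.println(greatest(1, null, 3));  // null
        System.out.println(least(4, 2.5f, 3L));    // 2.5
    }
}
```

Under the old behavior, by contrast, greatest(1, null, 3) would have ignored the NULL and returned 3.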


Diffs (updated)
-

  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseNwayCompare.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFGreatest.java 
e1eab89 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLeast.java 
64a1b47 
  ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFGreatest.java 
55d7d5d 
  ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFLeast.java 
47e4801 
  ql/src/test/queries/clientnegative/udf_greatest_error_2.q b270a1a 
  ql/src/test/queries/clientnegative/udf_greatest_error_3.q ba21748 
  ql/src/test/queries/clientnegative/udf_greatest_error_4.q ae6d928 
  ql/src/test/queries/clientpositive/udf_greatest.q 02c7d3c 
  ql/src/test/queries/clientpositive/udf_least.q a754ef0 
  ql/src/test/results/clientnegative/udf_greatest_error_2.q.out 9a6348c 
  ql/src/test/results/clientnegative/udf_greatest_error_3.q.out 3fb3499 
  ql/src/test/results/clientnegative/udf_greatest_error_4.q.out 58b4c44 
  ql/src/test/results/clientpositive/udf_greatest.q.out 10f1c2d 
  ql/src/test/results/clientpositive/udf_least.q.out 6983137 

Diff: https://reviews.apache.org/r/39248/diff/


Testing
---

Added more unit tests, and q tests.


Thanks,

Szehon Ho



[jira] [Created] (HIVE-12168) Addendum to HIVE-12038

2015-10-13 Thread Szehon Ho (JIRA)
Szehon Ho created HIVE-12168:


 Summary: Addendum to HIVE-12038
 Key: HIVE-12168
 URL: https://issues.apache.org/jira/browse/HIVE-12168
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Szehon Ho
Assignee: Szehon Ho


In HIVE-12038, a case of Error was missed.  The original assumption was that 
if error is true, it is always a build error; apparently there is also a 
TestFailedException.

Currently, it incorrectly reports failed tests as build errors.





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)