Re: Invitation for Hive committers to become ORC committers

2017-01-04 Thread Siddharth Seth
Hi Owen,
I'd be interested as well, if not too late.
Thanks,
Sid

On Wed, Jan 4, 2017 at 10:34 AM, Owen O'Malley  wrote:

> Ferd, I've added you.
>
> Suneel, I'm sorry, but the offer is limited to current Hive committers.
> http://people.apache.org/phonebook.html?unix=hive
>
> .. Owen
>
> On Mon, Jan 2, 2017 at 6:39 PM, Suneel Jakka 
> wrote:
>
> > Hi Owen,
> >
> > Am also interested.
> >
> > Regards,
> > Suneel Jakka
> >
> >
> > On Mon, Jan 2, 2017 at 8:19 PM, Xu, Cheng A 
> wrote:
> >
> > > Hi Owen,
> > > Sorry for my late response. I'm also interested.
> > >
> > > Thanks,
> > > Ferd
> > >
> > > -Original Message-
> > > From: Owen O'Malley [mailto:omal...@apache.org]
> > > Sent: Friday, December 23, 2016 11:55 AM
> > > To: dev@hive.apache.org
> > > Subject: Re: Invitation for Hive committers to become ORC committers
> > >
> > > Ok, I believe that I have got everyone. If you don't have karma as
> shown
> > > here: http://people.apache.org/phonebook.html?unix=orc
> > >
> > > Please, let me know. I believe I have also updated the ORC website with
> > > everyone.
> > >
> > > Thanks,
> > >Owen
> > >
> > > On Sat, Dec 17, 2016 at 5:16 AM, Lars Francke 
> > > wrote:
> > >
> > > > Hi Owen,
> > > >
> > > > I'm also interested.
> > > >
> > > > Thanks,
> > > > Lars
> > > >
> > > > On Fri, Dec 16, 2016 at 10:20 PM, Sergio Pena
> > > > 
> > > > wrote:
> > > >
> > > > > Hi Owen,
> > > > >
> > > > > I'm also interested.
> > > > > - Sergio
> > > > >
> > > > > On Fri, Dec 16, 2016 at 11:39 AM, Daniel Dai <
> da...@hortonworks.com>
> > > > > wrote:
> > > > >
> > > > > > I am interested.
> > > > > >
> > > > > > Thanks,
> > > > > > Daniel
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > On 12/15/16, 1:12 PM, "Owen O'Malley" 
> wrote:
> > > > > >
> > > > > > >All,
> > > > > > >   As you are aware, we are in the last stages of removing the
> > > > > > >forked
> > > > > ORC
> > > > > > >code out of Hive. The goal of moving ORC out of Hive was to
> > > > > > >increase
> > > > its
> > > > > > >community and we want to be very deliberately inclusive of the
> > > > > > >Hive development community. Towards that end, the ORC PMC wants
> > > > > > >to welcome anyone who is already a Hive committer to become a
> > > committer on ORC.
> > > > > > >
> > > > > > >  Please respond on this thread to let us know if you are
> > > interested.
> > > > > > >
> > > > > > >Thanks,
> > > > > > >   Owen on behalf of the ORC PMC
> > > > > >
> > > > >
> > > >
> > >
> >
>


[jira] [Created] (HIVE-15544) Support scalar subqueries

2017-01-04 Thread Vineet Garg (JIRA)
Vineet Garg created HIVE-15544:
--

 Summary: Support scalar subqueries
 Key: HIVE-15544
 URL: https://issues.apache.org/jira/browse/HIVE-15544
 Project: Hive
  Issue Type: Sub-task
  Components: SQL
Reporter: Vineet Garg
Assignee: Vineet Garg


Currently Hive only supports IN/EXISTS/NOT IN/NOT EXISTS subqueries. Hive 
doesn't allow sub-queries such as:

{code}
explain select a.ca_state state, count(*) cnt
from customer_address a, customer c, store_sales s, date_dim d, item i
where a.ca_address_sk = c.c_current_addr_sk
  and c.c_customer_sk = s.ss_customer_sk
  and s.ss_sold_date_sk = d.d_date_sk
  and s.ss_item_sk = i.i_item_sk
  and d.d_month_seq = (select distinct (d_month_seq)
                       from date_dim
                       where d_year = 2000 and d_moy = 2)
  and i.i_current_price > 1.2 * (select avg(j.i_current_price)
                                 from item j
                                 where j.i_category = i.i_category)
group by a.ca_state
having count(*) >= 10
order by cnt
limit 100;
{code}

We initially plan to support such scalar subqueries in filters, i.e., in the 
WHERE and HAVING clauses.
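
For contrast, a minimal sketch of the targeted feature, using hypothetical tables emp and dept (not part of the original report):

{code}
-- scalar subquery in WHERE: compare each row against one aggregated value
select e.name
from emp e
where e.salary > (select avg(salary) from emp);

-- scalar subquery in HAVING
select e.dept_id, count(*) cnt
from emp e
group by e.dept_id
having count(*) > (select count(*) / count(distinct dept_id) from emp);
{code}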



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Hive conditional join question

2017-01-04 Thread Ranjan Banerjee
Hi,
I have two Hive tables and I want to do a join only if both tables have 
data in them. I don't want the join to happen if one of the tables is empty. I 
tried exploring a CASE statement, with the intention of doing something like
select count(*) as val case when val > 0 then else end from table2
However, it looks like Hive won't allow an evaluation like that within the CASE 
statement, so this approach does not work. Does anyone have input on how to 
do this in Hive?

Thanks
Ranjan
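
One workaround sketch, assuming hypothetical tables table1 and table2 joined on a column id (Hive at this point doesn't support scalar subqueries in WHERE, so the counts are brought in via cross joins):

select a.id, a.val, b.val
from table1 a
join table2 b on (a.id = b.id)
cross join (select count(*) c1 from table1) x
cross join (select count(*) c2 from table2) y
where x.c1 > 0 and y.c2 > 0;

Note that a plain inner join already returns no rows when either table is empty; an explicit guard like this mainly matters if the query is extended with outer joins or unions.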


[jira] [Created] (HIVE-15543) Don't try to get memory/cores to decide parallelism when Spark dynamic allocation is enabled

2017-01-04 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created HIVE-15543:
--

 Summary: Don't try to get memory/cores to decide parallelism when 
Spark dynamic allocation is enabled
 Key: HIVE-15543
 URL: https://issues.apache.org/jira/browse/HIVE-15543
 Project: Hive
  Issue Type: Improvement
  Components: Spark
Affects Versions: 2.2.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


Presently Hive tries to get the numbers for memory and cores from the Spark 
application and uses them to determine RS parallelism. However, this doesn't 
make sense when Spark dynamic allocation is enabled, because the current numbers 
don't represent the available computing resources, especially when SparkContext 
is initially launched.

Thus, it makes sense not to do that when dynamic allocation is enabled.
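
For context, Spark dynamic allocation is controlled by standard Spark properties along these lines (values are illustrative):

{noformat}
spark.dynamicAllocation.enabled=true
spark.shuffle.service.enabled=true
spark.dynamicAllocation.minExecutors=1
spark.dynamicAllocation.maxExecutors=50
{noformat}

With the executor count floating between the min and max, a snapshot of the current executors says little about eventual capacity.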





[jira] [Created] (HIVE-15542) NPE in StatsUtils::getColStatistics when all values in DATE column are NULL

2017-01-04 Thread Rajesh Balamohan (JIRA)
Rajesh Balamohan created HIVE-15542:
---

 Summary: NPE in StatsUtils::getColStatistics when all values in 
DATE column are NULL
 Key: HIVE-15542
 URL: https://issues.apache.org/jira/browse/HIVE-15542
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Reporter: Rajesh Balamohan
Assignee: Rajesh Balamohan


Observed the following stack trace when all the values in a DATE column are NULL.
 
{noformat}
2017-01-04T19:10:37,779 ERROR [46f293ab-1516-429d-aaab-4d5818ef8b82 main] ql.Driver: FAILED: NullPointerException null
java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.stats.StatsUtils.getColStatistics(StatsUtils.java:759)
    at org.apache.hadoop.hive.ql.stats.StatsUtils.convertColStats(StatsUtils.java:806)
    at org.apache.hadoop.hive.ql.stats.StatsUtils.getTableColumnStats(StatsUtils.java:792)
    at org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:206)
    at org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:152)
    at org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:140)
    at org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$TableScanStatsRule.process(StatsRulesProcFactory.java:128)
    at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
    at org.apache.hadoop.hive.ql.lib.LevelOrderWalker.walk(LevelOrderWalker.java:143)
    at org.apache.hadoop.hive.ql.lib.LevelOrderWalker.startWalking(LevelOrderWalker.java:122)
    at org.apache.hadoop.hive.ql.optimizer.stats.annotation.AnnotateWithStatistics.transform(AnnotateWithStatistics.java:78)
    at org.apache.hadoop.hive.ql.parse.TezCompiler.runStatsAnnotation(TezCompiler.java:260)
    at org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeOperatorPlan(TezCompiler.java:129)
    at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:140)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11071)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10644)
    at org.apache.hadoop.hive.ql.parse.ColumnStatsSemanticAnalyzer.analyze(ColumnStatsSemanticAnalyzer.java:412)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:510)
    at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1302)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1442)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1222)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1212)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:400)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:777)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:715)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:642)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{noformat}
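
A hypothetical repro sketch (table and column names are illustrative, not taken from the original report):

{code}
create table d_test (d date);
insert into d_test values (null), (null);
analyze table d_test compute statistics for columns;
explain select d from d_test;  -- NPE surfaces while fetching column statistics
{code}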





Review Request 55194: HIVE-15541: Hive OOM when ATSHook enabled and ATS goes down

2017-01-04 Thread Jason Dere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55194/
---

Review request for hive.


Bugs: HIVE-15541
https://issues.apache.org/jira/browse/HIVE-15541


Repository: hive-git


Description
---

Create the ATSHook executor with a bounded queue capacity


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 47db0c0 
  ql/src/java/org/apache/hadoop/hive/ql/hooks/ATSHook.java 3651c9c 

Diff: https://reviews.apache.org/r/55194/diff/


Testing
---


Thanks,

Jason Dere



[jira] [Created] (HIVE-15541) Hive OOM when ATSHook enabled and ATS goes down

2017-01-04 Thread Jason Dere (JIRA)
Jason Dere created HIVE-15541:
-

 Summary: Hive OOM when ATSHook enabled and ATS goes down
 Key: HIVE-15541
 URL: https://issues.apache.org/jira/browse/HIVE-15541
 Project: Hive
  Issue Type: Bug
  Components: Hooks
Reporter: Jason Dere
Assignee: Jason Dere


The ATS API used by the Hive ATSHook is a blocking call; if ATS goes down, this 
can block the ATSHook executor while the hook continues to submit work to the 
executor with each query.
Over time, the buildup of queued items can cause an OOM.






Re: Review Request 55156: Min-max runtime filtering

2017-01-04 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55156/
---

(Updated Jan. 4, 2017, 10:12 p.m.)


Review request for hive, Gopal V, Gunther Hagleitner, Jason Dere, Prasanth_J, 
and Rajesh Balamohan.


Changes
---

Remove SyntheticJoinPredicates when table scan is NULL.


Bugs: HIVE-15269
https://issues.apache.org/jira/browse/HIVE-15269


Repository: hive-git


Description
---

HIVE-15269 min-max runtime filtering.
The patch also contains the patch for HIVE-15270.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 47db0c0 
  itests/src/test/resources/testconfiguration.properties 1cebc70 
  orc/src/java/org/apache/orc/impl/RecordReaderImpl.java 975804b 
  orc/src/test/org/apache/orc/impl/TestRecordReaderImpl.java cdd62ac 
  pom.xml 376197e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractMapJoinOperator.java 
69ba4a2 
  ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java 940f2dd 
  ql/src/java/org/apache/hadoop/hive/ql/exec/DynamicValueRegistry.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeColumnEvaluator.java 
24c8281 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeConstantDefaultEvaluator.java
 89a75eb 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeConstantEvaluator.java 
4fe72a0 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeDynamicValueEvaluator.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeEvaluator.java b8d6ab7 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeEvaluatorFactory.java 
0d03d8f 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeEvaluatorHead.java 42685fb 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeEvaluatorRef.java 0a6b66a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeFieldEvaluator.java 
ff32626 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java 
221abd9 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FilterOperator.java bd0d28c 
  ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java 46f0ecd 
  ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java ac5331e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java 9718c48 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ObjectCache.java 440e0a1 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ObjectCacheWrapper.java 9768efa 
  ql/src/java/org/apache/hadoop/hive/ql/exec/SelectOperator.java 9049ddd 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ObjectCache.java 008f8a4 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DynamicValueRegistryTez.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/LlapObjectCache.java 0141230 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MapRecordProcessor.java 
955fa80 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ObjectCache.java 06dca00 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ReduceRecordProcessor.java 
d80f201 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java 
0cb6c8a 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorSMBMapJoinOperator.java 
80b0a14 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 
f6b6447 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DynamicValueVectorExpression.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/sarg/ConvertAstToSearchArg.java 
9d900e4 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
 26fcc45 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/FixedBucketPruningOptimizer.java
 9e9beb0 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/RedundantDynamicPruningConditionsRemoval.java
 d9ce017 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
 aa1e509 
  ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java e2363eb 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 35f34da 
  ql/src/java/org/apache/hadoop/hive/ql/parse/RuntimeValuesInfo.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java e8b003e 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java cdb9e1b 
  ql/src/java/org/apache/hadoop/hive/ql/plan/BaseWork.java 13a0811 
  ql/src/java/org/apache/hadoop/hive/ql/plan/DynamicValue.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDynamicValueDesc.java 
PRE-CREATION 
  ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestConvertAstToSearchArg.java 
93b50a6 
  ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestSearchArgumentImpl.java 
8cbc26d 
  ql/src/test/org/apache/hadoop/hive/ql/optimizer/physical/TestVectorizer.java 
3295372 
  ql/src/test/queries/clientpositive/dynamic_semijoin_reduction.q PRE-CREATION 
  ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction.q.out 
PRE-CREATION 
  

Re: Review Request 53619: HIVE-15161 migrate ColumnStats to use jackson

2017-01-04 Thread pengcheng xiong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53619/#review160534
---




common/src/java/org/apache/hadoop/hive/common/StatsSetupConst.java (line 190)


What is the difference between NON_DEFAULT and the following NON_EMPTY?



common/src/java/org/apache/hadoop/hive/common/StatsSetupConst.java (line 205)


We still need to have a try-catch block for "// For backward compatibility, 
if the previous value cannot be parsed to a json object, it will come here." 
Because we do not have a json object format in very old versions, this will 
throw an exception, but we would like to return false.



common/src/java/org/apache/hadoop/hive/common/StatsSetupConst.java (line 213)


The same for the backward compatibility issue.



common/src/java/org/apache/hadoop/hive/common/StatsSetupConst.java (line 287)


Using startsWith sounds not as good as the previous try-catch block.



common/src/java/org/apache/hadoop/hive/common/StatsSetupConst.java (line 296)


use TRUE please.



common/src/test/org/apache/hadoop/hive/common/TestStatsSetupConst.java (line 90)


This makes me worry about differences between golden files for tests 
when we try to release a product... Do you mean that order is not preserved 
when we print them out? Could you add more test cases for "describe extended 
[tableName]"? Thanks.


- pengcheng xiong


On Dec. 9, 2016, 1:14 p.m., Zoltan Haindrich wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53619/
> ---
> 
> (Updated Dec. 9, 2016, 1:14 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-15161
> https://issues.apache.org/jira/browse/HIVE-15161
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> * json.org has license issues
> * jackson can provide a fully compatible alternative to it
> * there are a few flakiness issues caused by the order of the map entries of 
> the columns...this can be addressed, org.json api was unfriendly in this 
> manner ;)
> * fully backward compatible change
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/common/StatsSetupConst.java 
> 25c7508f51662773e913a176bee7c8bd223202d4 
>   common/src/test/org/apache/hadoop/hive/common/TestStatsSetupConst.java 
> 7a7ad424a8e53ed89c79592ced86c7c38eaf4e04 
> 
> Diff: https://reviews.apache.org/r/53619/diff/
> 
> 
> Testing
> ---
> 
> added unit test
> 
> 
> Thanks,
> 
> Zoltan Haindrich
> 
>



Re: Review Request 55154: HIVE-15366: REPL LOAD & DUMP support for incremental INSERT events

2017-01-04 Thread Sushanth Sowmyan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55154/#review160533
---




metastore/src/java/org/apache/hadoop/hive/metastore/messaging/InsertMessage.java
 (line 53)


I'm not convinced that this is a good method to add, since this is 
repl-specific, and adds complexity. Any presence of a checksum must be encoded 
into the uris, so that when we call getFiles(), it contains it. Also, the files 
have no explicit meaning without the checksum, since they will not be stable 
uris. The getFiles() returned by InsertMessage should already be a CM uri that 
encodes the checksum; e.g., cm://hdfs%3A%2F%2Fblah$2Ffile1#abcdef1234567890 
might imply the file hdfs://blah/file1 with checksum "abcdef1234567890". I'm 
not super picky about the actual encoding mechanism used, but we want the 
getFiles() results to be stable uris - ones which, even if we 
don't have a FileSystem object associated with them directly, we can extract the 
info we want from them at the endpoint when we use them, generate them when we 
need to, and all areas in between simply pass them on without doing anything 
additional with them.

Thus, the places I see "generating" this are either DbNotificationListener 
or fireInsertEvent(), or ReplCopyTask during a bootstrap dump. The only place I 
see extracting/consuming this uri would be in ReplCopyTask on destination. All 
other areas should not split this.



metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONMessageFactory.java
 (line 376)


We should not be adding more of these methods into JSONMessageFactory that 
add field names here. That knowledge should belong to the domain of the message 
itself. The existing methods that do this are currently slated for removal once 
we refactor DbNotificationListener to not depend on them.



ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java 
(line 576)


The partspec can be obtained from insertMsg.getPartitionKeyValues() - we 
shouldn't make calls to JSONMessageFactory here.

JSONInsertMessage, in its implementation of getPartitionKeyValues, can, in 
turn, then call generic functions from JSONMessageFactory using knowledge it 
has about itself.

There shouldn't be any explicit calls to JSONMessageFactory from any class 
which is not a JSON*Message.

See the previous ALTER patch and how it changed the ADD_PTNS/CREATE_TABLE 
processing for a reference.



ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java 
(line 595)


We should not be making calls to JSONMessageFactory, or getting fields with 
knowledge of names such as "fileChecksums" or "files". Knowledge of fieldnames 
should be restricted to inside the message itself, which exposes api via its 
parent Message class.

This should simply be a dump of what the InsertMessage.getFiles() returns 
and no more. Any encoding of checksum/etc that we do must happen in 
DbNotificationListener, or even possibly in fireInsertEvent, since the location 
is meaningless without the checksum.


- Sushanth Sowmyan


On Jan. 4, 2017, 12:59 p.m., Vaibhav Gumashta wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55154/
> ---
> 
> (Updated Jan. 4, 2017, 12:59 p.m.)
> 
> 
> Review request for hive, Daniel Dai, Sushanth Sowmyan, and Thejas Nair.
> 
> 
> Bugs: HIVE-15366
> https://issues.apache.org/jira/browse/HIVE-15366
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> https://issues.apache.org/jira/browse/HIVE-15366
> 
> 
> Diffs
> -
> 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestReplicationScenarios.java
>  e29aa22 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/InsertMessage.java
>  fe747df 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONInsertMessage.java
>  bd9f9ec 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONMessageFactory.java
>  9954902 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ReplCopyTask.java 4c0f817 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/EximUtil.java 6e9602f 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ExportSemanticAnalyzer.java 
> f61274b 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 
> 5561e06 
>   
> ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java 
> 9b83407 
> 
> Diff: https://reviews.apache.org/r/55154/diff/
> 
> 
> Testing
> ---
> 
> 
> 

[jira] [Created] (HIVE-15540) java.sql.SQLException: Method not supported is thrown for DatabaseMetadata.getURL() and DatabaseMetadata.getUserName

2017-01-04 Thread QING ZHU (JIRA)
QING ZHU created HIVE-15540:
---

 Summary: java.sql.SQLException: Method not supported is thrown for 
DatabaseMetadata.getURL() and DatabaseMetadata.getUserName
 Key: HIVE-15540
 URL: https://issues.apache.org/jira/browse/HIVE-15540
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 1.2.1
 Environment: Windows
Reporter: QING ZHU


Using Hive_Hadoop version "1.2.1_fromHDP2.5"
1.) hadoop-auth-2.7.3.2.5.0.0-1245.jar
2.) hadoop-common-2.7.3.2.5.0.0-1245.jar
3.) hive-jdbc-1.2.1000.2.5.0.0-1245-standalone.jar

1.) java.sql.SQLException: Method not supported
    at org.apache.hive.jdbc.HiveDatabaseMetaData.getURL(HiveDatabaseMetaData.java:749)

2.) java.sql.SQLException: Method not supported
    at org.apache.hive.jdbc.HiveDatabaseMetaData.getUserName(HiveDatabaseMetaData.java:753)







Re: Invitation for Hive committers to become ORC committers

2017-01-04 Thread Owen O'Malley
Ferd, I've added you.

Suneel, I'm sorry, but the offer is limited to current Hive committers.
http://people.apache.org/phonebook.html?unix=hive

.. Owen

On Mon, Jan 2, 2017 at 6:39 PM, Suneel Jakka  wrote:

> Hi Owen,
>
> Am also interested.
>
> Regards,
> Suneel Jakka
>
>
> On Mon, Jan 2, 2017 at 8:19 PM, Xu, Cheng A  wrote:
>
> > Hi Owen,
> > Sorry for my late response. I'm also interested.
> >
> > Thanks,
> > Ferd
> >
> > -Original Message-
> > From: Owen O'Malley [mailto:omal...@apache.org]
> > Sent: Friday, December 23, 2016 11:55 AM
> > To: dev@hive.apache.org
> > Subject: Re: Invitation for Hive committers to become ORC committers
> >
> > Ok, I believe that I have got everyone. If you don't have karma as shown
> > here: http://people.apache.org/phonebook.html?unix=orc
> >
> > Please, let me know. I believe I have also updated the ORC website with
> > everyone.
> >
> > Thanks,
> >Owen
> >
> > On Sat, Dec 17, 2016 at 5:16 AM, Lars Francke 
> > wrote:
> >
> > > Hi Owen,
> > >
> > > I'm also interested.
> > >
> > > Thanks,
> > > Lars
> > >
> > > On Fri, Dec 16, 2016 at 10:20 PM, Sergio Pena
> > > 
> > > wrote:
> > >
> > > > Hi Owen,
> > > >
> > > > I'm also interested.
> > > > - Sergio
> > > >
> > > > On Fri, Dec 16, 2016 at 11:39 AM, Daniel Dai 
> > > > wrote:
> > > >
> > > > > I am interested.
> > > > >
> > > > > Thanks,
> > > > > Daniel
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On 12/15/16, 1:12 PM, "Owen O'Malley"  wrote:
> > > > >
> > > > > >All,
> > > > > >   As you are aware, we are in the last stages of removing the
> > > > > >forked
> > > > ORC
> > > > > >code out of Hive. The goal of moving ORC out of Hive was to
> > > > > >increase
> > > its
> > > > > >community and we want to be very deliberately inclusive of the
> > > > > >Hive development community. Towards that end, the ORC PMC wants
> > > > > >to welcome anyone who is already a Hive committer to become a
> > committer on ORC.
> > > > > >
> > > > > >  Please respond on this thread to let us know if you are
> > interested.
> > > > > >
> > > > > >Thanks,
> > > > > >   Owen on behalf of the ORC PMC
> > > > >
> > > >
> > >
> >
>


[jira] [Created] (HIVE-15539) Optimize complex multi-insert queries in Calcite

2017-01-04 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-15539:
--

 Summary: Optimize complex multi-insert queries in Calcite
 Key: HIVE-15539
 URL: https://issues.apache.org/jira/browse/HIVE-15539
 Project: Hive
  Issue Type: Improvement
  Components: Parser
Affects Versions: 2.2.0
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Currently multi-insert queries are not optimized by Calcite. Proper integration 
with Calcite would include creating a _spool_ operator whose output is reused 
by every _insert_ statement; however, the _spool_ operator has not been added to 
Calcite yet (CALCITE-481).

In the meantime, and since the complex logic for multi-insert queries is in the 
FROM clause, we can optimize the FROM clause with Calcite and connect the 
optimized result to the original query.

Initially, we will recognize three different cases:
- FROM clause is trivial, e.g., table reference, or not supported. No need to 
optimize with Calcite.
- FROM clause is a subquery. Optimize the subquery with Calcite.
- FROM clause is a join. Rewrite join into a subquery and optimize it with 
Calcite. Change references in INSERT statements to refer to subquery columns.

This should be beneficial for MERGE statement processing too, since MERGE 
statements are treated as multi-insert queries by Hive.
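
A sketch of the third case above, where the FROM clause is a join (tables src, dim, t1, t2 are hypothetical):

{code}
from src a join dim b on (a.k = b.k)
insert overwrite table t1
  select a.k, b.v where b.v > 0
insert overwrite table t2
  select a.k, count(*) group by a.k;
{code}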





[jira] [Created] (HIVE-15538) Test HIVE-13884 with more complex query predicates

2017-01-04 Thread Marta Kuczora (JIRA)
Marta Kuczora created HIVE-15538:


 Summary: Test HIVE-13884 with more complex query predicates
 Key: HIVE-15538
 URL: https://issues.apache.org/jira/browse/HIVE-15538
 Project: Hive
  Issue Type: Test
Reporter: Marta Kuczora
Assignee: Marta Kuczora


HIVE-13884 introduced a new property, hive.metastore.limit.partition.request. It 
would be good to have more tests covering the cases where the query predicates 
(such as LIKE, IN) cannot be pushed down, to verify that the fallback from 
direct SQL to ORM works properly when hive.metastore.try.direct.sql is enabled.
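
Hypothetical examples of predicates (on a table part_tbl partitioned by ds) that may not be pushed down to direct SQL:

{code}
select * from part_tbl where ds like '2016-12-%';
select * from part_tbl where ds in ('2016-12-01', '2016-12-02');
{code}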





Re: Review Request 55154: HIVE-15366: REPL LOAD & DUMP support for incremental INSERT events

2017-01-04 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55154/
---

(Updated Jan. 4, 2017, 12:59 p.m.)


Review request for hive, Daniel Dai, Sushanth Sowmyan, and Thejas Nair.


Bugs: HIVE-15366
https://issues.apache.org/jira/browse/HIVE-15366


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-15366


Diffs (updated)
-

  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestReplicationScenarios.java
 e29aa22 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/InsertMessage.java
 fe747df 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONInsertMessage.java
 bd9f9ec 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONMessageFactory.java
 9954902 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReplCopyTask.java 4c0f817 
  ql/src/java/org/apache/hadoop/hive/ql/parse/EximUtil.java 6e9602f 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ExportSemanticAnalyzer.java 
f61274b 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 
5561e06 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java 
9b83407 

Diff: https://reviews.apache.org/r/55154/diff/


Testing
---


Thanks,

Vaibhav Gumashta



Re: Review Request 55154: HIVE-15366: REPL LOAD & DUMP support for incremental INSERT events

2017-01-04 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55154/#review160476
---




ql/src/java/org/apache/hadoop/hive/ql/parse/EximUtil.java 


Sorry, this change was left over from some of the debugging I was doing. 
Thanks for catching.


- Vaibhav Gumashta


On Jan. 3, 2017, 10:27 p.m., Vaibhav Gumashta wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55154/
> ---
> 
> (Updated Jan. 3, 2017, 10:27 p.m.)
> 
> 
> Review request for hive, Daniel Dai, Sushanth Sowmyan, and Thejas Nair.
> 
> 
> Bugs: HIVE-15366
> https://issues.apache.org/jira/browse/HIVE-15366
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> https://issues.apache.org/jira/browse/HIVE-15366
> 
> 
> Diffs
> -
> 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/InsertMessage.java
>  fe747df 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONInsertMessage.java
>  bd9f9ec 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONMessageFactory.java
>  9954902 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ReplCopyTask.java 4c0f817 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/EximUtil.java 6e9602f 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ExportSemanticAnalyzer.java 
> f61274b 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 
> 5561e06 
>   
> ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java 
> 9b83407 
> 
> Diff: https://reviews.apache.org/r/55154/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Vaibhav Gumashta
> 
>



[GitHub] hive pull request #128: HIVE-15473 Progress Bar on Beeline client

2017-01-04 Thread anishek
GitHub user anishek opened a pull request:

https://github.com/apache/hive/pull/128

HIVE-15473 Progress Bar on Beeline client

Use a common strategy for rendering the in-place updates from both the Hive 
CLI and Beeline. The various summary updates shown once the Tez job completes 
are no longer rendered with fancy colors. There is a possible race where the 
logRunnable thread requests a progress update from the server before the 
session state has been updated with the relevant object (TezJobMonitor in 
this case); in that situation no progress bar is displayed.

Additionally, in local testing with a simple query, the one-second wait 
interval between calls meant the query finished before the bar advanced, so 
the progress bar appeared stuck until the results arrived. For larger queries 
the lag should be less noticeable; polling the server more frequently would 
avoid it, but would just overload the server.
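The polling strategy described above can be sketched roughly as follows. This is an illustrative standalone example, not the actual Hive/Beeline code: the class and interface names (`ProgressPoller`, `ProgressSource`) are hypothetical, and the real client talks to HiveServer2 rather than an in-process callback. It shows the two behaviors the description mentions: sleeping before each call so the server-side session state has a chance to be set up, and rendering nothing while progress is not yet available.

```java
// Hypothetical sketch of the Beeline-side polling loop (names are
// illustrative, not the real Hive API). Sleep first so the server-side
// session state (e.g. TezJobMonitor) can be registered, then poll until
// the operation reports completion, skipping renders while progress is
// unavailable.
public class ProgressPoller {

    interface ProgressSource {
        /** Returns percent complete, or -1 if progress is not yet available. */
        int progress();
    }

    /**
     * Polls until 100% complete, appending each rendered update to
     * {@code rendered}. Returns the number of calls made to the source.
     */
    static int poll(ProgressSource source, StringBuilder rendered, long intervalMs) {
        int calls = 0;
        while (true) {
            try {
                Thread.sleep(intervalMs); // sleep first: let session state settle
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return calls; // interrupted: stop polling
            }
            calls++;
            int pct = source.progress();
            if (pct < 0) {
                continue; // monitor not registered yet: show no progress bar
            }
            rendered.append(pct).append('%').append('\n');
            if (pct >= 100) {
                return calls;
            }
        }
    }

    public static void main(String[] args) {
        // Simulated server: no monitor on the first call, then 50%, then 100%.
        java.util.concurrent.atomic.AtomicInteger call =
                new java.util.concurrent.atomic.AtomicInteger();
        ProgressSource fake = () -> {
            int n = call.getAndIncrement();
            return n == 0 ? -1 : (n == 1 ? 50 : 100);
        };
        StringBuilder out = new StringBuilder();
        int calls = poll(fake, out, 1);
        System.out.println("calls=" + calls);
        System.out.print(out);
    }
}
```

With the simulated source above, the first poll finds no progress (nothing is rendered), and the loop exits after the 100% update; a shorter interval trades responsiveness for more server calls, which is the overload concern raised in the description.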

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/anishek/hive master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/128.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #128


commit 7038c0ccc81273cd1b0872735908772d56d764eb
Author: anishek 
Date:   2017-01-04T10:03:05Z

HIVE-15473 Progress Bar on Beeline client

commit a2ff2f9e52174f86227ebb539bfba0a08c830cb2
Author: anishek 
Date:   2017-01-04T10:15:55Z

HIVE-15473  Progress Bar on Beeline client

removing an unnecessary dependency and making sure we show progress only if 
available




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] hive pull request #127: HIVE-15473 Progress Bar on Beeline client

2017-01-04 Thread anishek
Github user anishek closed the pull request at:

https://github.com/apache/hive/pull/127




[GitHub] hive pull request #127: HIVE-15437 Progress Bar on Beeline client

2017-01-04 Thread anishek
GitHub user anishek opened a pull request:

https://github.com/apache/hive/pull/127

HIVE-15437 Progress Bar on Beeline client

Use a common strategy for rendering the in-place updates from both the Hive 
CLI and Beeline. The various summary updates shown once the Tez job completes 
are no longer rendered with fancy colors. There is a possible race where the 
logRunnable thread requests a progress update from the server before the 
session state has been updated with the relevant object (TezJobMonitor in 
this case); in that situation no progress bar is displayed. 


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/anishek/hive master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/127.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #127


commit f432b068dd843d5093da2ba138382e3fe27c6b72
Author: anishek 
Date:   2016-12-26T06:39:37Z

HIVE-15437 Progress Bar on Beeline client

Initial set of changes to add a remote API to get the progress bar as a 
serialized object, plus Beeline client-side changes to allow printing the 
information

commit 9cb1a3a6b12be8f6b50aafcb13b6958e4a449f7b
Author: anishek 
Date:   2016-12-26T08:57:12Z

HIVE-15437 Progress Bar on Beeline client

DDLs no longer try to show a progress bar, and no call to the server is made 
if the operation handle is not available

commit 1b32f38afc78762921376af8662cc9b663e859e1
Author: anishek 
Date:   2016-12-28T08:31:23Z

HIVE-15437 Progress Bar on Beeline client

correcting a divide-by-zero error

commit d33341a95b617af887b2a48578833fd4564ffdd3
Author: anishek 
Date:   2016-12-30T09:47:05Z

HIVE-15437 Progress Bar on Beeline client

adding ability to show the progress bar within tez job monitor

commit e50e71f5ac85c14f51d2b312c7f3a7b44ac57c7b
Author: anishek 
Date:   2017-01-03T09:20:48Z

HIVE-15437 Progress Bar on Beeline client

Print the correct vertex status. Beeline has no client-side configuration for 
in-place updates; the setting lives on the server side, and when it is off 
there, Beeline's progress bar is on. Stateful progress bar printing on the 
Hive CLI.

commit 30ada9d6857995f564a4d1852778894c049b4051
Author: anishek 
Date:   2017-01-04T06:43:46Z

HIVE-15437 Progress Bar on Beeline client

Providing a PrintStream to be used for rendering. By sleeping at the 
beginning of each log update, we try to make sure that the session state has 
been set up correctly by the execute call, so we can get the correct progress 
bar information from the server.

commit bc39731a24d72675eb04d918d31339a58891a740
Author: anishek 
Date:   2017-01-04T08:16:57Z

HIVE-15437 Progress Bar on Beeline client

progressStatus returns null only if query execution is complete or failed; 
otherwise it throws an exception

commit a8e212a565e17f38e479a68b958de81d2670d6fd
Author: anishek 
Date:   2017-01-04T08:55:36Z

HIVE-15437 Progress Bar on Beeline client

State is maintained on the client side for rendering; logs are rendered at 
the end, once execution is over.

commit f4c6dbdc50142b77a1776b7c021a100fe7ca8ab7
Author: anishek 
Date:   2016-12-26T06:39:37Z

HIVE-15437 Progress Bar on Beeline client

Initial set of changes to add a remote API to get the progress bar as a 
serialized object, plus Beeline client-side changes to allow printing the 
information

commit e6e57a56e13da6ad0e72d742f5f46480486bb41f
Author: anishek 
Date:   2016-12-26T08:57:12Z

HIVE-15437 Progress Bar on Beeline client

DDLs no longer try to show a progress bar, and no call to the server is made 
if the operation handle is not available

commit 8538919c2c69432cf78135e147eda858ae12d152
Author: anishek 
Date:   2016-12-28T08:31:23Z

HIVE-15437 Progress Bar on Beeline client

correcting a divide-by-zero error

commit 3850384f61feb28aa1de12e5f97c0f63adcd0656
Author: anishek 
Date:   2016-12-30T09:47:05Z

HIVE-15437 Progress Bar on Beeline client

adding ability to show the progress bar within tez job monitor

commit 774e6d911a3173e1e5729fb860cec8dab5883e2c
Author: anishek 
Date:   2017-01-03T09:20:48Z

HIVE-15437 Progress Bar on Beeline client

Print the correct vertex status. Beeline has no client-side configuration for 
in-place updates; the setting lives on the server side, and when it is off 
there, Beeline's progress bar is on. Stateful progress bar printing on the 
Hive CLI.

commit 87ba333c66b67436858ac148784466f2572d561a
Author: anishek 
Date:   2017-01-04T06:43:46Z

HIVE-15437 Progress Bar on Beeline client

Providing PrintStream to be used for rendering, with