[jira] [Commented] (HIVE-16677) CTAS with no data fails in Druid

2017-10-10 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198291#comment-16198291
 ] 

Hive QA commented on HIVE-16677:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12891186/HIVE-16677.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 11199 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[druid_basic1] (batchId=8)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[druid_basic2] 
(batchId=11)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[druid_intervals] 
(batchId=22)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[druid_timeseries] 
(batchId=57)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[druid_topn] (batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[spark_local_queries] 
(batchId=64)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[varchar_join1] 
(batchId=5)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan]
 (batchId=162)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1]
 (batchId=171)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=239)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query23] 
(batchId=239)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7206/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7206/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7206/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12891186 - PreCommit-HIVE-Build

> CTAS with no data fails in Druid
> 
>
> Key: HIVE-16677
> URL: https://issues.apache.org/jira/browse/HIVE-16677
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-16677.patch
>
>
> If we create a table in Druid using a CTAS statement and the query executed 
> to create the table produces no data, we fail with the following exception:
> {noformat}
> druid.DruidStorageHandler: Exception while commit
> java.io.FileNotFoundException: File 
> /tmp/workingDirectory/.staging-jcamachorodriguez_20170515053123_835c394b-2157-4f6b-bfed-a2753acd568e/segmentsDescriptorDir
>  does not exist.
> ...
> {noformat}
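The usual fix pattern for this class of bug is to treat a missing staging/descriptor directory as "the query produced zero segments" instead of letting the FileNotFoundException escape from commit. A minimal, self-contained sketch of that guard (java.nio.file stands in for Hive's Hadoop FileSystem calls; names are illustrative, not the actual patch):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SegmentDirGuard {
    // Return the segment descriptor files under the directory, treating a
    // missing directory (CTAS produced no rows, so nothing was staged) as
    // "no segments" instead of failing the commit.
    static List<Path> listSegmentDescriptors(Path segmentsDescriptorDir) {
        if (!Files.isDirectory(segmentsDescriptorDir)) {
            return Collections.emptyList(); // empty result set: commit 0 segments
        }
        List<Path> descriptors = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(segmentsDescriptorDir)) {
            for (Path p : stream) {
                descriptors.add(p);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return descriptors;
    }

    public static void main(String[] args) {
        Path missing = Paths.get("/tmp/missing-" + System.nanoTime());
        // No exception: a missing staging dir simply yields zero descriptors.
        System.out.println(listSegmentDescriptors(missing).size()); // prints 0
    }
}
```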



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-11548) HCatLoader should support predicate pushdown.

2017-10-10 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198352#comment-16198352
 ] 

Hive QA commented on HIVE-11548:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12891207/HIVE-11548.6-branch-2.2.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 59 failed/errored test(s), 9944 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=244)
TestJdbcDriver2 - did not produce a TEST-*.xml file (likely timed out) 
(batchId=225)
TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed 
out) (batchId=167)
[acid_globallimit.q,alter_merge_2_orc.q]
TestMiniSparkOnYarnCliDriver - did not produce a TEST-*.xml file (likely timed 
out) (batchId=173)

[infer_bucket_sort_reducers_power_two.q,list_bucket_dml_10.q,orc_merge9.q,orc_merge6.q,leftsemijoin_mr.q,bucket6.q,bucketmapjoin7.q,uber_reduce.q,empty_dir_in_table.q,vector_outer_join3.q,index_bitmap_auto.q,vector_outer_join2.q,vector_outer_join1.q,orc_merge1.q,orc_merge_diff_fs.q,load_hdfs_file_with_space_in_the_name.q,scriptfile1_win.q,quotedid_smb.q,truncate_column_buckets.q,orc_merge3.q]
TestMiniSparkOnYarnCliDriver - did not produce a TEST-*.xml file (likely timed 
out) (batchId=174)

[infer_bucket_sort_num_buckets.q,gen_udf_example_add10.q,insert_overwrite_directory2.q,orc_merge5.q,bucketmapjoin6.q,import_exported_table.q,vector_outer_join0.q,orc_merge4.q,temp_table_external.q,orc_merge_incompat1.q,root_dir_external_table.q,constprog_semijoin.q,auto_sortmerge_join_16.q,schemeAuthority.q,index_bitmap3.q,external_table_with_space_in_location_path.q,parallel_orderby.q,infer_bucket_sort_map_operators.q,bucketizedhiveinputformat.q,remote_script.q]
TestMiniSparkOnYarnCliDriver - did not produce a TEST-*.xml file (likely timed 
out) (batchId=175)

[scriptfile1.q,vector_outer_join5.q,file_with_header_footer.q,bucket4.q,input16_cc.q,bucket5.q,infer_bucket_sort_merge.q,constprog_partitioner.q,orc_merge2.q,reduce_deduplicate.q,schemeAuthority2.q,load_fs2.q,orc_merge8.q,orc_merge_incompat2.q,infer_bucket_sort_bucketed_table.q,vector_outer_join4.q,disable_merge_for_bucketing.q,vector_inner_join.q,orc_merge7.q]
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=118)

[bucketmapjoin4.q,bucket_map_join_spark4.q,union21.q,groupby2_noskew.q,timestamp_2.q,date_join1.q,mergejoins.q,smb_mapjoin_11.q,auto_sortmerge_join_3.q,mapjoin_test_outer.q,vectorization_9.q,merge2.q,groupby6_noskew.q,auto_join_without_localtask.q,multi_join_union.q]
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=119)

[join_cond_pushdown_unqual4.q,union_remove_7.q,join13.q,join_vc.q,groupby_cube1.q,bucket_map_join_spark2.q,sample3.q,smb_mapjoin_19.q,stats16.q,union23.q,union.q,union31.q,cbo_udf_udaf.q,ptf_decimal.q,bucketmapjoin2.q]
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=120)

[parallel_join1.q,union27.q,union12.q,groupby7_map_multi_single_reducer.q,varchar_join1.q,join7.q,join_reorder4.q,skewjoinopt2.q,bucketsortoptimize_insert_2.q,smb_mapjoin_17.q,script_env_var1.q,groupby7_map.q,groupby3.q,bucketsortoptimize_insert_8.q,union20.q]
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=121)

[ptf_general_queries.q,auto_join_reordering_values.q,sample2.q,join1.q,decimal_join.q,mapjoin_subquery2.q,join32_lessSize.q,mapjoin1.q,order2.q,skewjoinopt18.q,union_remove_18.q,join25.q,groupby9.q,bucketsortoptimize_insert_6.q,ctas.q]
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=122)

[groupby_map_ppr.q,nullgroup4_multi_distinct.q,join_rc.q,union14.q,smb_mapjoin_12.q,vector_cast_constant.q,union_remove_4.q,auto_join11.q,load_dyn_part7.q,udaf_collect_set.q,vectorization_12.q,groupby_sort_skew_1.q,groupby_sort_skew_1_23.q,smb_mapjoin_25.q,skewjoinopt12.q]
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=123)

[skewjoinopt15.q,auto_join18.q,list_bucket_dml_2.q,input1_limit.q,load_dyn_part3.q,union_remove_14.q,auto_sortmerge_join_14.q,auto_sortmerge_join_15.q,union10.q,bucket_map_join_tez2.q,groupby5_map_skew.q,join_reorder.q,sample1.q,bucketmapjoin8.q,union34.q]
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=124)

[avro_joins.q,skewjoinopt16.q,auto_join14.q,vectorization_14.q,auto_join26.q,stats1.q,cbo_stats.q,auto_sortmerge_join_6.q,union22.q,union_remove_24.q,union_view.q,smb_mapjoin_22.q,stats15.q,ptf_matchpath.q,transform_ppr1.q]
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=125)

[limit_pushdown2.q,skewjoin_no

[jira] [Updated] (HIVE-17746) Regenerate spark_explainuser_1.q.out

2017-10-10 Thread Peter Vary (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-17746:
--
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master.
Thanks for the quick review [~vgarg]!

> Regenerate spark_explainuser_1.q.out
> 
>
> Key: HIVE-17746
> URL: https://issues.apache.org/jira/browse/HIVE-17746
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Affects Versions: 3.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
> Fix For: 3.0.0
>
> Attachments: HIVE-17746.patch
>
>
> There are 2 changes in spark_explainuser_1.q.out:
> 1. After HIVE-17465, the row numbers are different in the explain plans. 
> [~vgarg], [~ashutoshc]: Could you please check whether it is an intended 
> change?
> 2. After HIVE-17535, CBO optimization is turned on and the output of the 
> following query changed:
> {code:title=Query}
> explain select explode(array('a', 'b'));
> {code}
> {code:title=Original}
>  POSTHOOK: query: explain select explode(array('a', 'b'))
>  POSTHOOK: type: QUERY
>  Plan not optimized by CBO.
>  
>  Stage-0
>Fetch Operator
>  limit:-1
> UDTF Operator [UDTF_2]
>   function name:explode
>   Select Operator [SEL_1]
> Output:["_col0"]
> TableScan [TS_0]
> {code}
> {code:title=New}
>  POSTHOOK: query: explain select explode(array('a', 'b'))
>  POSTHOOK: type: QUERY
>  Plan optimized by CBO.
>  
>  Stage-0
>Fetch Operator
>  limit:-1
> Select Operator [SEL_3]
>   Output:["_col0"]
>   UDTF Operator [UDTF_2]
> function name:explode
> Select Operator [SEL_1]
>   Output:["_col0"]
>   TableScan [TS_0]
> {code}
> This 2nd change does not look like a successful optimization to me. Is it 
> planned? :)
> If you think these are planned changes, then I think it would be good to 
> update the golden file.
> Thanks,
> Peter





[jira] [Commented] (HIVE-17733) Move RawStore to standalone metastore

2017-10-10 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198365#comment-16198365
 ] 

Zoltan Haindrich commented on HIVE-17733:
-

[~alangates] I guess metastoreutils is ok for now...I think we should try to 
remove as much stuff from that class as possible later (last time I played with 
metastore client separation, the metastoreutils ended up as part of the public 
api) - I think the most important thing is to avoid the appearance of 
HMSHandler in classes which will later be part of the public api; that would be 
beneficial.

I think we should probably file some followups later to remove undesired 
options from the metastore (which will otherwise cause trouble later, like this 
one):
https://github.com/apache/hive/pull/258/files?w=1#diff-61434f2dd7a5fe334625e19ef2405ed3L592
I feel that statistics gathering is kinda out of scope for the metastore.

I've seen a possibly notable thing:
https://github.com/apache/hive/pull/258/files?w=1#diff-3d07754b1de8dc5a50a3395b7d6f9045L513
the old one permitted passing all kinds of settings to the persistence layer; 
the new version adds an extra "filtering" on top of it. Instead of 
re-declaring all kinds of jdo/jpa/xxx args in metastore's conf, we should 
provide some generic way to pass props to the jpa implementation; I would like 
to propose that we pass a subtree of settings to the jpa implementation 
(without the sub-tree prefix) - although I'm not sure how often these 
properties are used by the end user.
Beyond this, everything looks good.
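The prefix-subtree idea above can be sketched with a hypothetical helper (the prefix name and config keys are made up for illustration; nothing here is the actual metastore API):

```java
import java.util.HashMap;
import java.util.Map;

public class ConfSubtree {
    // Hypothetical helper: collect every property under `prefix` and hand the
    // remainder of each key (prefix stripped) to the persistence layer, instead
    // of re-declaring each jdo/jpa option individually in the metastore conf.
    static Map<String, String> subtree(Map<String, String> conf, String prefix) {
        Map<String, String> out = new HashMap<>();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            if (e.getKey().startsWith(prefix)) {
                out.put(e.getKey().substring(prefix.length()), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("metastore.jpa.javax.jdo.option.ConnectionURL", "jdbc:derby:memory:db");
        conf.put("metastore.warehouse.dir", "/warehouse");
        // Only the (hypothetical) metastore.jpa.* subtree reaches the JPA
        // implementation, with the prefix removed:
        System.out.println(subtree(conf, "metastore.jpa."));
    }
}
```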


> Move RawStore to standalone metastore
> -
>
> Key: HIVE-17733
> URL: https://issues.apache.org/jira/browse/HIVE-17733
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
>  Labels: pull-request-available
> Attachments: HIVE-17733.2.patch, HIVE-17733.patch
>
>
> This includes moving implementations of RawStore (like ObjectStore), 
> MetastoreDirectSql, and stats related classes like ColumnStatsAggregator and 
> the NDV classes.





[jira] [Assigned] (HIVE-17756) Introduce subquery test case for Hive on Spark

2017-10-10 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun reassigned HIVE-17756:
-


> Introduce subquery test case for Hive on Spark
> --
>
> Key: HIVE-17756
> URL: https://issues.apache.org/jira/browse/HIVE-17756
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
>






[jira] [Updated] (HIVE-17756) Introduce subquery test case for Hive on Spark

2017-10-10 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-17756:
--
Status: Patch Available  (was: Open)

> Introduce subquery test case for Hive on Spark
> --
>
> Key: HIVE-17756
> URL: https://issues.apache.org/jira/browse/HIVE-17756
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
>






[jira] [Updated] (HIVE-17756) Introduce subquery test case for Hive on Spark

2017-10-10 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-17756:
--
Attachment: HIVE-17756.001.patch

Attached the patch

> Introduce subquery test case for Hive on Spark
> --
>
> Key: HIVE-17756
> URL: https://issues.apache.org/jira/browse/HIVE-17756
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Attachments: HIVE-17756.001.patch
>
>






[jira] [Commented] (HIVE-15860) RemoteSparkJobMonitor may hang when RemoteDriver exits abnormally

2017-10-10 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198434#comment-16198434
 ] 

Rui Li commented on HIVE-15860:
---

Hi [~stakiar], I agree it's good to make QUEUED/SENT fail faster. But I still 
want to avoid the check in "normal" cases because as you said, each RPC call is 
doing the check already. Anyway, please feel free to open the JIRA.

> RemoteSparkJobMonitor may hang when RemoteDriver exits abnormally
> -
>
> Key: HIVE-15860
> URL: https://issues.apache.org/jira/browse/HIVE-15860
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Rui Li
> Fix For: 2.3.0
>
> Attachments: HIVE-15860.1.patch, HIVE-15860.2.patch, 
> HIVE-15860.2.patch
>
>
> It happens when RemoteDriver crashes between {{JobStarted}} and 
> {{JobSubmitted}}, e.g. killed by {{kill -9}}. RemoteSparkJobMonitor will 
> consider that the job has started; however, it can't get the job info because 
> it hasn't received the JobId. Then the monitor will loop forever.





[jira] [Commented] (HIVE-17139) Conditional expressions optimization: skip the expression evaluation if the condition is not satisfied for vectorization engine.

2017-10-10 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198420#comment-16198420
 ] 

Hive QA commented on HIVE-17139:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12890998/HIVE-17139.20.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 11149 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[spark_local_queries] 
(batchId=64)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan]
 (batchId=162)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=239)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query23] 
(batchId=239)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.org.apache.hadoop.hive.ql.parse.TestReplicationScenarios
 (batchId=219)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7208/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7208/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7208/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12890998 - PreCommit-HIVE-Build

> Conditional expressions optimization: skip the expression evaluation if the 
> condition is not satisfied for vectorization engine.
> 
>
> Key: HIVE-17139
> URL: https://issues.apache.org/jira/browse/HIVE-17139
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ke Jia
>Assignee: Ke Jia
> Attachments: HIVE-17139.1.patch, HIVE-17139.10.patch, 
> HIVE-17139.11.patch, HIVE-17139.12.patch, HIVE-17139.13.patch, 
> HIVE-17139.13.patch, HIVE-17139.14.patch, HIVE-17139.15.patch, 
> HIVE-17139.16.patch, HIVE-17139.17.patch, HIVE-17139.18.patch, 
> HIVE-17139.18.patch, HIVE-17139.19.patch, HIVE-17139.2.patch, 
> HIVE-17139.20.patch, HIVE-17139.3.patch, HIVE-17139.4.patch, 
> HIVE-17139.5.patch, HIVE-17139.6.patch, HIVE-17139.7.patch, 
> HIVE-17139.8.patch, HIVE-17139.9.patch
>
>
> The CASE WHEN and IF statement execution for Hive vectorization is not 
> optimal: in the current implementation, all the conditional and else 
> expressions are evaluated. The optimized approach is to update the selected 
> array of the batch parameter after the conditional expression is executed. 
> Then the else expression will only process the selected rows instead of all 
> of them.
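The selected-array approach described above can be illustrated with a toy model (plain arrays stand in for Hive's VectorizedRowBatch; this is a sketch of the technique, not the patch):

```java
import java.util.Arrays;

public class SelectedVectorSketch {
    // Toy model of the proposed approach: evaluate the THEN branch only on the
    // rows where the condition holds, record the remaining rows in a "selected"
    // array, and run the ELSE branch on just those, rather than on every row.
    static long[] caseWhen(long[] col, long threshold) {
        long[] out = new long[col.length];
        int[] selected = new int[col.length];
        int n = 0;
        for (int i = 0; i < col.length; i++) {
            if (col[i] > threshold) {
                out[i] = col[i] * 2;   // THEN branch: condition satisfied
            } else {
                selected[n++] = i;     // defer this row to the ELSE branch
            }
        }
        for (int j = 0; j < n; j++) {  // ELSE branch: selected rows only
            int i = selected[j];
            out[i] = -col[i];
        }
        return out;
    }

    public static void main(String[] args) {
        // Rows > 3 are doubled by THEN; the rest are negated by ELSE.
        System.out.println(Arrays.toString(caseWhen(new long[]{1, 5, 2, 9}, 3)));
        // prints [-1, 10, -2, 18]
    }
}
```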





[jira] [Assigned] (HIVE-17757) REPL LOAD should use customised configurations to execute distcp/remote copy.

2017-10-10 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan reassigned HIVE-17757:
---


> REPL LOAD should use customised configurations to execute distcp/remote copy.
> -
>
> Key: HIVE-17757
> URL: https://issues.apache.org/jira/browse/HIVE-17757
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
>
> As the REPL LOAD command needs to read the repl dump directory and data files 
> from the source cluster, it needs to use some of the configurations to read 
> data securely through distcp.
> Some of the HDFS configurations cannot be added to the whitelist as they pose 
> a security threat. So, it is necessary for the REPL LOAD command to take such 
> configs as input and use them when triggering distcp.
> *Proposed syntax:*
> REPL LOAD [.] FROM  [*WITH (key1=value1, 
> key2=value2)*];





[jira] [Updated] (HIVE-17757) REPL LOAD should use customised configurations to execute distcp/remote copy.

2017-10-10 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17757:

Description: 
As the REPL LOAD command needs to read the repl dump directory and data files 
from the source cluster, it needs to use some of the configurations to read 
data securely through distcp.
Some of the HDFS configurations cannot be added to the whitelist as they pose a 
security threat. So, it is necessary for the REPL LOAD command to take such 
configs as input and use them when triggering distcp.
*Proposed syntax:*
REPL LOAD [.] FROM  [WITH (key1=value1, 
key2=value2)];

  was:
As REPL LOAD command needs to read repl dump directory and data files from 
source cluster, it needs to use some of the configurations to read data 
securely through distcp.
Some of the HDFS configurations cannot be added to whitelist as they pose 
security threat. So, it is necessary for REPL LOAD command to take such configs 
as input and use it when trigger distcp.
*Proposed syntax:*
REPL LOAD [.] FROM  [*WITH (key1=value1, 
key2=value2)*];


> REPL LOAD should use customised configurations to execute distcp/remote copy.
> -
>
> Key: HIVE-17757
> URL: https://issues.apache.org/jira/browse/HIVE-17757
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
>
> As the REPL LOAD command needs to read the repl dump directory and data files 
> from the source cluster, it needs to use some of the configurations to read 
> data securely through distcp.
> Some of the HDFS configurations cannot be added to the whitelist as they pose 
> a security threat. So, it is necessary for the REPL LOAD command to take such 
> configs as input and use them when triggering distcp.
> *Proposed syntax:*
> REPL LOAD [.] FROM  [WITH (key1=value1, 
> key2=value2)];





[jira] [Work started] (HIVE-17757) REPL LOAD should use customised configurations to execute distcp/remote copy.

2017-10-10 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-17757 started by Sankar Hariappan.
---
> REPL LOAD should use customised configurations to execute distcp/remote copy.
> -
>
> Key: HIVE-17757
> URL: https://issues.apache.org/jira/browse/HIVE-17757
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
>
> As the REPL LOAD command needs to read the repl dump directory and data files 
> from the source cluster, it needs to use some of the configurations to read 
> data securely through distcp.
> Some of the HDFS configurations cannot be added to the whitelist as they pose 
> a security threat. So, it is necessary for the REPL LOAD command to take such 
> configs as input and use them when triggering distcp.
> *Proposed syntax:*
> REPL LOAD [.] FROM  [WITH (key1=value1, 
> key2=value2)];





[jira] [Commented] (HIVE-17754) InputJobInfo in Pig UDFContext is heavyweight, and causes OOMs in Tez AMs

2017-10-10 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198504#comment-16198504
 ] 

Hive QA commented on HIVE-17754:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12891216/HIVE-17754.1.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 11197 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[spark_local_queries] 
(batchId=64)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan]
 (batchId=162)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=240)
org.apache.hive.hcatalog.pig.TestAvroHCatLoader.testGetInputBytes (batchId=184)
org.apache.hive.hcatalog.pig.TestOrcHCatLoader.testGetInputBytes (batchId=184)
org.apache.hive.hcatalog.pig.TestParquetHCatLoader.testGetInputBytes 
(batchId=184)
org.apache.hive.hcatalog.pig.TestRCFileHCatLoader.testGetInputBytes 
(batchId=184)
org.apache.hive.hcatalog.pig.TestSequenceFileHCatLoader.testGetInputBytes 
(batchId=184)
org.apache.hive.hcatalog.pig.TestTextFileHCatLoader.testGetInputBytes 
(batchId=184)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7209/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7209/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7209/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12891216 - PreCommit-HIVE-Build

> InputJobInfo in Pig UDFContext is heavyweight, and causes OOMs in Tez AMs
> -
>
> Key: HIVE-17754
> URL: https://issues.apache.org/jira/browse/HIVE-17754
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 2.2.0, 3.0.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-17754.1.patch
>
>
> HIVE-9845 dealt with reducing the size of HCat split-info, to improve 
> job-launch times for Pig/HCat jobs.
> For large Pig queries that scan a large number of Hive partitions, it was 
> found that the Pig {{UDFContext}} stored full-fat HCat {{InputJobInfo}} 
> objects, thus blowing out the Pig Tez AM. Since this information is already 
> stored in the {{HCatSplit}}, the serialization of {{InputJobInfo}} can be 
> spared.





[jira] [Commented] (HIVE-17756) Introduce subquery test case for Hive on Spark

2017-10-10 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198554#comment-16198554
 ] 

Hive QA commented on HIVE-17756:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12891226/HIVE-17756.001.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 11201 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[spark_local_queries] 
(batchId=64)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan]
 (batchId=163)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=240)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7210/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7210/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7210/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12891226 - PreCommit-HIVE-Build

> Introduce subquery test case for Hive on Spark
> --
>
> Key: HIVE-17756
> URL: https://issues.apache.org/jira/browse/HIVE-17756
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Attachments: HIVE-17756.001.patch
>
>






[jira] [Commented] (HIVE-16395) ConcurrentModificationException on config object in HoS

2017-10-10 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198558#comment-16198558
 ] 

Rui Li commented on HIVE-16395:
---

Hi [~asherman], sorry for the late response, just returned from a long holiday. 
Cloning the job conf sounds good to me.
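The failure mode and the proposed clone fix can be reproduced in miniature with a plain java.util.Hashtable standing in for Hadoop's Configuration (whose properties are backed by one, as the Hashtable$Enumerator in the stack trace shows): iterating the live table while it is mutated fails fast, while iterating a per-task copy does not. A self-contained sketch, not Hive's actual code:

```java
import java.util.ConcurrentModificationException;
import java.util.Hashtable;
import java.util.Iterator;
import java.util.Map;

public class ConfCloneSketch {
    static Hashtable<String, String> makeConf() {
        Hashtable<String, String> conf = new Hashtable<>();
        conf.put("fs.s3a.endpoint", "s3.amazonaws.com");
        conf.put("fs.defaultFS", "hdfs://nn:8020");
        return conf;
    }

    // Iterate a view of the conf while the shared table keeps being mutated
    // (standing in for another executor thread). Iterating the live table
    // fails fast; iterating a per-task copy ("cloning the job conf") does not.
    static boolean safeToIterate(Hashtable<String, String> shared, boolean cloneFirst) {
        Map<String, String> view = cloneFirst ? new Hashtable<>(shared) : shared;
        try {
            Iterator<String> it = view.keySet().iterator();
            while (it.hasNext()) {
                it.next();
                shared.put("prop." + shared.size(), "v");  // concurrent-style modification
            }
            return true;   // iteration completed
        } catch (ConcurrentModificationException e) {
            return false;  // what the Spark executors hit in the stack trace
        }
    }

    public static void main(String[] args) {
        System.out.println(safeToIterate(makeConf(), false)); // prints false: fail-fast iterator
        System.out.println(safeToIterate(makeConf(), true));  // prints true: snapshot is isolated
    }
}
```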

> ConcurrentModificationException on config object in HoS
> ---
>
> Key: HIVE-16395
> URL: https://issues.apache.org/jira/browse/HIVE-16395
> Project: Hive
>  Issue Type: Task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>
> Looks like this is happening inside Spark executors; it looks to be some race 
> condition when modifying {{Configuration}} objects.
> Stack-Trace:
> {code}
> java.io.IOException: java.lang.reflect.InvocationTargetException
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:267)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.<init>(HadoopShimsSecure.java:213)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:334)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:682)
>   at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:240)
>   at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:211)
>   at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>   at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>   at org.apache.spark.scheduler.Task.run(Task.scala:89)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>   at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:253)
>   ... 21 more
> Caused by: java.util.ConcurrentModificationException
>   at java.util.Hashtable$Enumerator.next(Hashtable.java:1167)
>   at org.apache.hadoop.conf.Configuration.iterator(Configuration.java:2455)
>   at org.apache.hadoop.fs.s3a.S3AUtils.propagateBucketOptions(S3AUtils.java:716)
>   at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:181)
>   at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2815)
>   at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:98)
>   at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2852)
>   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2834)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:387)
>   at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
>   at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:108)
>   at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
>   at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:68)
>   ... 26 more
> {code}
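The innermost cause above is Java's fail-fast iteration: {{Hashtable$Enumerator.next}} throws once the table is structurally modified after the iterator was created, which is what happens when {{S3AUtils.propagateBucketOptions}} iterates a Hadoop {{Configuration}} (backed by {{Properties}}, a {{Hashtable}} subclass) that another thread mutates. A minimal single-threaded sketch, with a plain {{Hashtable}} standing in for the {{Configuration}}:

```java
import java.util.ConcurrentModificationException;
import java.util.Hashtable;
import java.util.Iterator;
import java.util.Map;

public class CmeDemo {
    public static void main(String[] args) {
        // Stand-in for a Hadoop Configuration, whose properties live in a
        // java.util.Properties (a Hashtable subclass).
        Hashtable<String, String> conf = new Hashtable<>();
        conf.put("fs.s3a.endpoint", "s3.amazonaws.com");
        conf.put("fs.s3a.connection.maximum", "15");

        Iterator<Map.Entry<String, String>> it = conf.entrySet().iterator();
        it.next(); // iteration in progress

        // Structural modification after the iterator was created -- in the
        // real failure this happens from a concurrently running thread.
        conf.put("fs.s3a.paging.maximum", "5000");

        try {
            it.next(); // the fail-fast modCount check trips here
        } catch (ConcurrentModificationException e) {
            System.out.println("ConcurrentModificationException");
        }
    }
}
```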



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17759) Raise class security around HiveConf.ConfVars.default fields to prevent misuses

2017-10-10 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198615#comment-16198615
 ] 

Zoltan Haindrich commented on HIVE-17759:
-

I would recommend evaluating the following options:

* rethink the whole HiveConf to use something annotation-based, like:
{code}
@HiveConfVariable(name = "hive.main.asd",
    description = "long story",
    altNames = {"hadoop.sql.main.sql", "hadoop.hive.sql"})
Integer asd = 5; // this is the default value
// the type is the field type...can go wrong..
{code}
although these fields are not used solely to collect the information; I 
think it would feel natural...

* introduce some templating if possible (should HiveConf be tied to an enum?)
* or add some getters which check the current "type" of the conf variable
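A rough sketch of the annotation-based option; {{@HiveConfVariable}} and the reflective lookup below are hypothetical illustrations, not existing Hive API:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

public class ConfSketch {
    // Hypothetical annotation carrying the proposed metadata.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface HiveConfVariable {
        String name();
        String description() default "";
        String[] altNames() default {};
    }

    static class Defaults {
        @HiveConfVariable(name = "hive.main.asd",
                description = "long story",
                altNames = {"hadoop.sql.main.sql", "hadoop.hive.sql"})
        Integer asd = 5; // the default value; its type is the field type
    }

    // Resolve a default by its primary or alternate name via reflection.
    static Object lookup(Object holder, String key) throws IllegalAccessException {
        for (Field f : holder.getClass().getDeclaredFields()) {
            HiveConfVariable v = f.getAnnotation(HiveConfVariable.class);
            if (v == null) continue;
            if (v.name().equals(key)) return f.get(holder);
            for (String alt : v.altNames()) {
                if (alt.equals(key)) return f.get(holder);
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        Defaults d = new Defaults();
        System.out.println(lookup(d, "hive.main.asd"));   // primary name
        System.out.println(lookup(d, "hadoop.hive.sql")); // alt name
    }
}
```

A lookup like this could also be the hook for type checks or validators, since the annotation sits next to the typed default field.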


> Raise class security around HiveConf.ConfVars.default fields to prevent 
> misuses
> ---
>
> Key: HIVE-17759
> URL: https://issues.apache.org/jira/browse/HIVE-17759
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>
> issues like HIVE-17758 can be prevented if these fields wouldn't be directly 
> accessible





[jira] [Updated] (HIVE-17759) Prevent the misuses of HiveConf.ConfVars.default fields

2017-10-10 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-17759:

Summary: Prevent the misuses of HiveConf.ConfVars.default fields  (was: 
Raise class security around HiveConf.ConfVars.default fields to prevent misuses)

> Prevent the misuses of HiveConf.ConfVars.default fields
> ---
>
> Key: HIVE-17759
> URL: https://issues.apache.org/jira/browse/HIVE-17759
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>
> issues like HIVE-17758 can be prevented if these fields wouldn't be directly 
> accessible





[jira] [Comment Edited] (HIVE-17759) Prevent the misuses of HiveConf.ConfVars.default fields

2017-10-10 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198615#comment-16198615
 ] 

Zoltan Haindrich edited comment on HIVE-17759 at 10/10/17 1:01 PM:
---

I would recommend evaluating the following options:

* rethink the whole HiveConf to use something annotation-based, like:
{code}
@HiveConfVariable(name = "hive.main.asd",
    description = "long story",
    altNames = {"hadoop.sql.main.sql", "hadoop.hive.sql"})
Integer asd = 5; // this is the default value
// the type is the field type...can go wrong..
{code}
although these fields are not used solely to collect the information; I 
think it would feel natural...
this also might open up the possibility to add some validators to the conf 
values, because currently we can't easily specify that something can't be 
negative...
* introduce some templating if possible (should HiveConf be tied to an enum?)
* or add some getters which check the current "type" of the conf variable



was (Author: kgyrtkirk):
I would recommend evaluating the following options:

* rethink the whole HiveConf to use something annotation-based, like:
{code}
@HiveConfVariable(name = "hive.main.asd",
    description = "long story",
    altNames = {"hadoop.sql.main.sql", "hadoop.hive.sql"})
Integer asd = 5; // this is the default value
// the type is the field type...can go wrong..
{code}
although these fields are not used solely to collect the information; I 
think it would feel natural...

* introduce some templating if possible (should HiveConf be tied to an enum?)
* or add some getters which check the current "type" of the conf variable


> Prevent the misuses of HiveConf.ConfVars.default fields
> ---
>
> Key: HIVE-17759
> URL: https://issues.apache.org/jira/browse/HIVE-17759
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>
> issues like HIVE-17758 can be prevented if these fields wouldn't be directly 
> accessible





[jira] [Comment Edited] (HIVE-17759) Prevent the misuses of HiveConf.ConfVars.default fields

2017-10-10 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198615#comment-16198615
 ] 

Zoltan Haindrich edited comment on HIVE-17759 at 10/10/17 1:01 PM:
---

I would recommend evaluating the following options:

* rethink the whole HiveConf to use something annotation-based, like:
{code}
@HiveConfVariable(name = "hive.main.asd",
    description = "long story",
    altNames = {"hadoop.sql.main.sql", "hadoop.hive.sql"})
Integer asd = 5; // this is the default value
// the type is the field type...can go wrong..
{code}
although these fields are not used solely to collect the information; I 
think it would feel natural...
this might also open up the possibility to add some validators to the conf 
values, because currently we can't easily specify that something can't be 
negative...
* introduce some templating if possible (should HiveConf be tied to an enum?)
* or add some getters which check the current "type" of the conf variable



was (Author: kgyrtkirk):
I would recommend evaluating the following options:

* rethink the whole HiveConf to use something annotation-based, like:
{code}
@HiveConfVariable(name = "hive.main.asd",
    description = "long story",
    altNames = {"hadoop.sql.main.sql", "hadoop.hive.sql"})
Integer asd = 5; // this is the default value
// the type is the field type...can go wrong..
{code}
although these fields are not used solely to collect the information; I 
think it would feel natural...
this also might open up the possibility to add some validators to the conf 
values, because currently we can't easily specify that something can't be 
negative...
* introduce some templating if possible (should HiveConf be tied to an enum?)
* or add some getters which check the current "type" of the conf variable


> Prevent the misuses of HiveConf.ConfVars.default fields
> ---
>
> Key: HIVE-17759
> URL: https://issues.apache.org/jira/browse/HIVE-17759
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>
> issues like HIVE-17758 can be prevented if these fields wouldn't be directly 
> accessible





[jira] [Assigned] (HIVE-17758) NOTIFICATION_SEQUENCE_LOCK_RETRY_SLEEP_INTERVAL.defaultLongVal is -1

2017-10-10 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich reassigned HIVE-17758:
---

Assignee: Zoltan Haindrich

> NOTIFICATION_SEQUENCE_LOCK_RETRY_SLEEP_INTERVAL.defaultLongVal is -1
> 
>
> Key: HIVE-17758
> URL: https://issues.apache.org/jira/browse/HIVE-17758
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-17758.01.patch
>
>
> HIVE-16886 introduced retry logic which has a configurable retry interval. 
> Unfortunately {{HiveConf}} has some public fields which at first glance seem 
> to be useful to pass as arguments to other methods - but in this case the 
> default value is not even loaded into the field read by the code, and 
> because of that the innocent client code 
> [here|https://github.com/apache/hive/blob/a974a9e6c4659f511e0b5edb97ce340a023a2e26/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L8554]
>  has used a {{-1}} value incorrectly, which eventually caused an exception 
> [here|https://github.com/apache/hive/blob/a974a9e6c4659f511e0b5edb97ce340a023a2e26/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L8581]:
> {code}
> 2017-10-10 11:22:37,638 ERROR [load-dynamic-partitions-12]: metastore.ObjectStore (ObjectStore.java:addNotificationEvent(7444)) - could not get lock for update
> java.lang.IllegalArgumentException: timeout value is negative
> at java.lang.Thread.sleep(Native Method)
> at org.apache.hadoop.hive.metastore.ObjectStore$RetryingExecutor.run(ObjectStore.java:7407)
> at org.apache.hadoop.hive.metastore.ObjectStore.lockForUpdate(ObjectStore.java:7361)
> at org.apache.hadoop.hive.metastore.ObjectStore.addNotificationEvent(ObjectStore.java:7424)
> at sun.reflect.GeneratedMethodAccessor71.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> [...]
> {code}
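The {{IllegalArgumentException}} comes straight from {{Thread.sleep}}, which rejects negative timeouts, so a retry loop that sleeps for an unvalidated interval fails exactly as in the trace above. A minimal reproduction:

```java
public class NegativeSleepDemo {
    public static void main(String[] args) {
        try {
            // Mirrors the RetryingExecutor sleeping for the (wrongly -1)
            // configured retry interval.
            Thread.sleep(-1);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```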





[jira] [Updated] (HIVE-17758) NOTIFICATION_SEQUENCE_LOCK_RETRY_SLEEP_INTERVAL.defaultLongVal is -1

2017-10-10 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-17758:

Attachment: HIVE-17758.01.patch

#1) add missing {{L}}

> NOTIFICATION_SEQUENCE_LOCK_RETRY_SLEEP_INTERVAL.defaultLongVal is -1
> 
>
> Key: HIVE-17758
> URL: https://issues.apache.org/jira/browse/HIVE-17758
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
> Attachments: HIVE-17758.01.patch
>
>
> HIVE-16886 introduced retry logic which has a configurable retry interval. 
> Unfortunately {{HiveConf}} has some public fields which at first glance seem 
> to be useful to pass as arguments to other methods - but in this case the 
> default value is not even loaded into the field read by the code, and 
> because of that the innocent client code 
> [here|https://github.com/apache/hive/blob/a974a9e6c4659f511e0b5edb97ce340a023a2e26/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L8554]
>  has used a {{-1}} value incorrectly, which eventually caused an exception 
> [here|https://github.com/apache/hive/blob/a974a9e6c4659f511e0b5edb97ce340a023a2e26/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L8581]:
> {code}
> 2017-10-10 11:22:37,638 ERROR [load-dynamic-partitions-12]: metastore.ObjectStore (ObjectStore.java:addNotificationEvent(7444)) - could not get lock for update
> java.lang.IllegalArgumentException: timeout value is negative
> at java.lang.Thread.sleep(Native Method)
> at org.apache.hadoop.hive.metastore.ObjectStore$RetryingExecutor.run(ObjectStore.java:7407)
> at org.apache.hadoop.hive.metastore.ObjectStore.lockForUpdate(ObjectStore.java:7361)
> at org.apache.hadoop.hive.metastore.ObjectStore.addNotificationEvent(ObjectStore.java:7424)
> at sun.reflect.GeneratedMethodAccessor71.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> [...]
> {code}





[jira] [Updated] (HIVE-16677) CTAS with no data fails in Druid

2017-10-10 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-16677:
---
Attachment: HIVE-16677.01.patch

> CTAS with no data fails in Druid
> 
>
> Key: HIVE-16677
> URL: https://issues.apache.org/jira/browse/HIVE-16677
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-16677.01.patch, HIVE-16677.patch
>
>
> If we create a table in Druid using a CTAS statement and the query executed 
> to create the table produces no data, we fail with the following exception:
> {noformat}
> druid.DruidStorageHandler: Exception while commit
> java.io.FileNotFoundException: File 
> /tmp/workingDirectory/.staging-jcamachorodriguez_20170515053123_835c394b-2157-4f6b-bfed-a2753acd568e/segmentsDescriptorDir
>  does not exist.
> ...
> {noformat}





[jira] [Updated] (HIVE-16677) CTAS with no data fails in Druid

2017-10-10 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-16677:
---
Attachment: (was: HIVE-16677.01.patch)

> CTAS with no data fails in Druid
> 
>
> Key: HIVE-16677
> URL: https://issues.apache.org/jira/browse/HIVE-16677
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-16677.01.patch, HIVE-16677.patch
>
>
> If we create a table in Druid using a CTAS statement and the query executed 
> to create the table produces no data, we fail with the following exception:
> {noformat}
> druid.DruidStorageHandler: Exception while commit
> java.io.FileNotFoundException: File 
> /tmp/workingDirectory/.staging-jcamachorodriguez_20170515053123_835c394b-2157-4f6b-bfed-a2753acd568e/segmentsDescriptorDir
>  does not exist.
> ...
> {noformat}





[jira] [Updated] (HIVE-16677) CTAS with no data fails in Druid

2017-10-10 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-16677:
---
Attachment: HIVE-16677.01.patch

> CTAS with no data fails in Druid
> 
>
> Key: HIVE-16677
> URL: https://issues.apache.org/jira/browse/HIVE-16677
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-16677.01.patch, HIVE-16677.patch
>
>
> If we create a table in Druid using a CTAS statement and the query executed 
> to create the table produces no data, we fail with the following exception:
> {noformat}
> druid.DruidStorageHandler: Exception while commit
> java.io.FileNotFoundException: File 
> /tmp/workingDirectory/.staging-jcamachorodriguez_20170515053123_835c394b-2157-4f6b-bfed-a2753acd568e/segmentsDescriptorDir
>  does not exist.
> ...
> {noformat}





[jira] [Updated] (HIVE-16677) CTAS with no data fails in Druid

2017-10-10 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-16677:
---
Attachment: HIVE-16677.02.patch

> CTAS with no data fails in Druid
> 
>
> Key: HIVE-16677
> URL: https://issues.apache.org/jira/browse/HIVE-16677
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-16677.01.patch, HIVE-16677.02.patch, 
> HIVE-16677.patch
>
>
> If we create a table in Druid using a CTAS statement and the query executed 
> to create the table produces no data, we fail with the following exception:
> {noformat}
> druid.DruidStorageHandler: Exception while commit
> java.io.FileNotFoundException: File 
> /tmp/workingDirectory/.staging-jcamachorodriguez_20170515053123_835c394b-2157-4f6b-bfed-a2753acd568e/segmentsDescriptorDir
>  does not exist.
> ...
> {noformat}





[jira] [Commented] (HIVE-17617) Rollup of an empty resultset should contain the grouping of the empty grouping set

2017-10-10 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198864#comment-16198864
 ] 

Zoltan Haindrich commented on HIVE-17617:
-

about how it worked earlier:

* in the case of a simple {{select count(1) from x}} there is the implicit {{()}} 
grouping, in which case only 1 reducer is spawned... I don't think it would 
make sense to spawn more than one.
** the summary row was served by the Reducer, based on the fact that there were 
no input rows when it was closed and there were no grouping keys.
* in the case of grouping sets: earlier, when there was at least one input row 
which made it through the Mapper, the output emitted 1 row for each grouping set
 ** if the () set was present, there was a grouping which collected those 
- and it just worked 

however, in the case of grouping sets it is possible that multiple reducers can 
effectively split up the work... even in a simple case when there is one 
grouping field.

I'm afraid setting {{numReducers=1}} would possibly add some performance 
penalties; I will peek into the code and try to set it only if the empty 
grouping set is present.
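As a toy model of the expected semantics (not Hive code): ROLLUP(b) is equivalent to GROUPING SETS ((b), ()), and the () set must contribute exactly one summary row even over empty input. A sketch, with the caveat that real SQL would report NULL rather than 0 for sum over zero rows:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class RollupEmptyDemo {
    // Toy model: ROLLUP(b) == GROUPING SETS ((b), ()).
    // Rows are int[]{b, c}; we compute sum(c) per grouping set.
    static List<String> rollup(List<int[]> rows) {
        List<String> out = new ArrayList<>();
        // Grouping set (b): one row per distinct b -- zero rows on empty input.
        Map<Integer, Integer> perB = new TreeMap<>();
        for (int[] r : rows) {
            perB.merge(r[0], r[1], Integer::sum);
        }
        perB.forEach((b, sum) -> out.add("b=" + b + " sum=" + sum));
        // Grouping set (): always exactly one summary row, even with no input.
        // (Real SQL would report NULL, not 0, for sum over zero rows.)
        int total = rows.stream().mapToInt(r -> r[1]).sum();
        out.add("total=" + total);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(rollup(Collections.emptyList()));
        System.out.println(rollup(Arrays.asList(
                new int[]{1, 10}, new int[]{1, 5}, new int[]{2, 7})));
    }
}
```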


> Rollup of an empty resultset should contain the grouping of the empty 
> grouping set
> --
>
> Key: HIVE-17617
> URL: https://issues.apache.org/jira/browse/HIVE-17617
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-17617.01.patch, HIVE-17617.03.patch, 
> HIVE-17617.04.patch
>
>
> running
> {code}
> drop table if exists tx1;
> create table tx1 (a integer,b integer,c integer);
> select  sum(c),
> grouping(b)
> fromtx1
> group by rollup (b);
> {code}
> returns 0 rows; however 
> according to the standard:
> The  is regarded as the shortest such initial sublist. 
> For example, “ROLLUP ( (A, B), (C, D) )”
> is equivalent to “GROUPING SETS ( (A, B, C, D), (A, B), () )”.
> so I think the totals row (the grouping for {{()}}) should be present - psql 
> returns it.





[jira] [Updated] (HIVE-16827) Merge stats task and column stats task into a single task

2017-10-10 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-16827:

Attachment: HIVE-16827.05wip01.patch

> Merge stats task and column stats task into a single task
> -
>
> Key: HIVE-16827
> URL: https://issues.apache.org/jira/browse/HIVE-16827
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Zoltan Haindrich
> Attachments: HIVE-16827.01.patch, HIVE-16827.02.patch, 
> HIVE-16827.03.patch, HIVE-16827.04wip01.patch, HIVE-16827.04wip02.patch, 
> HIVE-16827.04wip03.patch, HIVE-16827.04wip04.patch, HIVE-16827.04wip05.patch, 
> HIVE-16827.04wip06.patch, HIVE-16827.04wip07.patch, HIVE-16827.04wip08.patch, 
> HIVE-16827.04wip09.patch, HIVE-16827.04wip10.patch, HIVE-16827.05wip01.patch, 
> HIVE-16827.4.patch
>
>
> Within the task, we can specify whether to compute basic stats only or column 
> stats only or both.





[jira] [Commented] (HIVE-16677) CTAS with no data fails in Druid

2017-10-10 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1619#comment-1619
 ] 

Hive QA commented on HIVE-16677:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12891275/HIVE-16677.01.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 11199 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[spark_local_queries] 
(batchId=64)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan]
 (batchId=162)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=239)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query23] 
(batchId=239)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7211/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7211/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7211/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12891275 - PreCommit-HIVE-Build

> CTAS with no data fails in Druid
> 
>
> Key: HIVE-16677
> URL: https://issues.apache.org/jira/browse/HIVE-16677
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-16677.01.patch, HIVE-16677.02.patch, 
> HIVE-16677.patch
>
>
> If we create a table in Druid using a CTAS statement and the query executed 
> to create the table produces no data, we fail with the following exception:
> {noformat}
> druid.DruidStorageHandler: Exception while commit
> java.io.FileNotFoundException: File 
> /tmp/workingDirectory/.staging-jcamachorodriguez_20170515053123_835c394b-2157-4f6b-bfed-a2753acd568e/segmentsDescriptorDir
>  does not exist.
> ...
> {noformat}





[jira] [Updated] (HIVE-17757) REPL LOAD should use customised configurations to execute distcp/remote copy.

2017-10-10 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17757:

Description: 
As the REPL LOAD command needs to read the repl dump directory and data files 
from the source cluster, it needs to use some configurations to read data 
securely through distcp.
Some of the HDFS configurations cannot be added to the whitelist as they pose a 
security threat. So, it is necessary for the REPL LOAD command to take such 
configs as input and use them when triggering distcp.
*Proposed syntax:*
REPL LOAD [<db_name>.<table_name>] FROM <dir_name> [WITH ('key1'='value1', 
'key2'='value2')];

  was:
As the REPL LOAD command needs to read the repl dump directory and data files 
from the source cluster, it needs to use some configurations to read data 
securely through distcp.
Some of the HDFS configurations cannot be added to the whitelist as they pose a 
security threat. So, it is necessary for the REPL LOAD command to take such 
configs as input and use them when triggering distcp.
*Proposed syntax:*
REPL LOAD [<db_name>.<table_name>] FROM <dir_name> [WITH (key1=value1, 
key2=value2)];


> REPL LOAD should use customised configurations to execute distcp/remote copy.
> -
>
> Key: HIVE-17757
> URL: https://issues.apache.org/jira/browse/HIVE-17757
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
>
> As the REPL LOAD command needs to read the repl dump directory and data files 
> from the source cluster, it needs to use some configurations to read data 
> securely through distcp.
> Some of the HDFS configurations cannot be added to the whitelist as they pose 
> a security threat. So, it is necessary for the REPL LOAD command to take such 
> configs as input and use them when triggering distcp.
> *Proposed syntax:*
> REPL LOAD [<db_name>.<table_name>] FROM <dir_name> [WITH ('key1'='value1', 
> 'key2'='value2')];





[jira] [Work stopped] (HIVE-17757) REPL LOAD should use customised configurations to execute distcp/remote copy.

2017-10-10 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-17757 stopped by Sankar Hariappan.
---
> REPL LOAD should use customised configurations to execute distcp/remote copy.
> -
>
> Key: HIVE-17757
> URL: https://issues.apache.org/jira/browse/HIVE-17757
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
>
> As the REPL LOAD command needs to read the repl dump directory and data files 
> from the source cluster, it needs to use some configurations to read data 
> securely through distcp.
> Some of the HDFS configurations cannot be added to the whitelist as they pose 
> a security threat. So, it is necessary for the REPL LOAD command to take such 
> configs as input and use them when triggering distcp.
> *Proposed syntax:*
> REPL LOAD [<db_name>.<table_name>] FROM <dir_name> [WITH (key1=value1, 
> key2=value2)];





[jira] [Updated] (HIVE-17757) REPL LOAD should use customised configurations to execute distcp/remote copy.

2017-10-10 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17757:

Attachment: HIVE-17757.01.patch

Added 01.patch with support for the WITH clause in REPL LOAD to get the 
configurations to be used for ReplCopyTask and for copying the repl dump dir.

Request [~thejas]/[~anishek] to please review the same.

> REPL LOAD should use customised configurations to execute distcp/remote copy.
> -
>
> Key: HIVE-17757
> URL: https://issues.apache.org/jira/browse/HIVE-17757
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17757.01.patch
>
>
> As the REPL LOAD command needs to read the repl dump directory and data files 
> from the source cluster, it needs to use some configurations to read data 
> securely through distcp.
> Some of the HDFS configurations cannot be added to the whitelist as they pose 
> a security threat. So, it is necessary for the REPL LOAD command to take such 
> configs as input and use them when triggering distcp.
> *Proposed syntax:*
> REPL LOAD [<db_name>.<table_name>] FROM <dir_name> [WITH ('key1'='value1', 
> 'key2'='value2')];





[jira] [Updated] (HIVE-17757) REPL LOAD should use customised configurations to execute distcp/remote copy.

2017-10-10 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-17757:
--
Labels: DR pull-request-available replication  (was: DR replication)

> REPL LOAD should use customised configurations to execute distcp/remote copy.
> -
>
> Key: HIVE-17757
> URL: https://issues.apache.org/jira/browse/HIVE-17757
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, pull-request-available, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17757.01.patch
>
>
> As the REPL LOAD command needs to read the repl dump directory and data files 
> from the source cluster, it needs to use some configurations to read data 
> securely through distcp.
> Some of the HDFS configurations cannot be added to the whitelist as they pose 
> a security threat. So, it is necessary for the REPL LOAD command to take such 
> configs as input and use them when triggering distcp.
> *Proposed syntax:*
> REPL LOAD [<db_name>.<table_name>] FROM <dir_name> [WITH ('key1'='value1', 
> 'key2'='value2')];





[jira] [Commented] (HIVE-17757) REPL LOAD should use customised configurations to execute distcp/remote copy.

2017-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198899#comment-16198899
 ] 

ASF GitHub Bot commented on HIVE-17757:
---

GitHub user sankarh opened a pull request:

https://github.com/apache/hive/pull/260

HIVE-17757: REPL LOAD should use customised configurations to execute 
distcp/remote copy.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sankarh/hive HIVE-17757

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/260.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #260






> REPL LOAD should use customised configurations to execute distcp/remote copy.
> -
>
> Key: HIVE-17757
> URL: https://issues.apache.org/jira/browse/HIVE-17757
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, pull-request-available, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17757.01.patch
>
>
> As the REPL LOAD command needs to read the repl dump directory and data files 
> from the source cluster, it needs to use some configurations to read data 
> securely through distcp.
> Some of the HDFS configurations cannot be added to the whitelist as they pose 
> a security threat. So, it is necessary for the REPL LOAD command to take such 
> configs as input and use them when triggering distcp.
> *Proposed syntax:*
> REPL LOAD [<db_name>.<table_name>] FROM <dir_name> [WITH ('key1'='value1', 
> 'key2'='value2')];





[jira] [Updated] (HIVE-17757) REPL LOAD need to use customised configurations to execute distcp/remote copy.

2017-10-10 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17757:

Summary: REPL LOAD need to use customised configurations to execute 
distcp/remote copy.  (was: REPL LOAD should use customised configurations to 
execute distcp/remote copy.)

> REPL LOAD need to use customised configurations to execute distcp/remote copy.
> --
>
> Key: HIVE-17757
> URL: https://issues.apache.org/jira/browse/HIVE-17757
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, pull-request-available, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17757.01.patch
>
>
> As the REPL LOAD command needs to read the repl dump directory and data files 
> from the source cluster, it needs to use some configurations to read data 
> securely through distcp.
> Some of the HDFS configurations cannot be added to the whitelist as they pose 
> a security threat. So, it is necessary for the REPL LOAD command to take such 
> configs as input and use them when triggering distcp.
> *Proposed syntax:*
> REPL LOAD [<db_name>.<table_name>] FROM <dir_name> [WITH ('key1'='value1', 
> 'key2'='value2')];





[jira] [Updated] (HIVE-17757) REPL LOAD need to use customised configurations to execute distcp/remote copy.

2017-10-10 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17757:

Status: Patch Available  (was: Open)

> REPL LOAD need to use customised configurations to execute distcp/remote copy.
> --
>
> Key: HIVE-17757
> URL: https://issues.apache.org/jira/browse/HIVE-17757
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, pull-request-available, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17757.01.patch
>
>
> As the REPL LOAD command needs to read the repl dump directory and data files 
> from the source cluster, it needs to use certain configurations to read data 
> securely through distcp.
> Some of the HDFS configurations cannot be added to the whitelist as they pose 
> a security threat. So, the REPL LOAD command needs to take such configs as 
> input and use them when triggering distcp.
> *Proposed syntax:*
> REPL LOAD [.] FROM  [WITH ('key1'='value1', 
> 'key2'='value2')];





[jira] [Commented] (HIVE-15016) Run tests with Hadoop 3.0.0-beta1

2017-10-10 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198956#comment-16198956
 ] 

Aihua Xu commented on HIVE-15016:
-

It looks like some older version of jackson-annotation.jar is being used, and it 
seems I need to exclude it from some dependency jars. I will look into that.

> jar -ft hive-druid-handler-3.0.0-SNAPSHOT.jar | grep JsonInclude
org/apache/hive/druid/com/fasterxml/jackson/annotation/JsonInclude$Include.class
org/apache/hive/druid/com/fasterxml/jackson/annotation/JsonInclude.class


> Run tests with Hadoop 3.0.0-beta1
> -
>
> Key: HIVE-15016
> URL: https://issues.apache.org/jira/browse/HIVE-15016
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Sergio Peña
>Assignee: Aihua Xu
> Attachments: HIVE-15016.2.patch, HIVE-15016.3.patch, 
> HIVE-15016.patch, Hadoop3Upstream.patch
>
>
> Hadoop 3.0.0-alpha1 was released back in Sep/16 to allow other components to 
> run tests against this new version before GA.
> We should start running tests with Hive to validate compatibility against 
> Hadoop 3.0.
> NOTE: The patch used to test must not be committed to Hive until Hadoop 3.0 
> GA is released.





[jira] [Comment Edited] (HIVE-17754) InputJobInfo in Pig UDFContext is heavyweight, and causes OOMs in Tez AMs

2017-10-10 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198244#comment-16198244
 ] 

Mithun Radhakrishnan edited comment on HIVE-17754 at 10/10/17 4:59 PM:
---

This fix depends on HIVE-11548. The attached patch contains both the fix for 
HIVE-11548 and the one for HIVE-17754. Submitting for tests...


was (Author: mithun):
This fix depends on HIVE-11548. The attached patch contains both the fix for 
HIVE-11548 and HIVE-17754. Submitting for tests...

> InputJobInfo in Pig UDFContext is heavyweight, and causes OOMs in Tez AMs
> -
>
> Key: HIVE-17754
> URL: https://issues.apache.org/jira/browse/HIVE-17754
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 2.2.0, 3.0.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-17754.1.patch
>
>
> HIVE-9845 dealt with reducing the size of HCat split-info, to improve 
> job-launch times for Pig/HCat jobs.
> For large Pig queries that scan a large number of Hive partitions, it was 
> found that the Pig {{UDFContext}} stored full-fat HCat {{InputJobInfo}} 
> objects, thus blowing out the Pig Tez AM. Since this information is already 
> stored in the {{HCatSplit}}, the serialization of {{InputJobInfo}} can be 
> spared.





[jira] [Updated] (HIVE-17757) REPL LOAD need to use customised configurations to execute distcp/remote copy.

2017-10-10 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17757:

Status: Open  (was: Patch Available)

> REPL LOAD need to use customised configurations to execute distcp/remote copy.
> --
>
> Key: HIVE-17757
> URL: https://issues.apache.org/jira/browse/HIVE-17757
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, pull-request-available, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17757.01.patch
>
>
> As the REPL LOAD command needs to read the repl dump directory and data files 
> from the source cluster, it needs to use certain configurations to read data 
> securely through distcp.
> Some of the HDFS configurations cannot be added to the whitelist as they pose 
> a security threat. So, the REPL LOAD command needs to take such configs as 
> input and use them when triggering distcp.
> *Proposed syntax:*
> REPL LOAD [.] FROM  [WITH ('key1'='value1', 
> 'key2'='value2')];





[jira] [Commented] (HIVE-17402) Provide object location in the HMS notification messages

2017-10-10 Thread Dan Burkert (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198986#comment-16198986
 ] 

Dan Burkert commented on HIVE-17402:


This patch is adding the location parameter to the [hcatalog notification 
message 
classes|https://github.com/apache/hive/tree/master/hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/messaging/json],
 which were deprecated as part of 
[HIVE-15180|https://issues.apache.org/jira/browse/HIVE-15180?focusedCommentId=15682862&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15682862].
  The replacement [metastore notification message 
classes|https://github.com/apache/hive/tree/master/metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json]
 in many cases include a serialized full {{Table}} object, which internally 
includes the location URI, so adding the location parameters should not be 
necessary if Sentry switches over to use the new notification event classes.  
This patch, as I understand it, will not have the intended effect of adding the 
location URI to notification log entries, since the modified classes are no 
longer used in the HMS.

> Provide object location in the HMS notification messages
> 
>
> Key: HIVE-17402
> URL: https://issues.apache.org/jira/browse/HIVE-17402
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 2.2.0
>Reporter: Alexander Kolbasov
>Assignee: Alexander Kolbasov
> Attachments: HIVE-17402.01.patch
>
>
> While working on the Apache Sentry project that uses HMS notifications we 
> noticed that these notifications are missing some useful data - e.g. location 
> information for the objects. To get around this, Apache Sentry implemented 
> its own version of events 
> (https://github.com/apache/sentry/tree/master/sentry-binding/sentry-binding-hive-follower/src/main/java/org/apache/sentry/binding/metastore/messaging/json).
> It seems to be useful information for Hive as well, so why not add it 
> directly into the standard message factory?





[jira] [Updated] (HIVE-17757) REPL LOAD need to use customised configurations to execute distcp/remote copy.

2017-10-10 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17757:

Attachment: (was: HIVE-17757.01.patch)

> REPL LOAD need to use customised configurations to execute distcp/remote copy.
> --
>
> Key: HIVE-17757
> URL: https://issues.apache.org/jira/browse/HIVE-17757
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, pull-request-available, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17757.01.patch
>
>
> As the REPL LOAD command needs to read the repl dump directory and data files 
> from the source cluster, it needs to use certain configurations to read data 
> securely through distcp.
> Some of the HDFS configurations cannot be added to the whitelist as they pose 
> a security threat. So, the REPL LOAD command needs to take such configs as 
> input and use them when triggering distcp.
> *Proposed syntax:*
> REPL LOAD [.] FROM  [WITH ('key1'='value1', 
> 'key2'='value2')];





[jira] [Updated] (HIVE-17757) REPL LOAD need to use customised configurations to execute distcp/remote copy.

2017-10-10 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17757:

Attachment: HIVE-17757.01.patch

Reattached 01.patch with a unit test added.

> REPL LOAD need to use customised configurations to execute distcp/remote copy.
> --
>
> Key: HIVE-17757
> URL: https://issues.apache.org/jira/browse/HIVE-17757
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, pull-request-available, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17757.01.patch
>
>
> As the REPL LOAD command needs to read the repl dump directory and data files 
> from the source cluster, it needs to use certain configurations to read data 
> securely through distcp.
> Some of the HDFS configurations cannot be added to the whitelist as they pose 
> a security threat. So, the REPL LOAD command needs to take such configs as 
> input and use them when triggering distcp.
> *Proposed syntax:*
> REPL LOAD [.] FROM  [WITH ('key1'='value1', 
> 'key2'='value2')];





[jira] [Updated] (HIVE-17757) REPL LOAD need to use customised configurations to execute distcp/remote copy.

2017-10-10 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17757:

Status: Patch Available  (was: Open)

> REPL LOAD need to use customised configurations to execute distcp/remote copy.
> --
>
> Key: HIVE-17757
> URL: https://issues.apache.org/jira/browse/HIVE-17757
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, pull-request-available, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17757.01.patch
>
>
> As the REPL LOAD command needs to read the repl dump directory and data files 
> from the source cluster, it needs to use certain configurations to read data 
> securely through distcp.
> Some of the HDFS configurations cannot be added to the whitelist as they pose 
> a security threat. So, the REPL LOAD command needs to take such configs as 
> input and use them when triggering distcp.
> *Proposed syntax:*
> REPL LOAD [.] FROM  [WITH ('key1'='value1', 
> 'key2'='value2')];





[jira] [Updated] (HIVE-17757) REPL LOAD need to use customised configurations to execute distcp/remote copy.

2017-10-10 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17757:

Status: Open  (was: Patch Available)

> REPL LOAD need to use customised configurations to execute distcp/remote copy.
> --
>
> Key: HIVE-17757
> URL: https://issues.apache.org/jira/browse/HIVE-17757
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, pull-request-available, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17757.01.patch
>
>
> As the REPL LOAD command needs to read the repl dump directory and data files 
> from the source cluster, it needs to use certain configurations to read data 
> securely through distcp.
> Some of the HDFS configurations cannot be added to the whitelist as they pose 
> a security threat. So, the REPL LOAD command needs to take such configs as 
> input and use them when triggering distcp.
> *Proposed syntax:*
> REPL LOAD [.] FROM  [WITH ('key1'='value1', 
> 'key2'='value2')];





[jira] [Commented] (HIVE-16677) CTAS with no data fails in Druid

2017-10-10 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199007#comment-16199007
 ] 

Hive QA commented on HIVE-16677:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12891277/HIVE-16677.02.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 11199 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[spark_local_queries] 
(batchId=64)
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbase_timestamp] 
(batchId=97)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan]
 (batchId=162)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=239)
org.apache.hadoop.hive.common.metrics.metrics2.TestCodahaleReportersConf.testFallbackToDeprecatedConfig
 (batchId=249)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7212/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7212/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7212/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12891277 - PreCommit-HIVE-Build

> CTAS with no data fails in Druid
> 
>
> Key: HIVE-16677
> URL: https://issues.apache.org/jira/browse/HIVE-16677
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-16677.01.patch, HIVE-16677.02.patch, 
> HIVE-16677.patch
>
>
> If we create a table in Druid using a CTAS statement and the query executed 
> to create the table produces no data, we fail with the following exception:
> {noformat}
> druid.DruidStorageHandler: Exception while commit
> java.io.FileNotFoundException: File 
> /tmp/workingDirectory/.staging-jcamachorodriguez_20170515053123_835c394b-2157-4f6b-bfed-a2753acd568e/segmentsDescriptorDir
>  does not exist.
> ...
> {noformat}
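
For context, a minimal way to hit this case would be a Druid-backed CTAS whose SELECT yields zero rows (a sketch; the table, column, and source names below are illustrative, not taken from the patch):

```sql
-- Hypothetical CTAS using the Druid storage handler whose SELECT
-- produces no rows; this is the empty-result case the fix targets.
CREATE TABLE druid_empty
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES ('druid.segment.granularity' = 'DAY')
AS
SELECT CAST(ctime AS timestamp) AS `__time`, cstring
FROM src_table
WHERE 1 = 0;  -- guarantees an empty result set
```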





[jira] [Assigned] (HIVE-17760) Create a unit test which validates HIVE-9423 does not regress

2017-10-10 Thread Andrew Sherman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Sherman reassigned HIVE-17760:
-


> Create a unit test which validates HIVE-9423 does not regress 
> --
>
> Key: HIVE-17760
> URL: https://issues.apache.org/jira/browse/HIVE-17760
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
>
> During [HIVE-9423] we verified that when the Thrift server pool is exhausted, 
> the Beeline connection times out and provides a meaningful error message.
> Create a unit test which verifies this and helps to keep this feature working.





[jira] [Assigned] (HIVE-17761) Deprecate hive.druid.select.distribute property for Druid

2017-10-10 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez reassigned HIVE-17761:
--


> Deprecate hive.druid.select.distribute property for Druid
> -
>
> Key: HIVE-17761
> URL: https://issues.apache.org/jira/browse/HIVE-17761
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>
> Execution of SELECT queries is distributed among the different 
> historical/realtime nodes containing the data when the property is true. This 
> is the default mode and one that has been extensively tested.
> Previously, SELECT queries were split and sent in parallel to the broker 
> nodes, but this mode is no longer recommended and is deprecated. Thus, that 
> code can be removed.





[jira] [Assigned] (HIVE-17762) Exclude older jackson-annotation.jar from druid-handler shaded jar

2017-10-10 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu reassigned HIVE-17762:
---


> Exclude older jackson-annotation.jar from druid-handler shaded jar
> --
>
> Key: HIVE-17762
> URL: https://issues.apache.org/jira/browse/HIVE-17762
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>
> hive-druid-handler.jar shades the Jackson core dependencies as of HIVE-17468, 
> but older versions are brought in through transitive dependencies.





[jira] [Updated] (HIVE-17762) Exclude older jackson-annotation.jar from druid-handler shaded jar

2017-10-10 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-17762:

Attachment: HIVE-17762.1.patch

> Exclude older jackson-annotation.jar from druid-handler shaded jar
> --
>
> Key: HIVE-17762
> URL: https://issues.apache.org/jira/browse/HIVE-17762
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-17762.1.patch
>
>
> hive-druid-handler.jar shades the Jackson core dependencies as of HIVE-17468, 
> but older versions are brought in through transitive dependencies.





[jira] [Updated] (HIVE-17757) REPL LOAD need to use customised configurations to execute distcp/remote copy.

2017-10-10 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17757:

Status: Open  (was: Patch Available)

> REPL LOAD need to use customised configurations to execute distcp/remote copy.
> --
>
> Key: HIVE-17757
> URL: https://issues.apache.org/jira/browse/HIVE-17757
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, pull-request-available, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17757.01.patch
>
>
> As the REPL LOAD command needs to read the repl dump directory and data files 
> from the source cluster, it needs to use certain configurations to read data 
> securely through distcp.
> Some of the HDFS configurations cannot be added to the whitelist as they pose 
> a security threat. So, the REPL LOAD command needs to take such configs as 
> input and use them when triggering distcp.
> *Proposed syntax:*
> REPL LOAD [.] FROM  [WITH ('key1'='value1', 
> 'key2'='value2')];





[jira] [Commented] (HIVE-17762) Exclude older jackson-annotation.jar from druid-handler shaded jar

2017-10-10 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199030#comment-16199030
 ] 

Aihua Xu commented on HIVE-17762:
-

patch-1: druid-processing brings in version 2.4.6. We need to exclude these from 
pom.xml as well.


{noformat}
 +- io.druid:druid-processing:jar:0.10.1:compile
[INFO] |  +- io.druid:druid-common:jar:0.10.1:compile
[INFO] |  |  +- io.druid:druid-api:jar:0.10.1:compile
[INFO] |  |  |  \- io.airlift:airline:jar:0.7:compile
[INFO] |  |  +- org.apache.commons:commons-dbcp2:jar:2.0.1:compile
[INFO] |  |  |  \- org.apache.commons:commons-pool2:jar:2.2:compile
[INFO] |  |  +- org.hibernate:hibernate-validator:jar:5.1.3.Final:compile
[INFO] |  |  |  +- org.jboss.logging:jboss-logging:jar:3.1.3.GA:compile
[INFO] |  |  |  \- com.fasterxml:classmate:jar:1.0.0:compile
[INFO] |  |  +- javax.el:javax.el-api:jar:3.0.0:compile
[INFO] |  |  +- javax.validation:validation-api:jar:1.1.0.Final:compile
[INFO] |  |  +- com.fasterxml.jackson.core:jackson-core:jar:2.4.6:compile
[INFO] |  |  +- com.fasterxml.jackson.core:jackson-annotations:jar:2.4.6:compile
{noformat}
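
The exclusion could be sketched in the druid-handler pom.xml roughly as follows (a sketch only; the coordinates are inferred from the dependency tree above and may not match the exact patch contents):

```xml
<!-- Hypothetical sketch: excluding the older Jackson artifacts pulled in
     transitively by druid-processing. Coordinates are assumptions based on
     the dependency tree above, not the actual HIVE-17762 patch. -->
<dependency>
  <groupId>io.druid</groupId>
  <artifactId>druid-processing</artifactId>
  <version>${druid.version}</version>
  <exclusions>
    <exclusion>
      <groupId>com.fasterxml.jackson.core</groupId>
      <artifactId>jackson-core</artifactId>
    </exclusion>
    <exclusion>
      <groupId>com.fasterxml.jackson.core</groupId>
      <artifactId>jackson-annotations</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```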

> Exclude older jackson-annotation.jar from druid-handler shaded jar
> --
>
> Key: HIVE-17762
> URL: https://issues.apache.org/jira/browse/HIVE-17762
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-17762.1.patch
>
>
> hive-druid-handler.jar shades the Jackson core dependencies as of HIVE-17468, 
> but older versions are brought in through transitive dependencies.





[jira] [Updated] (HIVE-17757) REPL LOAD need to use customised configurations to execute distcp/remote copy.

2017-10-10 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17757:

Attachment: HIVE-17757.02.patch

Added 02.patch addressing review comments from [~thejas].

> REPL LOAD need to use customised configurations to execute distcp/remote copy.
> --
>
> Key: HIVE-17757
> URL: https://issues.apache.org/jira/browse/HIVE-17757
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, pull-request-available, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17757.01.patch, HIVE-17757.02.patch
>
>
> As the REPL LOAD command needs to read the repl dump directory and data files 
> from the source cluster, it needs to use certain configurations to read data 
> securely through distcp.
> Some of the HDFS configurations cannot be added to the whitelist as they pose 
> a security threat. So, the REPL LOAD command needs to take such configs as 
> input and use them when triggering distcp.
> *Proposed syntax:*
> REPL LOAD [.] FROM  [WITH ('key1'='value1', 
> 'key2'='value2')];





[jira] [Updated] (HIVE-17757) REPL LOAD need to use customised configurations to execute distcp/remote copy.

2017-10-10 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17757:

Status: Patch Available  (was: Open)

> REPL LOAD need to use customised configurations to execute distcp/remote copy.
> --
>
> Key: HIVE-17757
> URL: https://issues.apache.org/jira/browse/HIVE-17757
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, pull-request-available, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17757.01.patch, HIVE-17757.02.patch
>
>
> As the REPL LOAD command needs to read the repl dump directory and data files 
> from the source cluster, it needs to use certain configurations to read data 
> securely through distcp.
> Some of the HDFS configurations cannot be added to the whitelist as they pose 
> a security threat. So, the REPL LOAD command needs to take such configs as 
> input and use them when triggering distcp.
> *Proposed syntax:*
> REPL LOAD [.] FROM  [WITH ('key1'='value1', 
> 'key2'='value2')];





[jira] [Updated] (HIVE-17762) Exclude older jackson-annotation.jar from druid-handler shaded jar

2017-10-10 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-17762:

Status: Patch Available  (was: Open)

> Exclude older jackson-annotation.jar from druid-handler shaded jar
> --
>
> Key: HIVE-17762
> URL: https://issues.apache.org/jira/browse/HIVE-17762
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-17762.1.patch
>
>
> hive-druid-handler.jar shades the Jackson core dependencies as of HIVE-17468, 
> but older versions are brought in through transitive dependencies.





[jira] [Commented] (HIVE-15016) Run tests with Hadoop 3.0.0-beta1

2017-10-10 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199039#comment-16199039
 ] 

Aihua Xu commented on HIVE-15016:
-

I just filed HIVE-17762 to fix this dependency issue, since the current jira may 
take a while to get fully fixed.

[~ashutoshc] Can you take a look at HIVE-17762? In the meantime I will check the 
other failures.

> Run tests with Hadoop 3.0.0-beta1
> -
>
> Key: HIVE-15016
> URL: https://issues.apache.org/jira/browse/HIVE-15016
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Sergio Peña
>Assignee: Aihua Xu
> Attachments: HIVE-15016.2.patch, HIVE-15016.3.patch, 
> HIVE-15016.patch, Hadoop3Upstream.patch
>
>
> Hadoop 3.0.0-alpha1 was released back in Sep/16 to allow other components to 
> run tests against this new version before GA.
> We should start running tests with Hive to validate compatibility against 
> Hadoop 3.0.
> NOTE: The patch used to test must not be committed to Hive until Hadoop 3.0 
> GA is released.





[jira] [Commented] (HIVE-17757) REPL LOAD need to use customised configurations to execute distcp/remote copy.

2017-10-10 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199043#comment-16199043
 ] 

Thejas M Nair commented on HIVE-17757:
--

+1


> REPL LOAD need to use customised configurations to execute distcp/remote copy.
> --
>
> Key: HIVE-17757
> URL: https://issues.apache.org/jira/browse/HIVE-17757
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, pull-request-available, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17757.01.patch, HIVE-17757.02.patch
>
>
> As the REPL LOAD command needs to read the repl dump directory and data files 
> from the source cluster, it needs to use certain configurations to read data 
> securely through distcp.
> Some of the HDFS configurations cannot be added to the whitelist as they pose 
> a security threat. So, the REPL LOAD command needs to take such configs as 
> input and use them when triggering distcp.
> *Proposed syntax:*
> REPL LOAD [.] FROM  [WITH ('key1'='value1', 
> 'key2'='value2')];





[jira] [Updated] (HIVE-17761) Deprecate hive.druid.select.distribute property for Druid

2017-10-10 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-17761:
---
Status: Patch Available  (was: In Progress)

> Deprecate hive.druid.select.distribute property for Druid
> -
>
> Key: HIVE-17761
> URL: https://issues.apache.org/jira/browse/HIVE-17761
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>
> Execution of SELECT queries is distributed among the different 
> historical/realtime nodes containing the data when the property is true. This 
> is the default mode and one that has been extensively tested.
> Previously, SELECT queries were split and sent in parallel to the broker 
> nodes, but this mode is no longer recommended and is deprecated. Thus, that 
> code can be removed.





[jira] [Work started] (HIVE-17761) Deprecate hive.druid.select.distribute property for Druid

2017-10-10 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-17761 started by Jesus Camacho Rodriguez.
--
> Deprecate hive.druid.select.distribute property for Druid
> -
>
> Key: HIVE-17761
> URL: https://issues.apache.org/jira/browse/HIVE-17761
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>
> Execution of SELECT queries is distributed among the different 
> historical/realtime nodes containing the data when the property is true. This 
> is the default mode and one that has been extensively tested.
> Previously, SELECT queries were split and sent in parallel to the broker 
> nodes, but this mode is no longer recommended and is deprecated. Thus, that 
> code can be removed.





[jira] [Updated] (HIVE-17761) Deprecate hive.druid.select.distribute property for Druid

2017-10-10 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-17761:
---
Attachment: HIVE-17761.patch

> Deprecate hive.druid.select.distribute property for Druid
> -
>
> Key: HIVE-17761
> URL: https://issues.apache.org/jira/browse/HIVE-17761
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-17761.patch
>
>
> When the property is true, execution of SELECT queries is distributed among the
> different historical/realtime nodes containing the data. This is the default
> mode, and the one that has been extensively tested.
> Previously, SELECT queries were split and sent in parallel to the broker
> nodes, but that mode is no longer recommended and has been deprecated. Thus,
> that code can be removed.





[jira] [Updated] (HIVE-17733) Move RawStore to standalone metastore

2017-10-10 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-17733:
--
Status: Open  (was: Patch Available)

> Move RawStore to standalone metastore
> -
>
> Key: HIVE-17733
> URL: https://issues.apache.org/jira/browse/HIVE-17733
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
>  Labels: pull-request-available
> Attachments: HIVE-17733.2.patch, HIVE-17733.patch
>
>
> This includes moving implementations of RawStore (like ObjectStore), 
> MetastoreDirectSql, and stats related classes like ColumnStatsAggregator and 
> the NDV classes.





[jira] [Updated] (HIVE-17733) Move RawStore to standalone metastore

2017-10-10 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-17733:
--
Attachment: HIVE-17733.3.patch

Patch 3 includes updates made necessary by the check-in of HIVE-17629.

> Move RawStore to standalone metastore
> -
>
> Key: HIVE-17733
> URL: https://issues.apache.org/jira/browse/HIVE-17733
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
>  Labels: pull-request-available
> Attachments: HIVE-17733.2.patch, HIVE-17733.3.patch, HIVE-17733.patch
>
>
> This includes moving implementations of RawStore (like ObjectStore), 
> MetastoreDirectSql, and stats related classes like ColumnStatsAggregator and 
> the NDV classes.





[jira] [Updated] (HIVE-17733) Move RawStore to standalone metastore

2017-10-10 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-17733:
--
Status: Patch Available  (was: Open)

> Move RawStore to standalone metastore
> -
>
> Key: HIVE-17733
> URL: https://issues.apache.org/jira/browse/HIVE-17733
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
>  Labels: pull-request-available
> Attachments: HIVE-17733.2.patch, HIVE-17733.3.patch, HIVE-17733.patch
>
>
> This includes moving implementations of RawStore (like ObjectStore), 
> MetastoreDirectSql, and stats related classes like ColumnStatsAggregator and 
> the NDV classes.





[jira] [Commented] (HIVE-17561) Move TxnStore and implementations to standalone metastore

2017-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199063#comment-16199063
 ] 

ASF GitHub Bot commented on HIVE-17561:
---

Github user alanfgates closed the pull request at:

https://github.com/apache/hive/pull/253


> Move TxnStore and implementations to standalone metastore
> -
>
> Key: HIVE-17561
> URL: https://issues.apache.org/jira/browse/HIVE-17561
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore, Transactions
>Reporter: Alan Gates
>Assignee: Alan Gates
>  Labels: pull-request-available
> Attachments: HIVE-17561.4.patch, HIVE-17561.5.patch, 
> HIVE-17561.6.patch, HIVE-17561.patch
>
>
> We need to move the metastore handling of transactions into the standalone 
> metastore.





[jira] [Commented] (HIVE-17761) Deprecate hive.druid.select.distribute property for Druid

2017-10-10 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199073#comment-16199073
 ] 

Ashutosh Chauhan commented on HIVE-17761:
-

Instead of deprecating it, we should just remove this config.

> Deprecate hive.druid.select.distribute property for Druid
> -
>
> Key: HIVE-17761
> URL: https://issues.apache.org/jira/browse/HIVE-17761
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-17761.patch
>
>
> When the property is true, execution of SELECT queries is distributed among the
> different historical/realtime nodes containing the data. This is the default
> mode, and the one that has been extensively tested.
> Previously, SELECT queries were split and sent in parallel to the broker
> nodes, but that mode is no longer recommended and has been deprecated. Thus,
> that code can be removed.





[jira] [Commented] (HIVE-17762) Exclude older jackson-annotation.jar from druid-handler shaded jar

2017-10-10 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199078#comment-16199078
 ] 

Ashutosh Chauhan commented on HIVE-17762:
-

Which version of jackson will be used and from where?
cc: [~bslim]

> Exclude older jackson-annotation.jar from druid-handler shaded jar
> --
>
> Key: HIVE-17762
> URL: https://issues.apache.org/jira/browse/HIVE-17762
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-17762.1.patch
>
>
> hive-druid-handler.jar shades the Jackson core dependencies as of HIVE-17468,
> but older versions are brought in by transitive dependencies.





[jira] [Commented] (HIVE-17747) HMS DropTableMessage should include the full table object

2017-10-10 Thread Dan Burkert (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199117#comment-16199117
 ] 

Dan Burkert commented on HIVE-17747:


CC [~akolb], I think you may be interested in this.  The motivation is similar 
to HIVE-17402.

> HMS DropTableMessage should include the full table object
> -
>
> Key: HIVE-17747
> URL: https://issues.apache.org/jira/browse/HIVE-17747
> Project: Hive
>  Issue Type: Improvement
>  Components: HCatalog, Metastore
>Affects Versions: 2.3.0
>Reporter: Dan Burkert
>Assignee: Dan Burkert
> Attachments: HIVE-17747.0.patch
>
>
> I have a notification log follower use-case which requires accessing the 
> parameters of dropped tables, so it would be useful if the {{DROP_TABLE}} 
> events in the notification log included the full table object, as the create 
> and alter events do.





[jira] [Updated] (HIVE-11548) HCatLoader should support predicate pushdown.

2017-10-10 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-11548:

Attachment: HIVE-11548.7.patch

> HCatLoader should support predicate pushdown.
> -
>
> Key: HIVE-11548
> URL: https://issues.apache.org/jira/browse/HIVE-11548
> Project: Hive
>  Issue Type: New Feature
>  Components: HCatalog
>Affects Versions: 3.0.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-11548.1.patch, HIVE-11548.2.patch, 
> HIVE-11548.3.patch, HIVE-11548.4.patch, HIVE-11548.5.patch, 
> HIVE-11548.6-branch-2.2.patch, HIVE-11548.6-branch-2.patch, 
> HIVE-11548.6.patch, HIVE-11548.7.patch
>
>
> When one uses {{HCatInputFormat}}/{{HCatLoader}} to read from file-formats 
> that support predicate pushdown (such as ORC, with 
> {{hive.optimize.index.filter=true}}), one sees that the predicates aren't 
> actually pushed down into the storage layer.
> The forthcoming patch should allow for filter-pushdown, if any of the 
> partitions being scanned with {{HCatLoader}} support the functionality. The 
> patch should technically allow the same for users of {{HCatInputFormat}}, but 
> I don't currently have a neat interface to build a compound 
> predicate-expression. Will add this separately, if required.





[jira] [Updated] (HIVE-11548) HCatLoader should support predicate pushdown.

2017-10-10 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-11548:

Attachment: HIVE-11548.7-branch-2.patch

> HCatLoader should support predicate pushdown.
> -
>
> Key: HIVE-11548
> URL: https://issues.apache.org/jira/browse/HIVE-11548
> Project: Hive
>  Issue Type: New Feature
>  Components: HCatalog
>Affects Versions: 3.0.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-11548.1.patch, HIVE-11548.2.patch, 
> HIVE-11548.3.patch, HIVE-11548.4.patch, HIVE-11548.5.patch, 
> HIVE-11548.6-branch-2.2.patch, HIVE-11548.6-branch-2.patch, 
> HIVE-11548.6.patch, HIVE-11548.7-branch-2.patch, HIVE-11548.7.patch
>
>
> When one uses {{HCatInputFormat}}/{{HCatLoader}} to read from file-formats 
> that support predicate pushdown (such as ORC, with 
> {{hive.optimize.index.filter=true}}), one sees that the predicates aren't 
> actually pushed down into the storage layer.
> The forthcoming patch should allow for filter-pushdown, if any of the 
> partitions being scanned with {{HCatLoader}} support the functionality. The 
> patch should technically allow the same for users of {{HCatInputFormat}}, but 
> I don't currently have a neat interface to build a compound 
> predicate-expression. Will add this separately, if required.





[jira] [Commented] (HIVE-17762) Exclude older jackson-annotation.jar from druid-handler shaded jar

2017-10-10 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199147#comment-16199147
 ] 

slim bouguerra commented on HIVE-17762:
---

Druid should be using Jackson 2.4.6; jars should not come from transitive
dependencies, though.
[~aihuaxu], can you please add which version you are excluding, and the diffs of
the dependency tree before and after? Thanks.


> Exclude older jackson-annotation.jar from druid-handler shaded jar
> --
>
> Key: HIVE-17762
> URL: https://issues.apache.org/jira/browse/HIVE-17762
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-17762.1.patch
>
>
> hive-druid-handler.jar shades the Jackson core dependencies as of HIVE-17468,
> but older versions are brought in by transitive dependencies.





[jira] [Updated] (HIVE-11548) HCatLoader should support predicate pushdown.

2017-10-10 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-11548:

Attachment: HIVE-11548.7-branch-2.2.patch

> HCatLoader should support predicate pushdown.
> -
>
> Key: HIVE-11548
> URL: https://issues.apache.org/jira/browse/HIVE-11548
> Project: Hive
>  Issue Type: New Feature
>  Components: HCatalog
>Affects Versions: 3.0.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-11548.1.patch, HIVE-11548.2.patch, 
> HIVE-11548.3.patch, HIVE-11548.4.patch, HIVE-11548.5.patch, 
> HIVE-11548.6-branch-2.2.patch, HIVE-11548.6-branch-2.patch, 
> HIVE-11548.6.patch, HIVE-11548.7-branch-2.2.patch, 
> HIVE-11548.7-branch-2.patch, HIVE-11548.7.patch
>
>
> When one uses {{HCatInputFormat}}/{{HCatLoader}} to read from file-formats 
> that support predicate pushdown (such as ORC, with 
> {{hive.optimize.index.filter=true}}), one sees that the predicates aren't 
> actually pushed down into the storage layer.
> The forthcoming patch should allow for filter-pushdown, if any of the 
> partitions being scanned with {{HCatLoader}} support the functionality. The 
> patch should technically allow the same for users of {{HCatInputFormat}}, but 
> I don't currently have a neat interface to build a compound 
> predicate-expression. Will add this separately, if required.





[jira] [Commented] (HIVE-16827) Merge stats task and column stats task into a single task

2017-10-10 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199170#comment-16199170
 ] 

Hive QA commented on HIVE-16827:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12891290/HIVE-16827.05wip01.patch

{color:green}SUCCESS:{color} +1 due to 15 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 30 failed/errored test(s), 11199 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_single_sourced_multi_insert]
 (batchId=231)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[colstats_all_nulls] 
(batchId=239)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_dynamic_partitions]
 (batchId=242)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_table]
 (batchId=242)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions]
 (batchId=242)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_table]
 (batchId=242)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_10] 
(batchId=71)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_1] 
(batchId=21)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_2] 
(batchId=81)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_5a] 
(batchId=51)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[basicstat_partval] 
(batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[columnstats_partlvl] 
(batchId=34)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[columnstats_partlvl_dp] 
(batchId=49)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[deleteAnalyze] 
(batchId=30)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[materialized_view_rewrite_ssb]
 (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[materialized_view_rewrite_ssb_2]
 (batchId=52)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[spark_local_queries] 
(batchId=64)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[temp_table_display_colstats_tbllvl]
 (batchId=75)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_const] 
(batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_if_expr_2] 
(batchId=58)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_mapjoin2] 
(batchId=23)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_values]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[autoColumnStats_10]
 (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[autoColumnStats_1]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[deleteAnalyze]
 (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan]
 (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_nway_join]
 (batchId=163)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[avro_decimal_native]
 (batchId=114)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=239)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=202)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7213/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7213/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7213/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 30 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12891290 - PreCommit-HIVE-Build

> Merge stats task and column stats task into a single task
> -
>
> Key: HIVE-16827
> URL: https://issues.apache.org/jira/browse/HIVE-16827
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Zoltan Haindrich
> Attachments: HIVE-16827.01.patch, HIVE-16827.02.patch, 
> HIVE-16827.03.patch, HIVE-16827.04wip01.patch, HIVE-16827.04wip02.patch, 
> HIVE-16827.04wip03.patch, HIVE-16827.04wip04.patch, HIVE-16827.04wip05.patch, 
> HIVE-16827.04wip06.patch, HIVE-16827.04wip07.patch, HIVE-16827.04wip08.patch, 
> HIVE-16827.04wip09.patch, HIVE-16827.04wip10.patch, HIVE-16827.05wip01.patch, 
> HIVE-16827.4.patch
>
>
> Within the task, we can specify whether to compute basic stats only or column 
> stats only or both.




[jira] [Commented] (HIVE-17762) Exclude older jackson-annotation.jar from druid-handler shaded jar

2017-10-10 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199200#comment-16199200
 ] 

Aihua Xu commented on HIVE-17762:
-

[~bslim] I see. Then, instead of including the wrong jackson-annotation.jar, we
may include a newer version of jackson-databind.jar, which will require a newer
version of jackson-annotation.jar.

Currently, {{com.fasterxml.jackson.core:jackson-databind:jar:2.7.8:compile}} is
included.

Should Druid shade jackson-databind.jar as well?

> Exclude older jackson-annotation.jar from druid-handler shaded jar
> --
>
> Key: HIVE-17762
> URL: https://issues.apache.org/jira/browse/HIVE-17762
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-17762.1.patch
>
>
> hive-druid-handler.jar shades the Jackson core dependencies as of HIVE-17468,
> but older versions are brought in by transitive dependencies.





[jira] [Commented] (HIVE-15267) Make query length calculation logic more accurate in TxnUtils.needNewQuery()

2017-10-10 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199205#comment-16199205
 ] 

Eugene Koifman commented on HIVE-15267:
---

[~steveyeom2017], could you upload the patch to ReviewBoard?

> Make query length calculation logic more accurate in TxnUtils.needNewQuery()
> 
>
> Key: HIVE-15267
> URL: https://issues.apache.org/jira/browse/HIVE-15267
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Transactions
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Wei Zheng
>Assignee: Steve Yeom
> Attachments: HIVE-15267.01.patch, HIVE-15267.02.patch
>
>
> HIVE-15181 received the following review comment, which this ticket will address:
> {code}
> in TxnUtils.needNewQuery() "sizeInBytes / 1024 > queryMemoryLimit" doesn't do 
> the right thing.
> If the user sets METASTORE_DIRECT_SQL_MAX_QUERY_LENGTH to 1K, they most 
> likely want each SQL string to be at most 1K.
> But if sizeInBytes=2047, this still returns false.
> It should include length of "suffix" in computation of sizeInBytes
> Along the same lines: the check for max query length is done after each batch 
> is already added to the query. Suppose there are 1000 9-digit txn IDs in each 
> IN(...). That's, conservatively, 18KB of text. So the length of each query is 
> increasing in 18KB chunks. 
> I think the check for query length should be done for each item in IN clause.
> If some DB has a limit on query length of X, then any query > X will fail. So 
> I think this must ensure not to produce any queries > X, even by 1 char.
> For example, case 3.1 of the UT generates a query of almost 4000 characters - 
> this is clearly > 1KB.
> {code}
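The review comment above asks for the length check to happen per IN-clause item, before the item is committed to the query, so that no emitted query exceeds the limit by even one character. A minimal sketch of that approach (hypothetical names; this is NOT Hive's actual TxnUtils/buildQueryWithINClause code):

```java
// Hypothetical sketch (not Hive's actual TxnUtils code) of the suggested fix:
// validate the length limit for every candidate item, so no emitted query can
// exceed the limit, even by one character.
import java.util.ArrayList;
import java.util.List;

public class InClauseBuilder {

    // Render "prefix IN (id1,id2,...)suffix" for one batch of ids.
    public static String render(String prefix, String suffix, List<String> ids) {
        return prefix + " IN (" + String.join(",", ids) + ")" + suffix;
    }

    // Split a long IN list into several queries, each at most maxLen chars.
    public static List<String> buildQueries(String prefix, String suffix,
                                            List<Long> txnIds, int maxLen) {
        List<String> queries = new ArrayList<>();
        List<String> batch = new ArrayList<>();
        for (Long id : txnIds) {
            batch.add(id.toString());
            if (render(prefix, suffix, batch).length() > maxLen) {
                // The new item overflowed: flush the previous batch and start
                // a fresh one with the overflowing item.
                String last = batch.remove(batch.size() - 1);
                if (batch.isEmpty()) {
                    throw new IllegalArgumentException("maxLen too small for a single item");
                }
                queries.add(render(prefix, suffix, batch));
                batch = new ArrayList<>();
                batch.add(last);
                if (render(prefix, suffix, batch).length() > maxLen) {
                    throw new IllegalArgumentException("maxLen too small for a single item");
                }
            }
        }
        if (!batch.isEmpty()) {
            queries.add(render(prefix, suffix, batch));
        }
        return queries;
    }
}
```

Because the overflow check runs against the fully rendered query (prefix, items, closing paren, and suffix) before the batch is flushed, a query that would exceed the limit is never emitted; a limit too small for even a single item fails fast instead of producing an unconstrained query.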





[jira] [Commented] (HIVE-17762) Exclude older jackson-annotation.jar from druid-handler shaded jar

2017-10-10 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199219#comment-16199219
 ] 

Aihua Xu commented on HIVE-17762:
-

It seems the older version of jackson-databind.jar is included in the Druid jar.
We just need to exclude the newer version that comes in as a hadoop-common
transitive dependency.

> Exclude older jackson-annotation.jar from druid-handler shaded jar
> --
>
> Key: HIVE-17762
> URL: https://issues.apache.org/jira/browse/HIVE-17762
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-17762.1.patch
>
>
> hive-druid-handler.jar shades the Jackson core dependencies as of HIVE-17468,
> but older versions are brought in by transitive dependencies.





[jira] [Updated] (HIVE-17762) Exclude older jackson-annotation.jar from druid-handler shaded jar

2017-10-10 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-17762:

Attachment: HIVE-17762.2.patch

> Exclude older jackson-annotation.jar from druid-handler shaded jar
> --
>
> Key: HIVE-17762
> URL: https://issues.apache.org/jira/browse/HIVE-17762
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-17762.1.patch, HIVE-17762.2.patch
>
>
> hive-druid-handler.jar shades the Jackson core dependencies as of HIVE-17468,
> but older versions are brought in by transitive dependencies.





[jira] [Commented] (HIVE-17762) Exclude older jackson-annotation.jar from druid-handler shaded jar

2017-10-10 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199231#comment-16199231
 ] 

Aihua Xu commented on HIVE-17762:
-

Patch 2: exclude jackson-databind.jar from the hadoop-common dependency so that
the one from Druid will be selected.
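An exclusion along the lines described might look like the following Maven fragment. This is a sketch of the general mechanism only; the exact coordinates and placement in the druid-handler pom are assumptions, not the actual contents of the patch:

```xml
<!-- Hypothetical sketch: exclude jackson-databind pulled in transitively by
     hadoop-common, so the version Druid depends on is the one Maven selects.
     The real druid-handler pom may use different coordinates or scope. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-common</artifactId>
  <exclusions>
    <exclusion>
      <groupId>com.fasterxml.jackson.core</groupId>
      <artifactId>jackson-databind</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```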



> Exclude older jackson-annotation.jar from druid-handler shaded jar
> --
>
> Key: HIVE-17762
> URL: https://issues.apache.org/jira/browse/HIVE-17762
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-17762.1.patch, HIVE-17762.2.patch
>
>
> hive-druid-handler.jar is shading jackson core dependencies in hive-17468 but 
> older versions are brought in from the transitive dependencies. 





[jira] [Updated] (HIVE-17762) Exclude older jackson-annotation.jar from druid-handler shaded jar

2017-10-10 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-17762:

Attachment: new_dependency.txt

> Exclude older jackson-annotation.jar from druid-handler shaded jar
> --
>
> Key: HIVE-17762
> URL: https://issues.apache.org/jira/browse/HIVE-17762
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-17762.1.patch, HIVE-17762.2.patch, 
> new_dependency.txt
>
>
> hive-druid-handler.jar shades the Jackson core dependencies as of HIVE-17468,
> but older versions are brought in by transitive dependencies.





[jira] [Updated] (HIVE-17743) Add InterfaceAudience and InterfaceStability annotations for Thrift generated APIs

2017-10-10 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17743:

Attachment: HIVE-17743.2.patch

Attaching a new patch that adds annotations for the Java interfaces generated
from the Thrift files.

> Add InterfaceAudience and InterfaceStability annotations for Thrift generated 
> APIs
> --
>
> Key: HIVE-17743
> URL: https://issues.apache.org/jira/browse/HIVE-17743
> Project: Hive
>  Issue Type: Sub-task
>  Components: Thrift API
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17743.1.patch, HIVE-17743.2.patch
>
>
> The Thrift generated files don't have {{InterfaceAudience}} or 
> {{InterfaceStability}} annotations on them, mainly because all the files are 
> auto-generated.
> We should add some code that auto-tags all the Java Thrift generated files 
> with these annotations. This way even when they are re-generated, they still 
> contain the annotations.
> We should be able to do this using the 
> {{com.google.code.maven-replacer-plugin}} similar to what we do in 
> {{standalone-metastore/pom.xml}}.
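The tagging described in the quoted issue could be wired up roughly as follows. This is an assumed, simplified sketch of {{com.google.code.maven-replacer-plugin}} usage, not the actual patch: a real configuration would need more careful token/value pairs (and matching imports for the annotations) than a blanket class-declaration replacement:

```xml
<!-- Hypothetical sketch: re-apply annotations to generated Thrift sources on
     every build. Paths, token, and value are illustrative assumptions. -->
<plugin>
  <groupId>com.google.code.maven-replacer-plugin</groupId>
  <artifactId>replacer</artifactId>
  <executions>
    <execution>
      <phase>process-sources</phase>
      <goals>
        <goal>replace</goal>
      </goals>
      <configuration>
        <basedir>${basedir}/src/gen/thrift/gen-javabean</basedir>
        <includes>
          <include>**/*.java</include>
        </includes>
        <token>public class</token>
        <value>@InterfaceAudience.Public @InterfaceStability.Stable public class</value>
      </configuration>
    </execution>
  </executions>
</plugin>
```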





[jira] [Commented] (HIVE-15016) Run tests with Hadoop 3.0.0-beta1

2017-10-10 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199254#comment-16199254
 ] 

Aihua Xu commented on HIVE-15016:
-

We also need to upgrade the HBase version to 2.0, since the current HBase 1.1.1
dependency will not work with Hadoop 3.

> Run tests with Hadoop 3.0.0-beta1
> -
>
> Key: HIVE-15016
> URL: https://issues.apache.org/jira/browse/HIVE-15016
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Sergio Peña
>Assignee: Aihua Xu
> Attachments: HIVE-15016.2.patch, HIVE-15016.3.patch, 
> HIVE-15016.patch, Hadoop3Upstream.patch
>
>
> Hadoop 3.0.0-alpha1 was released back in Sep/16 to allow other components to
> run tests against this new version before GA.
> We should start running tests with Hive to validate compatibility against 
> Hadoop 3.0.
> NOTE: The patch used to test must not be committed to Hive until Hadoop 3.0 
> GA is released.





[jira] [Updated] (HIVE-17760) Create a unit test which validates HIVE-9423 does not regress

2017-10-10 Thread Andrew Sherman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Sherman updated HIVE-17760:
--
Attachment: HIVE-17760.1.patch

> Create a unit test which validates HIVE-9423 does not regress 
> --
>
> Key: HIVE-17760
> URL: https://issues.apache.org/jira/browse/HIVE-17760
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
> Attachments: HIVE-17760.1.patch
>
>
> During [HIVE-9423] we verified that when the Thrift server pool is exhausted,
> the Beeline connection times out and provides a meaningful error message.
> Create a unit test which verifies this and helps to keep this feature working.





[jira] [Updated] (HIVE-17760) Create a unit test which validates HIVE-9423 does not regress

2017-10-10 Thread Andrew Sherman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Sherman updated HIVE-17760:
--
Status: Patch Available  (was: Open)

> Create a unit test which validates HIVE-9423 does not regress 
> --
>
> Key: HIVE-17760
> URL: https://issues.apache.org/jira/browse/HIVE-17760
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
> Attachments: HIVE-17760.1.patch
>
>
> During [HIVE-9423] we verified that when the Thrift server pool is exhausted,
> the Beeline connection times out and provides a meaningful error message.
> Create a unit test which verifies this and helps to keep this feature working.





[jira] [Commented] (HIVE-15267) Make query length calculation logic more accurate in TxnUtils.needNewQuery()

2017-10-10 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199267#comment-16199267
 ] 

Eugene Koifman commented on HIVE-15267:
---

One of the issues (not introduced by this patch) is that you cannot split a
NOT IN query into multiple queries.
For example, if the input IN list is (5,6) and buildQueryWithINClause() produces
2 queries,
"delete from T where a not in(5)"
"delete from T where a not in(6)"
the net effect will be to delete all rows, including those with a = 6 and those
with a = 5.

Could these be named in a more meaningful way (as opposed, or in addition, to
comments)?
{noformat}
int i = 0,  // cursor for the "inList" array.
j = 0,  // cursor for an element list per an 'IN'/'NOT IN'-clause
k = 0;  // cursor for in-clause lists per a query
{noformat}

{noformat}
if (newInclausePrefixJustAppended) {
  buf.delete(buf.length()-newInclausePrefix.length(), buf.length());
}
{noformat}
is problematic if _maxQueryLength_ is set very low for some reason. The worst
case is when the returned query has no IN clause at all, i.e. it would look
like "delete from T", which would delete everything - this should probably
throw.
It may be better to check the size before adding more chars to the query (as is
done for each item using _nextItemNeeded_).
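The NOT IN pitfall above can be demonstrated with a small in-memory simulation (a hypothetical helper, not Hive code): each partial NOT IN query matches, and deletes, exactly the rows the other one was supposed to keep, so running both deletes every row.

```java
// Hypothetical sketch illustrating why a NOT IN list cannot be split into
// independent queries the way an IN list can.
import java.util.ArrayList;
import java.util.List;

public class NotInSplit {

    // Simulate "delete from T where a not in (excluded)" on an in-memory
    // table: returns the table contents AFTER the delete, i.e. only rows
    // whose value appears in the excluded list survive.
    public static List<Integer> deleteNotIn(List<Integer> rows, List<Integer> excluded) {
        List<Integer> kept = new ArrayList<>();
        for (int a : rows) {
            if (excluded.contains(a)) {
                kept.add(a); // NOT IN deletes everything else
            }
        }
        return kept;
    }
}
```

Running `deleteNotIn(rows, [5, 6])` as one query keeps rows 5 and 6; running `not in(5)` and then `not in(6)` as two queries leaves nothing, since the second query deletes the row 5 that the first one kept.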




> Make query length calculation logic more accurate in TxnUtils.needNewQuery()
> 
>
> Key: HIVE-15267
> URL: https://issues.apache.org/jira/browse/HIVE-15267
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Transactions
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Wei Zheng
>Assignee: Steve Yeom
> Attachments: HIVE-15267.01.patch, HIVE-15267.02.patch
>
>
> HIVE-15181 received the following review comment, which this ticket will address:
> {code}
> in TxnUtils.needNewQuery() "sizeInBytes / 1024 > queryMemoryLimit" doesn't do 
> the right thing.
> If the user sets METASTORE_DIRECT_SQL_MAX_QUERY_LENGTH to 1K, they most 
> likely want each SQL string to be at most 1K.
> But if sizeInBytes=2047, this still returns false.
> It should include length of "suffix" in computation of sizeInBytes
> Along the same lines: the check for max query length is done after each batch 
> is already added to the query. Suppose there are 1000 9-digit txn IDs in each 
> IN(...). That's, conservatively, 18KB of text. So the length of each query is 
> increasing in 18KB chunks. 
> I think the check for query length should be done for each item in IN clause.
> If some DB has a limit on query length of X, then any query > X will fail. So 
> I think this must ensure not to produce any queries > X, even by 1 char.
> For example, case 3.1 of the UT generates a query of almost 4000 characters - 
> this is clearly > 1KB.
> {code}
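The "sizeInBytes=2047" case quoted above comes down to integer division flooring the byte count. A small sketch (method names are illustrative, not the actual TxnUtils signatures) contrasting the quoted check with one that compares in bytes:

```java
public class QueryLimitCheck {
    // The form quoted above: integer division floors, so 2047 bytes
    // slips past a 1 KB limit (2047 / 1024 == 1, and 1 > 1 is false).
    static boolean needNewQueryBuggy(long sizeInBytes, long limitKb) {
        return sizeInBytes / 1024 > limitKb;
    }

    // Comparing in bytes avoids the floor-division blind spot.
    static boolean needNewQueryFixed(long sizeInBytes, long limitKb) {
        return sizeInBytes > limitKb * 1024;
    }

    public static void main(String[] args) {
        System.out.println("2047 bytes vs 1 KB limit: buggy="
                + needNewQueryBuggy(2047, 1)
                + " fixed=" + needNewQueryFixed(2047, 1));
        // prints: 2047 bytes vs 1 KB limit: buggy=false fixed=true
    }
}
```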



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17757) REPL LOAD need to use customised configurations to execute distcp/remote copy.

2017-10-10 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199283#comment-16199283
 ] 

Hive QA commented on HIVE-17757:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12891306/HIVE-17757.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 11193 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[spark_local_queries] 
(batchId=64)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan]
 (batchId=162)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning]
 (batchId=170)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=239)
org.apache.hadoop.hive.llap.security.TestLlapSignerImpl.testSigning 
(batchId=291)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7214/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7214/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7214/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12891306 - PreCommit-HIVE-Build

> REPL LOAD need to use customised configurations to execute distcp/remote copy.
> --
>
> Key: HIVE-17757
> URL: https://issues.apache.org/jira/browse/HIVE-17757
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, pull-request-available, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17757.01.patch, HIVE-17757.02.patch
>
>
> As the REPL LOAD command needs to read the repl dump directory and data files 
> from the source cluster, it needs to use some of the configurations to read 
> data securely through distcp.
> Some of the HDFS configurations cannot be added to the whitelist as they pose 
> a security threat. So, it is necessary for the REPL LOAD command to take such 
> configs as input and use them when triggering distcp.
> *Proposed syntax:*
> REPL LOAD [.] FROM  [WITH ('key1'='value1', 
> 'key2'='value2')];





[jira] [Commented] (HIVE-15267) Make query length calculation logic more accurate in TxnUtils.needNewQuery()

2017-10-10 Thread Steve Yeom (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199288#comment-16199288
 ] 

Steve Yeom commented on HIVE-15267:
---

1. The patch is supposed to work for "NOT IN" cases, but I can double-check. 
2. Changing local variable names is not a problem. 
3. I think the minimum value of _maxQueryLength_ is 1, which means 1 KB. But 
   here I can check and cover the case where a single "IN clause" value 
   string is longer than 1 KB. 
   I can also check and throw an exception if the value is, for example, 
   0 or negative. 
   The same applies to "DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE". 
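The up-front validation proposed in point 3 could look like the following. This is a hedged sketch with illustrative names ({{validatedMaxQueryLengthKb}} is hypothetical, not an actual TxnUtils or HiveConf method):

```java
public class LimitValidation {
    // Hypothetical sketch: reject non-positive limits at read time,
    // instead of letting them silently produce degenerate queries.
    static int validatedMaxQueryLengthKb(int configured) {
        if (configured <= 0) {
            throw new IllegalArgumentException(
                "maxQueryLength must be positive (KB), got " + configured);
        }
        return configured;
    }

    public static void main(String[] args) {
        System.out.println(validatedMaxQueryLengthKb(1)); // minimum legal value: 1 KB
        try {
            validatedMaxQueryLengthKb(0);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```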

> Make query length calculation logic more accurate in TxnUtils.needNewQuery()
> 
>
> Key: HIVE-15267
> URL: https://issues.apache.org/jira/browse/HIVE-15267
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Transactions
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Wei Zheng
>Assignee: Steve Yeom
> Attachments: HIVE-15267.01.patch, HIVE-15267.02.patch
>
>
> In HIVE-15181 there is the following review comment, which this ticket will handle:
> {code}
> in TxnUtils.needNewQuery() "sizeInBytes / 1024 > queryMemoryLimit" doesn't do 
> the right thing.
> If the user sets METASTORE_DIRECT_SQL_MAX_QUERY_LENGTH to 1K, they most 
> likely want each SQL string to be at most 1K.
> But if sizeInBytes=2047, this still returns false.
> It should include length of "suffix" in computation of sizeInBytes
> Along the same lines: the check for max query length is done after each batch 
> is already added to the query. Suppose there are 1000 9-digit txn IDs in each 
> IN(...). That's, conservatively, 18KB of text. So the length of each query is 
> increasing in 18KB chunks. 
> I think the check for query length should be done for each item in IN clause.
> If some DB has a limit on query length of X, then any query > X will fail. So 
> I think this must ensure not to produce any queries > X, even by 1 char.
> For example, case 3.1 of the UT generates a query of almost 4000 characters - 
> this is clearly > 1KB.
> {code}





[jira] [Commented] (HIVE-17762) Exclude older jackson-annotation.jar from druid-handler shaded jar

2017-10-10 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199295#comment-16199295
 ] 

Ashutosh Chauhan commented on HIVE-17762:
-

+1

> Exclude older jackson-annotation.jar from druid-handler shaded jar
> --
>
> Key: HIVE-17762
> URL: https://issues.apache.org/jira/browse/HIVE-17762
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-17762.1.patch, HIVE-17762.2.patch, 
> new_dependency.txt
>
>
> hive-druid-handler.jar shades the Jackson core dependencies as of HIVE-17468, 
> but older versions are brought in by transitive dependencies. 





[jira] [Commented] (HIVE-15016) Run tests with Hadoop 3.0.0-beta1

2017-10-10 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199305#comment-16199305
 ] 

Ashutosh Chauhan commented on HIVE-15016:
-

Let's upgrade HBase to 2.0 too.

> Run tests with Hadoop 3.0.0-beta1
> -
>
> Key: HIVE-15016
> URL: https://issues.apache.org/jira/browse/HIVE-15016
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Sergio Peña
>Assignee: Aihua Xu
> Attachments: HIVE-15016.2.patch, HIVE-15016.3.patch, 
> HIVE-15016.patch, Hadoop3Upstream.patch
>
>
> Hadoop 3.0.0-alpha1 was released back in Sep 2016 to allow other components 
> to run tests against this new version before GA.
> We should start running tests with Hive to validate compatibility against 
> Hadoop 3.0.
> NOTE: The patch used to test must not be committed to Hive until Hadoop 3.0 
> GA is released.





[jira] [Updated] (HIVE-17754) InputJobInfo in Pig UDFContext is heavyweight, and causes OOMs in Tez AMs

2017-10-10 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-17754:

Attachment: HIVE-17754.2.patch

Addressing {{Test*HCatLoader}} test failures.

> InputJobInfo in Pig UDFContext is heavyweight, and causes OOMs in Tez AMs
> -
>
> Key: HIVE-17754
> URL: https://issues.apache.org/jira/browse/HIVE-17754
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 2.2.0, 3.0.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-17754.1.patch, HIVE-17754.2.patch
>
>
> HIVE-9845 dealt with reducing the size of HCat split-info, to improve 
> job-launch times for Pig/HCat jobs.
> For large Pig queries that scan a large number of Hive partitions, it was 
> found that the Pig {{UDFContext}} stored full-fat HCat {{InputJobInfo}} 
> objects, thus blowing out the Pig Tez AM. Since this information is already 
> stored in the {{HCatSplit}}, the serialization of {{InputJobInfo}} can be 
> spared.





[jira] [Commented] (HIVE-17733) Move RawStore to standalone metastore

2017-10-10 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199334#comment-16199334
 ] 

Sergey Shelukhin commented on HIVE-17733:
-

Looks good to me, pending tests. The comment about datanucleus arguments makes 
sense to me, IIRC it's sometimes useful to override the ones not defined in 
HiveConf/MetastoreConf.

> Move RawStore to standalone metastore
> -
>
> Key: HIVE-17733
> URL: https://issues.apache.org/jira/browse/HIVE-17733
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
>  Labels: pull-request-available
> Attachments: HIVE-17733.2.patch, HIVE-17733.3.patch, HIVE-17733.patch
>
>
> This includes moving implementations of RawStore (like ObjectStore), 
> MetastoreDirectSql, and stats related classes like ColumnStatsAggregator and 
> the NDV classes.





[jira] [Updated] (HIVE-17758) NOTIFICATION_SEQUENCE_LOCK_RETRY_SLEEP_INTERVAL.defaultLongVal is -1

2017-10-10 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-17758:

Status: Patch Available  (was: In Progress)

> NOTIFICATION_SEQUENCE_LOCK_RETRY_SLEEP_INTERVAL.defaultLongVal is -1
> 
>
> Key: HIVE-17758
> URL: https://issues.apache.org/jira/browse/HIVE-17758
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-17758.01.patch, HIVE-17758.02.patch
>
>
> HIVE-16886 introduced retry logic with a configurable retry interval. 
> Unfortunately, {{HiveConf}} has some public fields which at first glance seem 
> useful to pass as arguments to other methods - but in this case the 
> default value is not even loaded into the field read by the code, and 
> because of that the innocent client code 
> [here|https://github.com/apache/hive/blob/a974a9e6c4659f511e0b5edb97ce340a023a2e26/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L8554]
> used a {{-1}} value, which eventually caused an exception 
> [here|https://github.com/apache/hive/blob/a974a9e6c4659f511e0b5edb97ce340a023a2e26/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L8581]:
> {code}
> 2017-10-10 11:22:37,638 ERROR [load-dynamic-partitions-12]: 
> metastore.ObjectStore (ObjectStore.java:addNotificationEvent(7444)) - could 
> not get lock for update
> java.lang.IllegalArgumentException: timeout value is negative
> at java.lang.Thread.sleep(Native Method)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$RetryingExecutor.run(ObjectStore.java:7407)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.lockForUpdate(ObjectStore.java:7361)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.addNotificationEvent(ObjectStore.java:7424)
> at sun.reflect.GeneratedMethodAccessor71.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> [...]
> {code}
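The failure mode in the stack trace above, and the defensive fix, can be reproduced in a few lines. This is a hedged sketch (the {{validatedInterval}} helper and its fallback value are hypothetical, not the actual ObjectStore code): {{Thread.sleep}} throws {{IllegalArgumentException}} for any negative timeout, so a mis-read config default of {{-1}} must be caught before it reaches the sleep call.

```java
public class RetrySleepCheck {
    // Hypothetical guard: a sleep interval read from configuration must not
    // be negative; fall back to a sane default instead of crashing the retry loop.
    static long validatedInterval(long configuredMillis, long fallbackMillis) {
        return configuredMillis >= 0 ? configuredMillis : fallbackMillis;
    }

    public static void main(String[] args) {
        boolean threw = false;
        try {
            Thread.sleep(-1); // reproduces "timeout value is negative"
        } catch (IllegalArgumentException e) {
            threw = true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        System.out.println("negative sleep threw: " + threw);       // true
        System.out.println("validated: " + validatedInterval(-1, 10)); // 10
    }
}
```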





[jira] [Updated] (HIVE-17758) NOTIFICATION_SEQUENCE_LOCK_RETRY_SLEEP_INTERVAL.defaultLongVal is -1

2017-10-10 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-17758:

Attachment: HIVE-17758.02.patch

#2) also adds a test to ensure the sleep interval is non-negative

> NOTIFICATION_SEQUENCE_LOCK_RETRY_SLEEP_INTERVAL.defaultLongVal is -1
> 
>
> Key: HIVE-17758
> URL: https://issues.apache.org/jira/browse/HIVE-17758
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-17758.01.patch, HIVE-17758.02.patch
>
>
> HIVE-16886 introduced retry logic with a configurable retry interval. 
> Unfortunately, {{HiveConf}} has some public fields which at first glance seem 
> useful to pass as arguments to other methods - but in this case the 
> default value is not even loaded into the field read by the code, and 
> because of that the innocent client code 
> [here|https://github.com/apache/hive/blob/a974a9e6c4659f511e0b5edb97ce340a023a2e26/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L8554]
> used a {{-1}} value, which eventually caused an exception 
> [here|https://github.com/apache/hive/blob/a974a9e6c4659f511e0b5edb97ce340a023a2e26/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L8581]:
> {code}
> 2017-10-10 11:22:37,638 ERROR [load-dynamic-partitions-12]: 
> metastore.ObjectStore (ObjectStore.java:addNotificationEvent(7444)) - could 
> not get lock for update
> java.lang.IllegalArgumentException: timeout value is negative
> at java.lang.Thread.sleep(Native Method)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$RetryingExecutor.run(ObjectStore.java:7407)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.lockForUpdate(ObjectStore.java:7361)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.addNotificationEvent(ObjectStore.java:7424)
> at sun.reflect.GeneratedMethodAccessor71.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> [...]
> {code}





[jira] [Work started] (HIVE-17758) NOTIFICATION_SEQUENCE_LOCK_RETRY_SLEEP_INTERVAL.defaultLongVal is -1

2017-10-10 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-17758 started by Zoltan Haindrich.
---
> NOTIFICATION_SEQUENCE_LOCK_RETRY_SLEEP_INTERVAL.defaultLongVal is -1
> 
>
> Key: HIVE-17758
> URL: https://issues.apache.org/jira/browse/HIVE-17758
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-17758.01.patch, HIVE-17758.02.patch
>
>
> HIVE-16886 introduced retry logic with a configurable retry interval. 
> Unfortunately, {{HiveConf}} has some public fields which at first glance seem 
> useful to pass as arguments to other methods - but in this case the 
> default value is not even loaded into the field read by the code, and 
> because of that the innocent client code 
> [here|https://github.com/apache/hive/blob/a974a9e6c4659f511e0b5edb97ce340a023a2e26/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L8554]
> used a {{-1}} value, which eventually caused an exception 
> [here|https://github.com/apache/hive/blob/a974a9e6c4659f511e0b5edb97ce340a023a2e26/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L8581]:
> {code}
> 2017-10-10 11:22:37,638 ERROR [load-dynamic-partitions-12]: 
> metastore.ObjectStore (ObjectStore.java:addNotificationEvent(7444)) - could 
> not get lock for update
> java.lang.IllegalArgumentException: timeout value is negative
> at java.lang.Thread.sleep(Native Method)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$RetryingExecutor.run(ObjectStore.java:7407)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.lockForUpdate(ObjectStore.java:7361)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.addNotificationEvent(ObjectStore.java:7424)
> at sun.reflect.GeneratedMethodAccessor71.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> [...]
> {code}





[jira] [Commented] (HIVE-17747) HMS DropTableMessage should include the full table object

2017-10-10 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199339#comment-16199339
 ] 

Vihang Karajgaonkar commented on HIVE-17747:


LGTM +1

> HMS DropTableMessage should include the full table object
> -
>
> Key: HIVE-17747
> URL: https://issues.apache.org/jira/browse/HIVE-17747
> Project: Hive
>  Issue Type: Improvement
>  Components: HCatalog, Metastore
>Affects Versions: 2.3.0
>Reporter: Dan Burkert
>Assignee: Dan Burkert
> Attachments: HIVE-17747.0.patch
>
>
> I have a notification log follower use-case which requires accessing the 
> parameters of dropped tables, so it would be useful if the {{DROP_TABLE}} 
> events in the notification log included the full table object, as the create 
> and alter events do.





[jira] [Commented] (HIVE-17758) NOTIFICATION_SEQUENCE_LOCK_RETRY_SLEEP_INTERVAL.defaultLongVal is -1

2017-10-10 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199341#comment-16199341
 ] 

Zoltan Haindrich commented on HIVE-17758:
-

[~anishek] could you please take a look?

> NOTIFICATION_SEQUENCE_LOCK_RETRY_SLEEP_INTERVAL.defaultLongVal is -1
> 
>
> Key: HIVE-17758
> URL: https://issues.apache.org/jira/browse/HIVE-17758
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-17758.01.patch, HIVE-17758.02.patch
>
>
> HIVE-16886 introduced retry logic with a configurable retry interval. 
> Unfortunately, {{HiveConf}} has some public fields which at first glance seem 
> useful to pass as arguments to other methods - but in this case the 
> default value is not even loaded into the field read by the code, and 
> because of that the innocent client code 
> [here|https://github.com/apache/hive/blob/a974a9e6c4659f511e0b5edb97ce340a023a2e26/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L8554]
> used a {{-1}} value, which eventually caused an exception 
> [here|https://github.com/apache/hive/blob/a974a9e6c4659f511e0b5edb97ce340a023a2e26/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L8581]:
> {code}
> 2017-10-10 11:22:37,638 ERROR [load-dynamic-partitions-12]: 
> metastore.ObjectStore (ObjectStore.java:addNotificationEvent(7444)) - could 
> not get lock for update
> java.lang.IllegalArgumentException: timeout value is negative
> at java.lang.Thread.sleep(Native Method)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$RetryingExecutor.run(ObjectStore.java:7407)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.lockForUpdate(ObjectStore.java:7361)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.addNotificationEvent(ObjectStore.java:7424)
> at sun.reflect.GeneratedMethodAccessor71.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> [...]
> {code}





[jira] [Assigned] (HIVE-17763) HCatLoader should fetch delegation tokens for partitions on remote HDFS

2017-10-10 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan reassigned HIVE-17763:
---


> HCatLoader should fetch delegation tokens for partitions on remote HDFS
> ---
>
> Key: HIVE-17763
> URL: https://issues.apache.org/jira/browse/HIVE-17763
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Security
>Affects Versions: 2.2.0, 3.0.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
>
> The Hive metastore might store partition-info for data stored on a remote 
> HDFS (i.e. different from what's defined by {{fs.default.name}}). 
> {{HCatLoader}} should automatically fetch delegation-tokens for all remote 
> HDFSes that participate in an HCat-based query.





[jira] [Updated] (HIVE-16677) CTAS with no data fails in Druid

2017-10-10 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-16677:
---
Attachment: HIVE-16677.03.patch

> CTAS with no data fails in Druid
> 
>
> Key: HIVE-16677
> URL: https://issues.apache.org/jira/browse/HIVE-16677
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-16677.01.patch, HIVE-16677.02.patch, 
> HIVE-16677.03.patch, HIVE-16677.patch
>
>
> If we create a table in Druid using a CTAS statement and the query executed 
> to create the table produces no data, we fail with the following exception:
> {noformat}
> druid.DruidStorageHandler: Exception while commit
> java.io.FileNotFoundException: File 
> /tmp/workingDirectory/.staging-jcamachorodriguez_20170515053123_835c394b-2157-4f6b-bfed-a2753acd568e/segmentsDescriptorDir
>  does not exist.
> ...
> {noformat}





[jira] [Commented] (HIVE-17747) HMS DropTableMessage should include the full table object

2017-10-10 Thread Alexander Kolbasov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199347#comment-16199347
 ] 

Alexander Kolbasov commented on HIVE-17747:
---

Yes, looks like HIVE-17402 isn't needed any more.

> HMS DropTableMessage should include the full table object
> -
>
> Key: HIVE-17747
> URL: https://issues.apache.org/jira/browse/HIVE-17747
> Project: Hive
>  Issue Type: Improvement
>  Components: HCatalog, Metastore
>Affects Versions: 2.3.0
>Reporter: Dan Burkert
>Assignee: Dan Burkert
> Attachments: HIVE-17747.0.patch
>
>
> I have a notification log follower use-case which requires accessing the 
> parameters of dropped tables, so it would be useful if the {{DROP_TABLE}} 
> events in the notification log included the full table object, as the create 
> and alter events do.





[jira] [Updated] (HIVE-16677) CTAS with no data fails in Druid

2017-10-10 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-16677:
---
Attachment: HIVE-16677.04.patch

> CTAS with no data fails in Druid
> 
>
> Key: HIVE-16677
> URL: https://issues.apache.org/jira/browse/HIVE-16677
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-16677.01.patch, HIVE-16677.02.patch, 
> HIVE-16677.04.patch, HIVE-16677.patch
>
>
> If we create a table in Druid using a CTAS statement and the query executed 
> to create the table produces no data, we fail with the following exception:
> {noformat}
> druid.DruidStorageHandler: Exception while commit
> java.io.FileNotFoundException: File 
> /tmp/workingDirectory/.staging-jcamachorodriguez_20170515053123_835c394b-2157-4f6b-bfed-a2753acd568e/segmentsDescriptorDir
>  does not exist.
> ...
> {noformat}





[jira] [Commented] (HIVE-17758) NOTIFICATION_SEQUENCE_LOCK_RETRY_SLEEP_INTERVAL.defaultLongVal is -1

2017-10-10 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199352#comment-16199352
 ] 

Ashutosh Chauhan commented on HIVE-17758:
-

+1

> NOTIFICATION_SEQUENCE_LOCK_RETRY_SLEEP_INTERVAL.defaultLongVal is -1
> 
>
> Key: HIVE-17758
> URL: https://issues.apache.org/jira/browse/HIVE-17758
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-17758.01.patch, HIVE-17758.02.patch
>
>
> HIVE-16886 introduced retry logic with a configurable retry interval. 
> Unfortunately, {{HiveConf}} has some public fields which at first glance seem 
> useful to pass as arguments to other methods - but in this case the 
> default value is not even loaded into the field read by the code, and 
> because of that the innocent client code 
> [here|https://github.com/apache/hive/blob/a974a9e6c4659f511e0b5edb97ce340a023a2e26/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L8554]
> used a {{-1}} value, which eventually caused an exception 
> [here|https://github.com/apache/hive/blob/a974a9e6c4659f511e0b5edb97ce340a023a2e26/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L8581]:
> {code}
> 2017-10-10 11:22:37,638 ERROR [load-dynamic-partitions-12]: 
> metastore.ObjectStore (ObjectStore.java:addNotificationEvent(7444)) - could 
> not get lock for update
> java.lang.IllegalArgumentException: timeout value is negative
> at java.lang.Thread.sleep(Native Method)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$RetryingExecutor.run(ObjectStore.java:7407)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.lockForUpdate(ObjectStore.java:7361)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.addNotificationEvent(ObjectStore.java:7424)
> at sun.reflect.GeneratedMethodAccessor71.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> [...]
> {code}





[jira] [Updated] (HIVE-16677) CTAS with no data fails in Druid

2017-10-10 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-16677:
---
Attachment: (was: HIVE-16677.03.patch)

> CTAS with no data fails in Druid
> 
>
> Key: HIVE-16677
> URL: https://issues.apache.org/jira/browse/HIVE-16677
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-16677.01.patch, HIVE-16677.02.patch, 
> HIVE-16677.04.patch, HIVE-16677.patch
>
>
> If we create a table in Druid using a CTAS statement and the query executed 
> to create the table produces no data, we fail with the following exception:
> {noformat}
> druid.DruidStorageHandler: Exception while commit
> java.io.FileNotFoundException: File 
> /tmp/workingDirectory/.staging-jcamachorodriguez_20170515053123_835c394b-2157-4f6b-bfed-a2753acd568e/segmentsDescriptorDir
>  does not exist.
> ...
> {noformat}





[jira] [Commented] (HIVE-17759) Prevent the misuses of HiveConf.ConfVars.default fields

2017-10-10 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199366#comment-16199366
 ] 

Sergey Shelukhin commented on HIVE-17759:
-

HiveConf already supports validators (see confvars that declare one).
HiveConf.ConfVars is also already an ENUM :)
It also has getIntVar/etc. getters (incl. static ones that don't require a 
HiveConf object).
The pattern of usage in the linked jira is clearly very brittle, so I think we 
should just hide these fields and add a getter with a type check.
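The "hide the fields and add a getter with a type check" idea can be sketched like this. All names here are illustrative (this is not the actual HiveConf code): the default value becomes private, and the only way to read it is a getter that verifies the variable's type first, so a string-typed var can never yield a bogus long default.

```java
public class ConfSketch {
    enum ConfVar {
        RETRY_SLEEP_INTERVAL(150L),   // long-typed default (e.g. millis)
        WAREHOUSE_DIR("/warehouse");  // string-typed default

        private final Object defaultVal; // private: no direct field access

        ConfVar(Object defaultVal) {
            this.defaultVal = defaultVal;
        }

        // Type-checked getter: throws instead of silently returning garbage.
        long defaultLongVal() {
            if (!(defaultVal instanceof Long)) {
                throw new IllegalStateException(name() + " is not a long-typed var");
            }
            return (Long) defaultVal;
        }
    }

    public static void main(String[] args) {
        System.out.println(ConfVar.RETRY_SLEEP_INTERVAL.defaultLongVal()); // 150
        try {
            ConfVar.WAREHOUSE_DIR.defaultLongVal();
        } catch (IllegalStateException e) {
            System.out.println("type check caught: " + e.getMessage());
        }
    }
}
```

A caller that previously read a {{-1}} sentinel from the wrong-typed public field now gets an immediate, descriptive exception instead.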

> Prevent the misuses of HiveConf.ConfVars.default fields
> ---
>
> Key: HIVE-17759
> URL: https://issues.apache.org/jira/browse/HIVE-17759
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>
> Issues like HIVE-17758 could be prevented if these fields weren't directly 
> accessible.




