[jira] [Updated] (HIVE-9005) HiveServer2 error with "Illegal Operation state transition from CLOSED to ERROR"

2014-12-01 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HIVE-9005:

Attachment: HIVE-9005.1.patch

When the server closes an operation (for example because of a session timeout), 
it sets the state to CLOSED and the background operation is canceled. The Hive 
driver then fails and tries to set the state to ERROR, but that transition is 
illegal, so the exception occurs. 
The patch simply ignores the driver error when the current state is already 
CLOSED (or CANCELED).
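
A minimal sketch of the idea (simplified and hypothetical; the actual guard in 
HIVE-9005.1.patch may differ):

{code}
// Simplified model of the fix: when the background driver fails, skip the
// illegal CLOSED -> ERROR (or CANCELED -> ERROR) transition if the server has
// already terminated the operation, e.g. because of a session timeout.
enum OperationState { INITIALIZED, RUNNING, CANCELED, CLOSED, ERROR }

class OperationSketch {
  private volatile OperationState state = OperationState.RUNNING;

  void close() { state = OperationState.CLOSED; }   // server-side close (timeout)

  void onDriverFailure(String message) {
    if (state == OperationState.CLOSED || state == OperationState.CANCELED) {
      // Operation already terminated by the server; ignore the driver error.
      System.out.println("Ignoring driver error, operation already " + state + ": " + message);
      return;
    }
    state = OperationState.ERROR;                    // normal failure path
  }
}
{code}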
 

> HiveServer2 error with "Illegal Operation state transition from CLOSED to 
> ERROR"
> ---
>
> Key: HIVE-9005
> URL: https://issues.apache.org/jira/browse/HIVE-9005
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.1
>Reporter: Binglin Chang
> Attachments: HIVE-9005.1.patch
>
>
> {noformat}
> 2014-12-02 11:25:40,855 WARN  [HiveServer2-Background-Pool: Thread-17]: 
> ql.Driver (DriverContext.java:shutdown(137)) - Shutting down task : 
> Stage-1:MAPRED
> 2014-12-02 11:25:41,898 INFO  [HiveServer2-Background-Pool: Thread-30]: 
> exec.Task (SessionState.java:printInfo(536)) - Hadoop job information for 
> Stage-1: number of mappers: 0; number of reducers: 0
> 2014-12-02 11:25:41,942 WARN  [HiveServer2-Background-Pool: Thread-30]: 
> mapreduce.Counters (AbstractCounters.java:getGroup(234)) - Group 
> org.apache.hadoop.mapred.Task$Counter is deprecated. Use 
> org.apache.hadoop.mapreduce.TaskCounter instead
> 2014-12-02 11:25:41,942 INFO  [HiveServer2-Background-Pool: Thread-30]: 
> exec.Task (SessionState.java:printInfo(536)) - 2014-12-02 11:25:41,939 
> Stage-1 map = 0%,  reduce = 0%
> 2014-12-02 11:25:41,945 WARN  [HiveServer2-Background-Pool: Thread-30]: 
> mapreduce.Counters (AbstractCounters.java:getGroup(234)) - Group 
> org.apache.hadoop.mapred.Task$Counter is deprecated. Use 
> org.apache.hadoop.mapreduce.TaskCounter instead
> 2014-12-02 11:25:41,952 ERROR [HiveServer2-Background-Pool: Thread-30]: 
> exec.Task (SessionState.java:printError(545)) - Ended Job = 
> job_1413717733669_207982 with errors
> 2014-12-02 11:25:41,954 ERROR [Thread-39]: exec.Task 
> (SessionState.java:printError(545)) - Error during job, obtaining debugging 
> information...
> 2014-12-02 11:25:41,957 ERROR [HiveServer2-Background-Pool: Thread-30]: 
> ql.Driver (SessionState.java:printError(545)) - FAILED: Operation cancelled
> 2014-12-02 11:25:41,957 INFO  [HiveServer2-Background-Pool: Thread-30]: 
> ql.Driver (SessionState.java:printInfo(536)) - MapReduce Jobs Launched:
> 2014-12-02 11:25:41,960 WARN  [HiveServer2-Background-Pool: Thread-30]: 
> mapreduce.Counters (AbstractCounters.java:getGroup(234)) - Group 
> FileSystemCounters is deprecated. Use 
> org.apache.hadoop.mapreduce.FileSystemCounter instead
> 2014-12-02 11:25:41,961 INFO  [HiveServer2-Background-Pool: Thread-30]: 
> ql.Driver (SessionState.java:printInfo(536)) - Stage-Stage-1:  HDFS Read: 0 
> HDFS Write: 0 FAIL
> 2014-12-02 11:25:41,961 INFO  [HiveServer2-Background-Pool: Thread-30]: 
> ql.Driver (SessionState.java:printInfo(536)) - Total MapReduce CPU Time 
> Spent: 0 msec
> 2014-12-02 11:25:41,965 ERROR [HiveServer2-Background-Pool: Thread-30]: 
> operation.Operation (SQLOperation.java:run(205)) - Error running hive query:
> org.apache.hive.service.cli.HiveSQLException: Illegal Operation state 
> transition from CLOSED to ERROR
>   at 
> org.apache.hive.service.cli.OperationState.validateTransition(OperationState.java:91)
>   at 
> org.apache.hive.service.cli.OperationState.validateTransition(OperationState.java:97)
>   at 
> org.apache.hive.service.cli.operation.Operation.setState(Operation.java:116)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:161)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.access$000(SQLOperation.java:71)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:202)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1589)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:504)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:215)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> {noformat}

[jira] [Updated] (HIVE-9005) HiveServer2 error with "Illegal Operation state transition from CLOSED to ERROR"

2014-12-01 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HIVE-9005:

Status: Patch Available  (was: Open)

> HiveServer2 error with "Illegal Operation state transition from CLOSED to 
> ERROR"
> ---
>
> Key: HIVE-9005
> URL: https://issues.apache.org/jira/browse/HIVE-9005
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.1
>Reporter: Binglin Chang
>
> {noformat}
> 2014-12-02 11:25:40,855 WARN  [HiveServer2-Background-Pool: Thread-17]: 
> ql.Driver (DriverContext.java:shutdown(137)) - Shutting down task : 
> Stage-1:MAPRED
> 2014-12-02 11:25:41,898 INFO  [HiveServer2-Background-Pool: Thread-30]: 
> exec.Task (SessionState.java:printInfo(536)) - Hadoop job information for 
> Stage-1: number of mappers: 0; number of reducers: 0
> 2014-12-02 11:25:41,942 WARN  [HiveServer2-Background-Pool: Thread-30]: 
> mapreduce.Counters (AbstractCounters.java:getGroup(234)) - Group 
> org.apache.hadoop.mapred.Task$Counter is deprecated. Use 
> org.apache.hadoop.mapreduce.TaskCounter instead
> 2014-12-02 11:25:41,942 INFO  [HiveServer2-Background-Pool: Thread-30]: 
> exec.Task (SessionState.java:printInfo(536)) - 2014-12-02 11:25:41,939 
> Stage-1 map = 0%,  reduce = 0%
> 2014-12-02 11:25:41,945 WARN  [HiveServer2-Background-Pool: Thread-30]: 
> mapreduce.Counters (AbstractCounters.java:getGroup(234)) - Group 
> org.apache.hadoop.mapred.Task$Counter is deprecated. Use 
> org.apache.hadoop.mapreduce.TaskCounter instead
> 2014-12-02 11:25:41,952 ERROR [HiveServer2-Background-Pool: Thread-30]: 
> exec.Task (SessionState.java:printError(545)) - Ended Job = 
> job_1413717733669_207982 with errors
> 2014-12-02 11:25:41,954 ERROR [Thread-39]: exec.Task 
> (SessionState.java:printError(545)) - Error during job, obtaining debugging 
> information...
> 2014-12-02 11:25:41,957 ERROR [HiveServer2-Background-Pool: Thread-30]: 
> ql.Driver (SessionState.java:printError(545)) - FAILED: Operation cancelled
> 2014-12-02 11:25:41,957 INFO  [HiveServer2-Background-Pool: Thread-30]: 
> ql.Driver (SessionState.java:printInfo(536)) - MapReduce Jobs Launched:
> 2014-12-02 11:25:41,960 WARN  [HiveServer2-Background-Pool: Thread-30]: 
> mapreduce.Counters (AbstractCounters.java:getGroup(234)) - Group 
> FileSystemCounters is deprecated. Use 
> org.apache.hadoop.mapreduce.FileSystemCounter instead
> 2014-12-02 11:25:41,961 INFO  [HiveServer2-Background-Pool: Thread-30]: 
> ql.Driver (SessionState.java:printInfo(536)) - Stage-Stage-1:  HDFS Read: 0 
> HDFS Write: 0 FAIL
> 2014-12-02 11:25:41,961 INFO  [HiveServer2-Background-Pool: Thread-30]: 
> ql.Driver (SessionState.java:printInfo(536)) - Total MapReduce CPU Time 
> Spent: 0 msec
> 2014-12-02 11:25:41,965 ERROR [HiveServer2-Background-Pool: Thread-30]: 
> operation.Operation (SQLOperation.java:run(205)) - Error running hive query:
> org.apache.hive.service.cli.HiveSQLException: Illegal Operation state 
> transition from CLOSED to ERROR
>   at 
> org.apache.hive.service.cli.OperationState.validateTransition(OperationState.java:91)
>   at 
> org.apache.hive.service.cli.OperationState.validateTransition(OperationState.java:97)
>   at 
> org.apache.hive.service.cli.operation.Operation.setState(Operation.java:116)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:161)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.access$000(SQLOperation.java:71)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:202)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1589)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:504)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:215)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6421) abs() should preserve precision/scale of decimal input

2014-12-01 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231105#comment-14231105
 ] 

Jason Dere commented on HIVE-6421:
--

Not really, more of a bug fix

> abs() should preserve precision/scale of decimal input
> --
>
> Key: HIVE-6421
> URL: https://issues.apache.org/jira/browse/HIVE-6421
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Jason Dere
>Assignee: Jason Dere
> Fix For: 0.15.0
>
> Attachments: HIVE-6421.1.txt, HIVE-6421.2.patch, HIVE-6421.3.patch
>
>
> {noformat}
> hive> describe dec1;
> OK
> c1                      decimal(10,2)           None 
> hive> explain select c1, abs(c1) from dec1;
>  ...
> Select Operator
>   expressions: c1 (type: decimal(10,2)), abs(c1) (type: 
> decimal(38,18))
> {noformat}
> Given that abs() is a GenericUDF it should be possible for the return type 
> precision/scale to match the input precision/scale.
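
A toy illustration of the point (not the HIVE-6421 patch itself): the return 
type can simply mirror the argument's precision/scale, since abs() never widens 
a decimal value.

{code}
// Toy model: derive abs()'s result type from the input decimal type instead of
// falling back to the generic decimal(38,18).
final class DecimalType {
  final int precision, scale;
  DecimalType(int precision, int scale) { this.precision = precision; this.scale = scale; }
  @Override public String toString() { return "decimal(" + precision + "," + scale + ")"; }
}

final class AbsTypeInference {
  // abs() never increases precision or scale, so the input type can be reused.
  static DecimalType returnTypeFor(DecimalType input) { return input; }

  public static void main(String[] args) {
    System.out.println(returnTypeFor(new DecimalType(10, 2)));  // decimal(10,2)
  }
}
{code}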



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-860) Persistent distributed cache

2014-12-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231102#comment-14231102
 ] 

Hive QA commented on HIVE-860:
--



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12684514/HIVE-860.4.patch

{color:red}ERROR:{color} -1 due to 63 failed/errored test(s), 6636 tests 
executed
*Failed tests:*
{noformat}
TestCliDriver-authorization_create_table_owner_privs.q-create_func1.q-partition_wise_fileformat.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-udf_second.q-bucketcontext_5.q-leadlag_queries.q-and-12-more - 
did not produce a TEST-*.xml file
TestHWISessionManager - did not produce a TEST-*.xml file
TestParseNegative - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_gby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_semijoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_windowing
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cluster
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_or_replace_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ctas_date
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_distinct_stats
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_partition_skip_default
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explain_dependency
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explain_logical
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_auto_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_rc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_compact
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_infer_bucket_sort
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input41
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_acid_dynamic_partition
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join26
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multi_insert_move_tasks_share_dependencies
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nonmr_fetch
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nullformatCTAS
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_merge7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_ppd_date
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_timestamp
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat16
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_constant_expr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_reduce_deduplicate_extended
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sort_merge_join_desc_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_statsfs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_temp_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_touch
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union33
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_20
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_update_where_non_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_update_where_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_aggregate
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_elt
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_mapjoin_reduce
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_varchar_simple
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_15
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_mapjoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_windowing_rank
org.apache.hadoop.hive.cli.TestCompareCliDriver.testCompareCliDriver_vectorized_math_funcs
org.apache.hadoop.hive

[jira] [Updated] (HIVE-9006) hiveserver thrift api version is still 6

2014-12-01 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HIVE-9006:

Attachment: HIVE-9006.1.patch

> hiveserver thrift api version is still 6
> 
>
> Key: HIVE-9006
> URL: https://issues.apache.org/jira/browse/HIVE-9006
> Project: Hive
>  Issue Type: Bug
>Reporter: Binglin Chang
> Attachments: HIVE-9006.1.patch
>
>
> Looking at TCLIService.thrift, when opening a session the protocol version 
> info is still v6.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9006) hiveserver thrift api version is still 6

2014-12-01 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HIVE-9006:

Status: Patch Available  (was: Open)

> hiveserver thrift api version is still 6
> 
>
> Key: HIVE-9006
> URL: https://issues.apache.org/jira/browse/HIVE-9006
> Project: Hive
>  Issue Type: Bug
>Reporter: Binglin Chang
>
> Looking at TCLIService.thrift, when opening a session the protocol version 
> info is still v6.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9006) hiveserver thrift api version is still 6

2014-12-01 Thread Binglin Chang (JIRA)
Binglin Chang created HIVE-9006:
---

 Summary: hiveserver thrift api version is still 6
 Key: HIVE-9006
 URL: https://issues.apache.org/jira/browse/HIVE-9006
 Project: Hive
  Issue Type: Bug
Reporter: Binglin Chang


Looking at TCLIService.thrift, when opening a session the protocol version info 
is still v6.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8916) Handle user@domain username under LDAP authentication

2014-12-01 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-8916:
-
Labels: TODOC15  (was: )

> Handle user@domain username under LDAP authentication
> -
>
> Key: HIVE-8916
> URL: https://issues.apache.org/jira/browse/HIVE-8916
> Project: Hive
>  Issue Type: Bug
>  Components: Authentication
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
>  Labels: TODOC15
> Fix For: 0.15.0
>
> Attachments: HIVE-8916.2.patch, HIVE-8916.3.patch, HIVE-8916.patch
>
>
> If LDAP is configured with multiple domains for authentication, users can be 
> in different domains.
> Currently, LdapAuthenticationProviderImpl blindly appends the domain 
> configured in "hive.server2.authentication.ldap.Domain" to the username, which 
> limits users to that domain. However, under multi-domain authentication, the 
> username may already include the domain (e.g. user@domain.foo.com). We should 
> not append a domain if one is already present.
> Also, if the username already includes the domain, the rest of Hive and the 
> authorization providers still expect the "short name" ("user" and not 
> "user@domain.foo.com") for looking up privilege rules, etc. As such, any 
> domain info in the username should be stripped off.
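
A hedged sketch of the normalization described above (helper names are made up; 
the actual HIVE-8916 change may differ):

{code}
// Sketch: append the configured domain only when the user did not supply one,
// and strip any domain when deriving the short name used for privilege lookups.
final class LdapUserNames {
  /** e.g. ("user", "domain.foo.com") -> "user@domain.foo.com"; "user@other.com" stays as-is. */
  static String bindName(String user, String configuredDomain) {
    if (user.contains("@") || configuredDomain == null || configuredDomain.isEmpty()) {
      return user;                        // domain already present, or none configured
    }
    return user + "@" + configuredDomain;
  }

  /** e.g. "user@domain.foo.com" -> "user", the short name expected by authorization providers. */
  static String shortName(String user) {
    int at = user.indexOf('@');
    return at < 0 ? user : user.substring(0, at);
  }
}
{code}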



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8995) Find thread leak in RSC Tests

2014-12-01 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-8995:
-
Status: Patch Available  (was: Open)

> Find thread leak in RSC Tests
> -
>
> Key: HIVE-8995
> URL: https://issues.apache.org/jira/browse/HIVE-8995
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Brock Noland
>Assignee: Rui Li
> Attachments: HIVE-8995.1-spark.patch
>
>
> I was regenerating output as part of the merge:
> {noformat}
> mvn test -Dtest=TestSparkCliDriver -Phadoop-2 -Dtest.output.overwrite=true 
> -Dqfile=annotate_stats_join.q,auto_join0.q,auto_join1.q,auto_join10.q,auto_join11.q,auto_join12.q,auto_join13.q,auto_join14.q,auto_join15.q,auto_join16.q,auto_join17.q,auto_join18.q,auto_join18_multi_distinct.q,auto_join19.q,auto_join2.q,auto_join20.q,auto_join21.q,auto_join22.q,auto_join23.q,auto_join24.q,auto_join26.q,auto_join27.q,auto_join28.q,auto_join29.q,auto_join3.q,auto_join30.q,auto_join31.q,auto_join32.q,auto_join9.q,auto_join_reordering_values.q
>  
> auto_join_without_localtask.q,auto_smb_mapjoin_14.q,auto_sortmerge_join_1.q,auto_sortmerge_join_10.q,auto_sortmerge_join_11.q,auto_sortmerge_join_12.q,auto_sortmerge_join_14.q,auto_sortmerge_join_15.q,auto_sortmerge_join_2.q,auto_sortmerge_join_3.q,auto_sortmerge_join_4.q,auto_sortmerge_join_5.q,auto_sortmerge_join_6.q,auto_sortmerge_join_7.q,auto_sortmerge_join_8.q,auto_sortmerge_join_9.q,bucket_map_join_1.q,bucket_map_join_2.q,bucket_map_join_tez1.q,bucket_map_join_tez2.q,bucketmapjoin1.q,bucketmapjoin10.q,bucketmapjoin11.q,bucketmapjoin12.q,bucketmapjoin13.q,bucketmapjoin2.q,bucketmapjoin3.q,bucketmapjoin4.q,bucketmapjoin5.q,bucketmapjoin7.q
>  
> bucketmapjoin8.q,bucketmapjoin9.q,bucketmapjoin_negative.q,bucketmapjoin_negative2.q,bucketmapjoin_negative3.q,column_access_stats.q,cross_join.q,ctas.q,custom_input_output_format.q,groupby4.q,groupby7_noskew_multi_single_reducer.q,groupby_complex_types.q,groupby_complex_types_multi_single_reducer.q,groupby_multi_single_reducer2.q,groupby_multi_single_reducer3.q,groupby_position.q,groupby_sort_1_23.q,groupby_sort_skew_1_23.q,having.q,index_auto_self_join.q,infer_bucket_sort_convert_join.q,innerjoin.q,input12.q,join0.q,join1.q,join11.q,join12.q,join13.q,join14.q,join15.q
>  
> join17.q,join18.q,join18_multi_distinct.q,join19.q,join2.q,join20.q,join21.q,join22.q,join23.q,join25.q,join26.q,join27.q,join28.q,join29.q,join3.q,join30.q,join31.q,join32.q,join32_lessSize.q,join33.q,join35.q,join36.q,join37.q,join38.q,join39.q,join40.q,join41.q,join9.q,join_alt_syntax.q,join_cond_pushdown_1.q
>  
> join_cond_pushdown_2.q,join_cond_pushdown_3.q,join_cond_pushdown_4.q,join_cond_pushdown_unqual1.q,join_cond_pushdown_unqual2.q,join_cond_pushdown_unqual3.q,join_cond_pushdown_unqual4.q,join_filters_overlap.q,join_hive_626.q,join_map_ppr.q,join_merge_multi_expressions.q,join_merging.q,join_nullsafe.q,join_rc.q,join_reorder.q,join_reorder2.q,join_reorder3.q,join_reorder4.q,join_star.q,join_thrift.q,join_vc.q,join_view.q,limit_pushdown.q,load_dyn_part13.q,load_dyn_part14.q,louter_join_ppr.q,mapjoin1.q,mapjoin_decimal.q,mapjoin_distinct.q,mapjoin_filter_on_outerjoin.q
>  
> mapjoin_hook.q,mapjoin_mapjoin.q,mapjoin_memcheck.q,mapjoin_subquery.q,mapjoin_subquery2.q,mapjoin_test_outer.q,mergejoins.q,mergejoins_mixed.q,multi_insert.q,multi_insert_gby.q,multi_insert_gby2.q,multi_insert_gby3.q,multi_insert_lateral_view.q,multi_insert_mixed.q,multi_insert_move_tasks_share_dependencies.q,multi_join_union.q,optimize_nullscan.q,outer_join_ppr.q,parallel.q,parallel_join0.q,parallel_join1.q,parquet_join.q,pcr.q,ppd_gby_join.q,ppd_join.q,ppd_join2.q,ppd_join3.q,ppd_join4.q,ppd_join5.q,ppd_join_filter.q
>  
> ppd_multi_insert.q,ppd_outer_join1.q,ppd_outer_join2.q,ppd_outer_join3.q,ppd_outer_join4.q,ppd_outer_join5.q,ppd_transform.q,reduce_deduplicate_exclude_join.q,router_join_ppr.q,sample10.q,sample8.q,script_pipe.q,semijoin.q,skewjoin.q,skewjoin_noskew.q,skewjoin_union_remove_1.q,skewjoin_union_remove_2.q,skewjoinopt1.q,skewjoinopt10.q,skewjoinopt11.q,skewjoinopt12.q,skewjoinopt13.q,skewjoinopt14.q,skewjoinopt15.q,skewjoinopt16.q,skewjoinopt17.q,skewjoinopt18.q,skewjoinopt19.q,skewjoinopt2.q,skewjoinopt20.q
>  
> skewjoinopt3.q,skewjoinopt4.q,skewjoinopt5.q,skewjoinopt6.q,skewjoinopt7.q,skewjoinopt8.q,skewjoinopt9.q,smb_mapjoin9.q,smb_mapjoin_1.q,smb_mapjoin_10.q,smb_mapjoin_13.q,smb_mapjoin_14.q,smb_mapjoin_15.q,smb_mapjoin_16.q,smb_mapjoin_17.q,smb_mapjoin_2.q,smb_mapjoin_25.q,smb_mapjoin_3.q,smb_mapjoin_4.q,smb_mapjoin_5.q,smb_mapjoin_6.q,smb_mapjoin_7.q,sort_merge_join_desc_1.q,sort_merge_join_desc_2.q,sort_merge_join_desc_3.q,sort_merge_join_desc_4.q,sort_merge_join_desc_5.q,sort_merge_join_desc_6.q,sort_merge_join_desc_7.q,sort_merge_join_desc_8.q
>  
> stats1.q,subquery_in.q,subquery_multiinsert.q,t

[jira] [Updated] (HIVE-8995) Find thread leak in RSC Tests

2014-12-01 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-8995:
-
Attachment: HIVE-8995.1-spark.patch

I tried the tests [~brocknoland] mentioned with this patch. All the tests 
passed except {{custom_input_output_format.q}} and {{parquet_join.q}}.

> Find thread leak in RSC Tests
> -
>
> Key: HIVE-8995
> URL: https://issues.apache.org/jira/browse/HIVE-8995
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Brock Noland
>Assignee: Rui Li
> Attachments: HIVE-8995.1-spark.patch
>
>
> I was regenerating output as part of the merge:
> {noformat}
> mvn test -Dtest=TestSparkCliDriver -Phadoop-2 -Dtest.output.overwrite=true 
> -Dqfile=annotate_stats_join.q,auto_join0.q,auto_join1.q,auto_join10.q,auto_join11.q,auto_join12.q,auto_join13.q,auto_join14.q,auto_join15.q,auto_join16.q,auto_join17.q,auto_join18.q,auto_join18_multi_distinct.q,auto_join19.q,auto_join2.q,auto_join20.q,auto_join21.q,auto_join22.q,auto_join23.q,auto_join24.q,auto_join26.q,auto_join27.q,auto_join28.q,auto_join29.q,auto_join3.q,auto_join30.q,auto_join31.q,auto_join32.q,auto_join9.q,auto_join_reordering_values.q
>  
> auto_join_without_localtask.q,auto_smb_mapjoin_14.q,auto_sortmerge_join_1.q,auto_sortmerge_join_10.q,auto_sortmerge_join_11.q,auto_sortmerge_join_12.q,auto_sortmerge_join_14.q,auto_sortmerge_join_15.q,auto_sortmerge_join_2.q,auto_sortmerge_join_3.q,auto_sortmerge_join_4.q,auto_sortmerge_join_5.q,auto_sortmerge_join_6.q,auto_sortmerge_join_7.q,auto_sortmerge_join_8.q,auto_sortmerge_join_9.q,bucket_map_join_1.q,bucket_map_join_2.q,bucket_map_join_tez1.q,bucket_map_join_tez2.q,bucketmapjoin1.q,bucketmapjoin10.q,bucketmapjoin11.q,bucketmapjoin12.q,bucketmapjoin13.q,bucketmapjoin2.q,bucketmapjoin3.q,bucketmapjoin4.q,bucketmapjoin5.q,bucketmapjoin7.q
>  
> bucketmapjoin8.q,bucketmapjoin9.q,bucketmapjoin_negative.q,bucketmapjoin_negative2.q,bucketmapjoin_negative3.q,column_access_stats.q,cross_join.q,ctas.q,custom_input_output_format.q,groupby4.q,groupby7_noskew_multi_single_reducer.q,groupby_complex_types.q,groupby_complex_types_multi_single_reducer.q,groupby_multi_single_reducer2.q,groupby_multi_single_reducer3.q,groupby_position.q,groupby_sort_1_23.q,groupby_sort_skew_1_23.q,having.q,index_auto_self_join.q,infer_bucket_sort_convert_join.q,innerjoin.q,input12.q,join0.q,join1.q,join11.q,join12.q,join13.q,join14.q,join15.q
>  
> join17.q,join18.q,join18_multi_distinct.q,join19.q,join2.q,join20.q,join21.q,join22.q,join23.q,join25.q,join26.q,join27.q,join28.q,join29.q,join3.q,join30.q,join31.q,join32.q,join32_lessSize.q,join33.q,join35.q,join36.q,join37.q,join38.q,join39.q,join40.q,join41.q,join9.q,join_alt_syntax.q,join_cond_pushdown_1.q
>  
> join_cond_pushdown_2.q,join_cond_pushdown_3.q,join_cond_pushdown_4.q,join_cond_pushdown_unqual1.q,join_cond_pushdown_unqual2.q,join_cond_pushdown_unqual3.q,join_cond_pushdown_unqual4.q,join_filters_overlap.q,join_hive_626.q,join_map_ppr.q,join_merge_multi_expressions.q,join_merging.q,join_nullsafe.q,join_rc.q,join_reorder.q,join_reorder2.q,join_reorder3.q,join_reorder4.q,join_star.q,join_thrift.q,join_vc.q,join_view.q,limit_pushdown.q,load_dyn_part13.q,load_dyn_part14.q,louter_join_ppr.q,mapjoin1.q,mapjoin_decimal.q,mapjoin_distinct.q,mapjoin_filter_on_outerjoin.q
>  
> mapjoin_hook.q,mapjoin_mapjoin.q,mapjoin_memcheck.q,mapjoin_subquery.q,mapjoin_subquery2.q,mapjoin_test_outer.q,mergejoins.q,mergejoins_mixed.q,multi_insert.q,multi_insert_gby.q,multi_insert_gby2.q,multi_insert_gby3.q,multi_insert_lateral_view.q,multi_insert_mixed.q,multi_insert_move_tasks_share_dependencies.q,multi_join_union.q,optimize_nullscan.q,outer_join_ppr.q,parallel.q,parallel_join0.q,parallel_join1.q,parquet_join.q,pcr.q,ppd_gby_join.q,ppd_join.q,ppd_join2.q,ppd_join3.q,ppd_join4.q,ppd_join5.q,ppd_join_filter.q
>  
> ppd_multi_insert.q,ppd_outer_join1.q,ppd_outer_join2.q,ppd_outer_join3.q,ppd_outer_join4.q,ppd_outer_join5.q,ppd_transform.q,reduce_deduplicate_exclude_join.q,router_join_ppr.q,sample10.q,sample8.q,script_pipe.q,semijoin.q,skewjoin.q,skewjoin_noskew.q,skewjoin_union_remove_1.q,skewjoin_union_remove_2.q,skewjoinopt1.q,skewjoinopt10.q,skewjoinopt11.q,skewjoinopt12.q,skewjoinopt13.q,skewjoinopt14.q,skewjoinopt15.q,skewjoinopt16.q,skewjoinopt17.q,skewjoinopt18.q,skewjoinopt19.q,skewjoinopt2.q,skewjoinopt20.q
>  
> skewjoinopt3.q,skewjoinopt4.q,skewjoinopt5.q,skewjoinopt6.q,skewjoinopt7.q,skewjoinopt8.q,skewjoinopt9.q,smb_mapjoin9.q,smb_mapjoin_1.q,smb_mapjoin_10.q,smb_mapjoin_13.q,smb_mapjoin_14.q,smb_mapjoin_15.q,smb_mapjoin_16.q,smb_mapjoin_17.q,smb_mapjoin_2.q,smb_mapjoin_25.q,smb_mapjoin_3.q,smb_mapjoin_4.q,smb_mapjoin_5.q,smb_mapjoin_6.q,smb_mapjoin_7.q,sort_merge_join_desc_1.q,sort_merge_join_desc_2.q,sort_merge_join_desc_3.q,sort_merge_join_desc_4.q,sort_me

[jira] [Commented] (HIVE-6421) abs() should preserve precision/scale of decimal input

2014-12-01 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231061#comment-14231061
 ] 

Lefty Leverenz commented on HIVE-6421:
--

No doc needed?

> abs() should preserve precision/scale of decimal input
> --
>
> Key: HIVE-6421
> URL: https://issues.apache.org/jira/browse/HIVE-6421
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Jason Dere
>Assignee: Jason Dere
> Fix For: 0.15.0
>
> Attachments: HIVE-6421.1.txt, HIVE-6421.2.patch, HIVE-6421.3.patch
>
>
> {noformat}
> hive> describe dec1;
> OK
> c1                      decimal(10,2)           None 
> hive> explain select c1, abs(c1) from dec1;
>  ...
> Select Operator
>   expressions: c1 (type: decimal(10,2)), abs(c1) (type: 
> decimal(38,18))
> {noformat}
> Given that abs() is a GenericUDF it should be possible for the return type 
> precision/scale to match the input precision/scale.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8192) Check DDL's writetype in DummyTxnManager

2014-12-01 Thread Wan Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231053#comment-14231053
 ] 

Wan Chang commented on HIVE-8192:
-

The failed tests are not related to this patch. Could anyone review it?

> Check DDL's writetype in DummyTxnManager
> 
>
> Key: HIVE-8192
> URL: https://issues.apache.org/jira/browse/HIVE-8192
> Project: Hive
>  Issue Type: Bug
>  Components: Locking
>Affects Versions: 0.13.0, 0.13.1
> Environment: hive0.13.1
>Reporter: Wan Chang
>Priority: Minor
>  Labels: patch
> Attachments: HIVE-8192.2.patch, HIVE-8192.3.patch, HIVE-8192.4.patch, 
> HIVE-8192.5.patch
>
>
> The patch for HIVE-6734 added some DDL write types and checked the DDL write 
> type in DbTxnManager.java.
> We use DummyTxnManager as the default value of hive.txn.manager in 
> hive-site.xml. We noticed that the CREATE TEMPORARY FUNCTION operation has a 
> DDL_NO_LOCK write type but still requires an EXCLUSIVE lock. If we try to 
> create a temporary function while a SELECT is running against the same 
> database, the console prints 'conflicting lock present for default mode 
> EXCLUSIVE' and the CREATE TEMPORARY FUNCTION operation won't get the lock 
> until the SELECT is done. It would be a good idea to check the DDL's write 
> type in DummyTxnManager too.
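
A rough sketch of the suggested check (illustrative names, not the actual 
DummyTxnManager code): consult the DDL's write type before falling back to the 
default EXCLUSIVE lock.

{code}
// Illustrative only: a DDL that declares DDL_NO_LOCK (such as CREATE TEMPORARY
// FUNCTION) should not be forced to wait on an EXCLUSIVE lock behind running
// SELECTs.
enum WriteType { DDL_NO_LOCK, DDL_SHARED, DDL_EXCLUSIVE }
enum LockMode  { NONE, SHARED, EXCLUSIVE }

final class DdlLockPolicy {
  static LockMode lockFor(WriteType writeType) {
    switch (writeType) {
      case DDL_NO_LOCK:   return LockMode.NONE;      // no lock needed, don't block
      case DDL_SHARED:    return LockMode.SHARED;
      case DDL_EXCLUSIVE:
      default:            return LockMode.EXCLUSIVE; // today's blanket default
    }
  }
}
{code}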



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8374) schematool fails on Postgres versions < 9.2

2014-12-01 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231048#comment-14231048
 ] 

Lefty Leverenz commented on HIVE-8374:
--

bq.  document it when we add options for this arg

Agreed.  Thanks [~mohitsabharwal].

> schematool fails on Postgres versions < 9.2
> ---
>
> Key: HIVE-8374
> URL: https://issues.apache.org/jira/browse/HIVE-8374
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Fix For: 0.15.0, 0.14.1
>
> Attachments: HIVE-8374.1.patch, HIVE-8374.2.patch, HIVE-8374.3.patch, 
> HIVE-8374.patch
>
>
> The upgrade script for HIVE-5700 creates a UDF with language 'plpgsql',
> which is available by default only on Postgres 9.2+.
> For older Postgres versions, the language must be explicitly created,
> otherwise schematool fails with the error:
> {code}
> Error: ERROR: language "plpgsql" does not exist
>   Hint: Use CREATE LANGUAGE to load the language into the database. 
> (state=42704,code=0)
> {code}
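
A hedged workaround sketch, following the hint in the error above (the JDBC URL 
and credentials are placeholders): create the language explicitly before running 
the upgrade script on pre-9.2 Postgres.

{code}
// Illustrative only: on Postgres < 9.2 the plpgsql language is not installed by
// default, so the HIVE-5700 upgrade UDF fails. Creating it first avoids the error.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class CreatePlpgsql {
  public static void main(String[] args) throws Exception {
    try (Connection conn = DriverManager.getConnection(
             "jdbc:postgresql://localhost:5432/metastore", "hiveuser", "hivepw");
         Statement stmt = conn.createStatement()) {
      // Same effect as running "CREATE LANGUAGE plpgsql;" in psql before schematool.
      stmt.execute("CREATE LANGUAGE plpgsql");
    }
  }
}
{code}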



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9005) HiveServer2 error with "Illegal Operation state transition from CLOSED to ERROR"

2014-12-01 Thread Binglin Chang (JIRA)
Binglin Chang created HIVE-9005:
---

 Summary: HiveServer2 error with "Illegal Operation state transition 
from CLOSED to ERROR"
 Key: HIVE-9005
 URL: https://issues.apache.org/jira/browse/HIVE-9005
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Binglin Chang


{noformat}
2014-12-02 11:25:40,855 WARN  [HiveServer2-Background-Pool: Thread-17]: 
ql.Driver (DriverContext.java:shutdown(137)) - Shutting down task : 
Stage-1:MAPRED
2014-12-02 11:25:41,898 INFO  [HiveServer2-Background-Pool: Thread-30]: 
exec.Task (SessionState.java:printInfo(536)) - Hadoop job information for 
Stage-1: number of mappers: 0; number of reducers: 0
2014-12-02 11:25:41,942 WARN  [HiveServer2-Background-Pool: Thread-30]: 
mapreduce.Counters (AbstractCounters.java:getGroup(234)) - Group 
org.apache.hadoop.mapred.Task$Counter is deprecated. Use 
org.apache.hadoop.mapreduce.TaskCounter instead
2014-12-02 11:25:41,942 INFO  [HiveServer2-Background-Pool: Thread-30]: 
exec.Task (SessionState.java:printInfo(536)) - 2014-12-02 11:25:41,939 Stage-1 
map = 0%,  reduce = 0%
2014-12-02 11:25:41,945 WARN  [HiveServer2-Background-Pool: Thread-30]: 
mapreduce.Counters (AbstractCounters.java:getGroup(234)) - Group 
org.apache.hadoop.mapred.Task$Counter is deprecated. Use 
org.apache.hadoop.mapreduce.TaskCounter instead
2014-12-02 11:25:41,952 ERROR [HiveServer2-Background-Pool: Thread-30]: 
exec.Task (SessionState.java:printError(545)) - Ended Job = 
job_1413717733669_207982 with errors
2014-12-02 11:25:41,954 ERROR [Thread-39]: exec.Task 
(SessionState.java:printError(545)) - Error during job, obtaining debugging 
information...
2014-12-02 11:25:41,957 ERROR [HiveServer2-Background-Pool: Thread-30]: 
ql.Driver (SessionState.java:printError(545)) - FAILED: Operation cancelled
2014-12-02 11:25:41,957 INFO  [HiveServer2-Background-Pool: Thread-30]: 
ql.Driver (SessionState.java:printInfo(536)) - MapReduce Jobs Launched:
2014-12-02 11:25:41,960 WARN  [HiveServer2-Background-Pool: Thread-30]: 
mapreduce.Counters (AbstractCounters.java:getGroup(234)) - Group 
FileSystemCounters is deprecated. Use 
org.apache.hadoop.mapreduce.FileSystemCounter instead
2014-12-02 11:25:41,961 INFO  [HiveServer2-Background-Pool: Thread-30]: 
ql.Driver (SessionState.java:printInfo(536)) - Stage-Stage-1:  HDFS Read: 0 
HDFS Write: 0 FAIL
2014-12-02 11:25:41,961 INFO  [HiveServer2-Background-Pool: Thread-30]: 
ql.Driver (SessionState.java:printInfo(536)) - Total MapReduce CPU Time Spent: 
0 msec
2014-12-02 11:25:41,965 ERROR [HiveServer2-Background-Pool: Thread-30]: 
operation.Operation (SQLOperation.java:run(205)) - Error running hive query:
org.apache.hive.service.cli.HiveSQLException: Illegal Operation state 
transition from CLOSED to ERROR
at 
org.apache.hive.service.cli.OperationState.validateTransition(OperationState.java:91)
at 
org.apache.hive.service.cli.OperationState.validateTransition(OperationState.java:97)
at 
org.apache.hive.service.cli.operation.Operation.setState(Operation.java:116)
at 
org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:161)
at 
org.apache.hive.service.cli.operation.SQLOperation.access$000(SQLOperation.java:71)
at 
org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1589)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:504)
at 
org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:215)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-8995) Find thread leak in RSC Tests

2014-12-01 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230942#comment-14230942
 ] 

Brock Noland edited comment on HIVE-8995 at 12/2/14 4:07 AM:
-

bq. seems to be a good place to initialize and destroy SparkClientFactory.

+1


was (Author: brocknoland):
bq. which means we initialize SparkClientFactory only when we instantiate a 
SparkSessionManager instance.

+1

> Find thread leak in RSC Tests
> -
>
> Key: HIVE-8995
> URL: https://issues.apache.org/jira/browse/HIVE-8995
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Brock Noland
>Assignee: Rui Li
>
> I was regenerating output as part of the merge:
> {noformat}
> mvn test -Dtest=TestSparkCliDriver -Phadoop-2 -Dtest.output.overwrite=true 
> -Dqfile=annotate_stats_join.q,auto_join0.q,auto_join1.q,auto_join10.q,auto_join11.q,auto_join12.q,auto_join13.q,auto_join14.q,auto_join15.q,auto_join16.q,auto_join17.q,auto_join18.q,auto_join18_multi_distinct.q,auto_join19.q,auto_join2.q,auto_join20.q,auto_join21.q,auto_join22.q,auto_join23.q,auto_join24.q,auto_join26.q,auto_join27.q,auto_join28.q,auto_join29.q,auto_join3.q,auto_join30.q,auto_join31.q,auto_join32.q,auto_join9.q,auto_join_reordering_values.q
>  
> auto_join_without_localtask.q,auto_smb_mapjoin_14.q,auto_sortmerge_join_1.q,auto_sortmerge_join_10.q,auto_sortmerge_join_11.q,auto_sortmerge_join_12.q,auto_sortmerge_join_14.q,auto_sortmerge_join_15.q,auto_sortmerge_join_2.q,auto_sortmerge_join_3.q,auto_sortmerge_join_4.q,auto_sortmerge_join_5.q,auto_sortmerge_join_6.q,auto_sortmerge_join_7.q,auto_sortmerge_join_8.q,auto_sortmerge_join_9.q,bucket_map_join_1.q,bucket_map_join_2.q,bucket_map_join_tez1.q,bucket_map_join_tez2.q,bucketmapjoin1.q,bucketmapjoin10.q,bucketmapjoin11.q,bucketmapjoin12.q,bucketmapjoin13.q,bucketmapjoin2.q,bucketmapjoin3.q,bucketmapjoin4.q,bucketmapjoin5.q,bucketmapjoin7.q
>  
> bucketmapjoin8.q,bucketmapjoin9.q,bucketmapjoin_negative.q,bucketmapjoin_negative2.q,bucketmapjoin_negative3.q,column_access_stats.q,cross_join.q,ctas.q,custom_input_output_format.q,groupby4.q,groupby7_noskew_multi_single_reducer.q,groupby_complex_types.q,groupby_complex_types_multi_single_reducer.q,groupby_multi_single_reducer2.q,groupby_multi_single_reducer3.q,groupby_position.q,groupby_sort_1_23.q,groupby_sort_skew_1_23.q,having.q,index_auto_self_join.q,infer_bucket_sort_convert_join.q,innerjoin.q,input12.q,join0.q,join1.q,join11.q,join12.q,join13.q,join14.q,join15.q
>  
> join17.q,join18.q,join18_multi_distinct.q,join19.q,join2.q,join20.q,join21.q,join22.q,join23.q,join25.q,join26.q,join27.q,join28.q,join29.q,join3.q,join30.q,join31.q,join32.q,join32_lessSize.q,join33.q,join35.q,join36.q,join37.q,join38.q,join39.q,join40.q,join41.q,join9.q,join_alt_syntax.q,join_cond_pushdown_1.q
>  
> join_cond_pushdown_2.q,join_cond_pushdown_3.q,join_cond_pushdown_4.q,join_cond_pushdown_unqual1.q,join_cond_pushdown_unqual2.q,join_cond_pushdown_unqual3.q,join_cond_pushdown_unqual4.q,join_filters_overlap.q,join_hive_626.q,join_map_ppr.q,join_merge_multi_expressions.q,join_merging.q,join_nullsafe.q,join_rc.q,join_reorder.q,join_reorder2.q,join_reorder3.q,join_reorder4.q,join_star.q,join_thrift.q,join_vc.q,join_view.q,limit_pushdown.q,load_dyn_part13.q,load_dyn_part14.q,louter_join_ppr.q,mapjoin1.q,mapjoin_decimal.q,mapjoin_distinct.q,mapjoin_filter_on_outerjoin.q
>  
> mapjoin_hook.q,mapjoin_mapjoin.q,mapjoin_memcheck.q,mapjoin_subquery.q,mapjoin_subquery2.q,mapjoin_test_outer.q,mergejoins.q,mergejoins_mixed.q,multi_insert.q,multi_insert_gby.q,multi_insert_gby2.q,multi_insert_gby3.q,multi_insert_lateral_view.q,multi_insert_mixed.q,multi_insert_move_tasks_share_dependencies.q,multi_join_union.q,optimize_nullscan.q,outer_join_ppr.q,parallel.q,parallel_join0.q,parallel_join1.q,parquet_join.q,pcr.q,ppd_gby_join.q,ppd_join.q,ppd_join2.q,ppd_join3.q,ppd_join4.q,ppd_join5.q,ppd_join_filter.q
>  
> ppd_multi_insert.q,ppd_outer_join1.q,ppd_outer_join2.q,ppd_outer_join3.q,ppd_outer_join4.q,ppd_outer_join5.q,ppd_transform.q,reduce_deduplicate_exclude_join.q,router_join_ppr.q,sample10.q,sample8.q,script_pipe.q,semijoin.q,skewjoin.q,skewjoin_noskew.q,skewjoin_union_remove_1.q,skewjoin_union_remove_2.q,skewjoinopt1.q,skewjoinopt10.q,skewjoinopt11.q,skewjoinopt12.q,skewjoinopt13.q,skewjoinopt14.q,skewjoinopt15.q,skewjoinopt16.q,skewjoinopt17.q,skewjoinopt18.q,skewjoinopt19.q,skewjoinopt2.q,skewjoinopt20.q
>  
> skewjoinopt3.q,skewjoinopt4.q,skewjoinopt5.q,skewjoinopt6.q,skewjoinopt7.q,skewjoinopt8.q,skewjoinopt9.q,smb_mapjoin9.q,smb_mapjoin_1.q,smb_mapjoin_10.q,smb_mapjoin_13.q,smb_mapjoin_14.q,smb_mapjoin_15.q,smb_mapjoin_16.q,smb_mapjoin_17.q,smb_mapjoin_2.q,smb_mapjoin_25.q,smb_mapjoin_3.q,smb_mapjoin_4.q,smb_mapjoin_5.q,smb_mapjoin_6.q,smb_mapjoin_7.q,sort_merge_

[jira] [Commented] (HIVE-8995) Find thread leak in RSC Tests

2014-12-01 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230942#comment-14230942
 ] 

Brock Noland commented on HIVE-8995:


bq. which means we initialize SparkClientFactory only when we instantiate a 
SparkSessionManager instance.

+1

> Find thread leak in RSC Tests
> -
>
> Key: HIVE-8995
> URL: https://issues.apache.org/jira/browse/HIVE-8995
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Brock Noland
>Assignee: Rui Li
>
> I was regenerating output as part of the merge:
> {noformat}
> mvn test -Dtest=TestSparkCliDriver -Phadoop-2 -Dtest.output.overwrite=true 
> -Dqfile=annotate_stats_join.q,auto_join0.q,auto_join1.q,auto_join10.q,auto_join11.q,auto_join12.q,auto_join13.q,auto_join14.q,auto_join15.q,auto_join16.q,auto_join17.q,auto_join18.q,auto_join18_multi_distinct.q,auto_join19.q,auto_join2.q,auto_join20.q,auto_join21.q,auto_join22.q,auto_join23.q,auto_join24.q,auto_join26.q,auto_join27.q,auto_join28.q,auto_join29.q,auto_join3.q,auto_join30.q,auto_join31.q,auto_join32.q,auto_join9.q,auto_join_reordering_values.q
>  
> auto_join_without_localtask.q,auto_smb_mapjoin_14.q,auto_sortmerge_join_1.q,auto_sortmerge_join_10.q,auto_sortmerge_join_11.q,auto_sortmerge_join_12.q,auto_sortmerge_join_14.q,auto_sortmerge_join_15.q,auto_sortmerge_join_2.q,auto_sortmerge_join_3.q,auto_sortmerge_join_4.q,auto_sortmerge_join_5.q,auto_sortmerge_join_6.q,auto_sortmerge_join_7.q,auto_sortmerge_join_8.q,auto_sortmerge_join_9.q,bucket_map_join_1.q,bucket_map_join_2.q,bucket_map_join_tez1.q,bucket_map_join_tez2.q,bucketmapjoin1.q,bucketmapjoin10.q,bucketmapjoin11.q,bucketmapjoin12.q,bucketmapjoin13.q,bucketmapjoin2.q,bucketmapjoin3.q,bucketmapjoin4.q,bucketmapjoin5.q,bucketmapjoin7.q
>  
> bucketmapjoin8.q,bucketmapjoin9.q,bucketmapjoin_negative.q,bucketmapjoin_negative2.q,bucketmapjoin_negative3.q,column_access_stats.q,cross_join.q,ctas.q,custom_input_output_format.q,groupby4.q,groupby7_noskew_multi_single_reducer.q,groupby_complex_types.q,groupby_complex_types_multi_single_reducer.q,groupby_multi_single_reducer2.q,groupby_multi_single_reducer3.q,groupby_position.q,groupby_sort_1_23.q,groupby_sort_skew_1_23.q,having.q,index_auto_self_join.q,infer_bucket_sort_convert_join.q,innerjoin.q,input12.q,join0.q,join1.q,join11.q,join12.q,join13.q,join14.q,join15.q
>  
> join17.q,join18.q,join18_multi_distinct.q,join19.q,join2.q,join20.q,join21.q,join22.q,join23.q,join25.q,join26.q,join27.q,join28.q,join29.q,join3.q,join30.q,join31.q,join32.q,join32_lessSize.q,join33.q,join35.q,join36.q,join37.q,join38.q,join39.q,join40.q,join41.q,join9.q,join_alt_syntax.q,join_cond_pushdown_1.q
>  
> join_cond_pushdown_2.q,join_cond_pushdown_3.q,join_cond_pushdown_4.q,join_cond_pushdown_unqual1.q,join_cond_pushdown_unqual2.q,join_cond_pushdown_unqual3.q,join_cond_pushdown_unqual4.q,join_filters_overlap.q,join_hive_626.q,join_map_ppr.q,join_merge_multi_expressions.q,join_merging.q,join_nullsafe.q,join_rc.q,join_reorder.q,join_reorder2.q,join_reorder3.q,join_reorder4.q,join_star.q,join_thrift.q,join_vc.q,join_view.q,limit_pushdown.q,load_dyn_part13.q,load_dyn_part14.q,louter_join_ppr.q,mapjoin1.q,mapjoin_decimal.q,mapjoin_distinct.q,mapjoin_filter_on_outerjoin.q
>  
> mapjoin_hook.q,mapjoin_mapjoin.q,mapjoin_memcheck.q,mapjoin_subquery.q,mapjoin_subquery2.q,mapjoin_test_outer.q,mergejoins.q,mergejoins_mixed.q,multi_insert.q,multi_insert_gby.q,multi_insert_gby2.q,multi_insert_gby3.q,multi_insert_lateral_view.q,multi_insert_mixed.q,multi_insert_move_tasks_share_dependencies.q,multi_join_union.q,optimize_nullscan.q,outer_join_ppr.q,parallel.q,parallel_join0.q,parallel_join1.q,parquet_join.q,pcr.q,ppd_gby_join.q,ppd_join.q,ppd_join2.q,ppd_join3.q,ppd_join4.q,ppd_join5.q,ppd_join_filter.q
>  
> ppd_multi_insert.q,ppd_outer_join1.q,ppd_outer_join2.q,ppd_outer_join3.q,ppd_outer_join4.q,ppd_outer_join5.q,ppd_transform.q,reduce_deduplicate_exclude_join.q,router_join_ppr.q,sample10.q,sample8.q,script_pipe.q,semijoin.q,skewjoin.q,skewjoin_noskew.q,skewjoin_union_remove_1.q,skewjoin_union_remove_2.q,skewjoinopt1.q,skewjoinopt10.q,skewjoinopt11.q,skewjoinopt12.q,skewjoinopt13.q,skewjoinopt14.q,skewjoinopt15.q,skewjoinopt16.q,skewjoinopt17.q,skewjoinopt18.q,skewjoinopt19.q,skewjoinopt2.q,skewjoinopt20.q
>  
> skewjoinopt3.q,skewjoinopt4.q,skewjoinopt5.q,skewjoinopt6.q,skewjoinopt7.q,skewjoinopt8.q,skewjoinopt9.q,smb_mapjoin9.q,smb_mapjoin_1.q,smb_mapjoin_10.q,smb_mapjoin_13.q,smb_mapjoin_14.q,smb_mapjoin_15.q,smb_mapjoin_16.q,smb_mapjoin_17.q,smb_mapjoin_2.q,smb_mapjoin_25.q,smb_mapjoin_3.q,smb_mapjoin_4.q,smb_mapjoin_5.q,smb_mapjoin_6.q,smb_mapjoin_7.q,sort_merge_join_desc_1.q,sort_merge_join_desc_2.q,sort_merge_join_desc_3.q,sort_merge_join_desc_4.q,sort_merge_join_desc_5.q,sort_merge_join_desc_6.q,sort_merge_join_de

[jira] [Updated] (HIVE-8922) CBO: assorted date and timestamp issues

2014-12-01 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-8922:
---
Attachment: HIVE-8922.01.patch

rebased the patch

> CBO: assorted date and timestamp issues
> ---
>
> Key: HIVE-8922
> URL: https://issues.apache.org/jira/browse/HIVE-8922
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.15.0
>
> Attachments: HIVE-8922.01.patch, HIVE-8922.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8922) CBO: assorted date and timestamp issues

2014-12-01 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-8922:
---
Status: Patch Available  (was: Open)

> CBO: assorted date and timestamp issues
> ---
>
> Key: HIVE-8922
> URL: https://issues.apache.org/jira/browse/HIVE-8922
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.15.0
>
> Attachments: HIVE-8922.01.patch, HIVE-8922.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8923) HIVE-8512 needs to be fixed also for CBO

2014-12-01 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230916#comment-14230916
 ] 

Sergey Shelukhin commented on HIVE-8923:


Looks like tests failed because they have invalid *-gby queries... or the 
detection is invalid

> HIVE-8512 needs to be fixed also for CBO
> 
>
> Key: HIVE-8923
> URL: https://issues.apache.org/jira/browse/HIVE-8923
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.15.0
>
> Attachments: HIVE-8923.patch
>
>
> Queries reverted to incorrect results.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8395) CBO: enable by default

2014-12-01 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-8395:
---
Attachment: HIVE-8395.26.patch

Updated the patch. The fixes for the two open bugs were removed from the patch 
because they are not ready to commit, so some tests will fail, but I want to get 
back to real failures rather than random out-file changes.

> CBO: enable by default
> --
>
> Key: HIVE-8395
> URL: https://issues.apache.org/jira/browse/HIVE-8395
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.15.0
>
> Attachments: HIVE-8395.01.patch, HIVE-8395.02.patch, 
> HIVE-8395.03.patch, HIVE-8395.04.patch, HIVE-8395.05.patch, 
> HIVE-8395.06.patch, HIVE-8395.07.patch, HIVE-8395.08.patch, 
> HIVE-8395.09.patch, HIVE-8395.10.patch, HIVE-8395.11.patch, 
> HIVE-8395.12.patch, HIVE-8395.12.patch, HIVE-8395.13.patch, 
> HIVE-8395.13.patch, HIVE-8395.14.patch, HIVE-8395.15.patch, 
> HIVE-8395.16.patch, HIVE-8395.17.patch, HIVE-8395.18.patch, 
> HIVE-8395.18.patch, HIVE-8395.19.patch, HIVE-8395.20.patch, 
> HIVE-8395.21.patch, HIVE-8395.22.patch, HIVE-8395.23.patch, 
> HIVE-8395.23.withon.patch, HIVE-8395.24.patch, HIVE-8395.25.patch, 
> HIVE-8395.25.patch, HIVE-8395.26.patch, HIVE-8395.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9004) Reset doesn't work for the default empty value entry

2014-12-01 Thread Cheng Hao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheng Hao updated HIVE-9004:

Fix Version/s: 0.14.1
   0.15.0
   spark-branch
   Status: Patch Available  (was: Open)

> Reset doesn't work for the default empty value entry
> 
>
> Key: HIVE-9004
> URL: https://issues.apache.org/jira/browse/HIVE-9004
> Project: Hive
>  Issue Type: Bug
>  Components: Configuration
>Reporter: Cheng Hao
>Assignee: Cheng Hao
> Fix For: spark-branch, 0.15.0, 0.14.1
>
> Attachments: reset.patch
>
>
> To illustrate, in the Hive CLI:
> hive> set hive.table.parameters.default;
> hive.table.parameters.default is undefined
> hive> set hive.table.parameters.default=key1=value1;
> hive> reset;
> hive> set hive.table.parameters.default;
> hive.table.parameters.default=key1=value1
> I think we expect the last output to be "hive.table.parameters.default is 
> undefined"



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9004) Reset doesn't work for the default empty value entry

2014-12-01 Thread Cheng Hao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheng Hao updated HIVE-9004:

Attachment: reset.patch

> Reset doesn't work for the default empty value entry
> 
>
> Key: HIVE-9004
> URL: https://issues.apache.org/jira/browse/HIVE-9004
> Project: Hive
>  Issue Type: Bug
>  Components: Configuration
>Reporter: Cheng Hao
>Assignee: Cheng Hao
> Attachments: reset.patch
>
>
> To illustrate, in the Hive CLI:
> hive> set hive.table.parameters.default;
> hive.table.parameters.default is undefined
> hive> set hive.table.parameters.default=key1=value1;
> hive> reset;
> hive> set hive.table.parameters.default;
> hive.table.parameters.default=key1=value1
> I think we expect the last output to be "hive.table.parameters.default is 
> undefined"



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-9004) Reset doesn't work for the default empty value entry

2014-12-01 Thread Cheng Hao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheng Hao reassigned HIVE-9004:
---

Assignee: Cheng Hao

> Reset doesn't work for the default empty value entry
> 
>
> Key: HIVE-9004
> URL: https://issues.apache.org/jira/browse/HIVE-9004
> Project: Hive
>  Issue Type: Bug
>  Components: Configuration
>Reporter: Cheng Hao
>Assignee: Cheng Hao
>
> To illustrate, in the Hive CLI:
> hive> set hive.table.parameters.default;
> hive.table.parameters.default is undefined
> hive> set hive.table.parameters.default=key1=value1;
> hive> reset;
> hive> set hive.table.parameters.default;
> hive.table.parameters.default=key1=value1
> I think we expect the last output to be "hive.table.parameters.default is 
> undefined"



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8888) Mapjoin with LateralViewJoin generates wrong plan in Tez

2014-12-01 Thread Prasanth J (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230905#comment-14230905
 ] 

Prasanth J commented on HIVE-8888:
--

Last patch looks good to me. +1

> Mapjoin with LateralViewJoin generates wrong plan in Tez
> 
>
> Key: HIVE-8888
> URL: https://issues.apache.org/jira/browse/HIVE-8888
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.0, 0.14.0, 0.13.1, 0.15.0
>Reporter: Prasanth J
>Assignee: Prasanth J
> Fix For: 0.14.1
>
> Attachments: HIVE-8888.1.patch, HIVE-8888.2.patch, HIVE-8888.3.patch, 
> HIVE-8888.4.patch, HIVE-8888.5.patch
>
>
> Queries like these 
> {code}
> with sub1 as
> (select aid, avalue from expod1 lateral view explode(av) avs as avalue ),
> sub2 as
> (select bid, bvalue from expod2 lateral view explode(bv) bvs as bvalue)
> select sub1.aid, sub1.avalue, sub2.bvalue
> from sub1,sub2
> where sub1.aid=sub2.bid;
> {code}
> generates twice the number of rows in Tez when compared to MR.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9004) Reset doesn't work for the default empty value entry

2014-12-01 Thread Cheng Hao (JIRA)
Cheng Hao created HIVE-9004:
---

 Summary: Reset doesn't work for the default empty value entry
 Key: HIVE-9004
 URL: https://issues.apache.org/jira/browse/HIVE-9004
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Reporter: Cheng Hao


To illustrate, in the Hive CLI:
hive> set hive.table.parameters.default;
hive.table.parameters.default is undefined
hive> set hive.table.parameters.default=key1=value1;
hive> reset;
hive> set hive.table.parameters.default;
hive.table.parameters.default=key1=value1

I think we expect the last output to be "hive.table.parameters.default is 
undefined"



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8900) Create encryption testing framework

2014-12-01 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230889#comment-14230889
 ] 

Ferdinand Xu commented on HIVE-8900:


Thanks [~spena] for your review. I have updated the patch according to your 
comments and added some inline comments.

> Create encryption testing framework
> ---
>
> Key: HIVE-8900
> URL: https://issues.apache.org/jira/browse/HIVE-8900
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Brock Noland
>Assignee: Ferdinand Xu
> Attachments: HIVE-8065.patch
>
>
> As [mentioned by 
> Alan|https://issues.apache.org/jira/browse/HIVE-8821?focusedCommentId=14215318&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14215318]
>  we already have some q-file tests which fit our needs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 28283: HIVE-8900:Create encryption testing framework

2014-12-01 Thread cheng xu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28283/
---

(Updated Dec. 2, 2014, 2:58 a.m.)


Review request for hive.


Changes
---

Summary:
1. Clean up the code
2. Remove unnecessary variables from HadoopShim
3. Remove the needless template


Repository: hive-git


Description
---

The patch includes:
1. Enable security properties for the secure Hive cluster


Diffs (updated)
-

  .gitignore c5decaf 
  data/scripts/q_test_cleanup_for_encryption.sql PRE-CREATION 
  data/scripts/q_test_init_for_encryption.sql PRE-CREATION 
  itests/qtest/pom.xml 376f4a9 
  itests/src/test/resources/testconfiguration.properties 3ae001d 
  itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 31d5c29 
  ql/src/test/queries/clientpositive/create_encrypted_table.q PRE-CREATION 
  shims/0.20S/src/main/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java 
2e00d93 
  shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java 
8161fc1 
  shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShims.java 
fa66a4a 

Diff: https://reviews.apache.org/r/28283/diff/


Testing
---


Thanks,

cheng xu



[jira] [Commented] (HIVE-8995) Find thread leak in RSC Tests

2014-12-01 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230884#comment-14230884
 ] 

Xuefu Zhang commented on HIVE-8995:
---

[~lirui] SparkSessionManager, a singleton, seems to be a good place to 
initialize and destroy SparkClientFactory. We just need to do this lazily, 
which means we initialize SparkClientFactory only when a SparkSessionManager 
instance is instantiated.
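
(As an illustration of the lazy, once-only initialization being discussed, a hedged 
sketch is below; apart from the SparkClientFactory calls mentioned in the comments, 
the class and method names are hypothetical.)
{code}
// Illustrative only: initialize the factory once, on first use of the singleton,
// and stop it exactly once when the application shuts down.
public final class SparkSessionManagerSketch {
  private static volatile SparkSessionManagerSketch instance;
  private volatile boolean factoryInitialized = false;

  private SparkSessionManagerSketch() {}

  public static SparkSessionManagerSketch getInstance() {
    if (instance == null) {
      synchronized (SparkSessionManagerSketch.class) {
        if (instance == null) {
          instance = new SparkSessionManagerSketch();
        }
      }
    }
    return instance;
  }

  public synchronized void openSession() {
    if (!factoryInitialized) {
      // SparkClientFactory.initialize(...) would be called here, only once.
      factoryInitialized = true;
    }
    // ... create and track the session ...
  }

  public synchronized void shutdown() {
    if (factoryInitialized) {
      // SparkClientFactory.stop() would be called here, only once.
      factoryInitialized = false;
    }
  }
}
{code}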

> Find thread leak in RSC Tests
> -
>
> Key: HIVE-8995
> URL: https://issues.apache.org/jira/browse/HIVE-8995
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Brock Noland
>Assignee: Rui Li
>
> I was regenerating output as part of the merge:
> {noformat}
> mvn test -Dtest=TestSparkCliDriver -Phadoop-2 -Dtest.output.overwrite=true 
> -Dqfile=annotate_stats_join.q,auto_join0.q,auto_join1.q,auto_join10.q,auto_join11.q,auto_join12.q,auto_join13.q,auto_join14.q,auto_join15.q,auto_join16.q,auto_join17.q,auto_join18.q,auto_join18_multi_distinct.q,auto_join19.q,auto_join2.q,auto_join20.q,auto_join21.q,auto_join22.q,auto_join23.q,auto_join24.q,auto_join26.q,auto_join27.q,auto_join28.q,auto_join29.q,auto_join3.q,auto_join30.q,auto_join31.q,auto_join32.q,auto_join9.q,auto_join_reordering_values.q
>  
> auto_join_without_localtask.q,auto_smb_mapjoin_14.q,auto_sortmerge_join_1.q,auto_sortmerge_join_10.q,auto_sortmerge_join_11.q,auto_sortmerge_join_12.q,auto_sortmerge_join_14.q,auto_sortmerge_join_15.q,auto_sortmerge_join_2.q,auto_sortmerge_join_3.q,auto_sortmerge_join_4.q,auto_sortmerge_join_5.q,auto_sortmerge_join_6.q,auto_sortmerge_join_7.q,auto_sortmerge_join_8.q,auto_sortmerge_join_9.q,bucket_map_join_1.q,bucket_map_join_2.q,bucket_map_join_tez1.q,bucket_map_join_tez2.q,bucketmapjoin1.q,bucketmapjoin10.q,bucketmapjoin11.q,bucketmapjoin12.q,bucketmapjoin13.q,bucketmapjoin2.q,bucketmapjoin3.q,bucketmapjoin4.q,bucketmapjoin5.q,bucketmapjoin7.q
>  
> bucketmapjoin8.q,bucketmapjoin9.q,bucketmapjoin_negative.q,bucketmapjoin_negative2.q,bucketmapjoin_negative3.q,column_access_stats.q,cross_join.q,ctas.q,custom_input_output_format.q,groupby4.q,groupby7_noskew_multi_single_reducer.q,groupby_complex_types.q,groupby_complex_types_multi_single_reducer.q,groupby_multi_single_reducer2.q,groupby_multi_single_reducer3.q,groupby_position.q,groupby_sort_1_23.q,groupby_sort_skew_1_23.q,having.q,index_auto_self_join.q,infer_bucket_sort_convert_join.q,innerjoin.q,input12.q,join0.q,join1.q,join11.q,join12.q,join13.q,join14.q,join15.q
>  
> join17.q,join18.q,join18_multi_distinct.q,join19.q,join2.q,join20.q,join21.q,join22.q,join23.q,join25.q,join26.q,join27.q,join28.q,join29.q,join3.q,join30.q,join31.q,join32.q,join32_lessSize.q,join33.q,join35.q,join36.q,join37.q,join38.q,join39.q,join40.q,join41.q,join9.q,join_alt_syntax.q,join_cond_pushdown_1.q
>  
> join_cond_pushdown_2.q,join_cond_pushdown_3.q,join_cond_pushdown_4.q,join_cond_pushdown_unqual1.q,join_cond_pushdown_unqual2.q,join_cond_pushdown_unqual3.q,join_cond_pushdown_unqual4.q,join_filters_overlap.q,join_hive_626.q,join_map_ppr.q,join_merge_multi_expressions.q,join_merging.q,join_nullsafe.q,join_rc.q,join_reorder.q,join_reorder2.q,join_reorder3.q,join_reorder4.q,join_star.q,join_thrift.q,join_vc.q,join_view.q,limit_pushdown.q,load_dyn_part13.q,load_dyn_part14.q,louter_join_ppr.q,mapjoin1.q,mapjoin_decimal.q,mapjoin_distinct.q,mapjoin_filter_on_outerjoin.q
>  
> mapjoin_hook.q,mapjoin_mapjoin.q,mapjoin_memcheck.q,mapjoin_subquery.q,mapjoin_subquery2.q,mapjoin_test_outer.q,mergejoins.q,mergejoins_mixed.q,multi_insert.q,multi_insert_gby.q,multi_insert_gby2.q,multi_insert_gby3.q,multi_insert_lateral_view.q,multi_insert_mixed.q,multi_insert_move_tasks_share_dependencies.q,multi_join_union.q,optimize_nullscan.q,outer_join_ppr.q,parallel.q,parallel_join0.q,parallel_join1.q,parquet_join.q,pcr.q,ppd_gby_join.q,ppd_join.q,ppd_join2.q,ppd_join3.q,ppd_join4.q,ppd_join5.q,ppd_join_filter.q
>  
> ppd_multi_insert.q,ppd_outer_join1.q,ppd_outer_join2.q,ppd_outer_join3.q,ppd_outer_join4.q,ppd_outer_join5.q,ppd_transform.q,reduce_deduplicate_exclude_join.q,router_join_ppr.q,sample10.q,sample8.q,script_pipe.q,semijoin.q,skewjoin.q,skewjoin_noskew.q,skewjoin_union_remove_1.q,skewjoin_union_remove_2.q,skewjoinopt1.q,skewjoinopt10.q,skewjoinopt11.q,skewjoinopt12.q,skewjoinopt13.q,skewjoinopt14.q,skewjoinopt15.q,skewjoinopt16.q,skewjoinopt17.q,skewjoinopt18.q,skewjoinopt19.q,skewjoinopt2.q,skewjoinopt20.q
>  
> skewjoinopt3.q,skewjoinopt4.q,skewjoinopt5.q,skewjoinopt6.q,skewjoinopt7.q,skewjoinopt8.q,skewjoinopt9.q,smb_mapjoin9.q,smb_mapjoin_1.q,smb_mapjoin_10.q,smb_mapjoin_13.q,smb_mapjoin_14.q,smb_mapjoin_15.q,smb_mapjoin_16.q,smb_mapjoin_17.q,smb_mapjoin_2.q,smb_mapjoin_25.q,smb_mapjoin_3.q,smb_mapjoin_4.q,smb_mapjoin_5.q,smb_mapjoin_6.q,smb_mapjoin_7.q,sort_merge_join_desc_1.q,sort

[jira] [Commented] (HIVE-8995) Find thread leak in RSC Tests

2014-12-01 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230872#comment-14230872
 ] 

Rui Li commented on HIVE-8995:
--

Hi [~brocknoland], I checked the code. It seems we initialize 
{{SparkClientFactory}} each time we open a session, and we never stop it. Based 
on the discussion above, I think we should initialize {{SparkClientFactory}} 
only once and stop it when the app shuts down, right? Maybe we can do that in 
{{SparkSessionManager}}; what do you think?

> Find thread leak in RSC Tests
> -
>
> Key: HIVE-8995
> URL: https://issues.apache.org/jira/browse/HIVE-8995
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Brock Noland
>Assignee: Rui Li
>
> I was regenerating output as part of the merge:
> {noformat}
> mvn test -Dtest=TestSparkCliDriver -Phadoop-2 -Dtest.output.overwrite=true 
> -Dqfile=annotate_stats_join.q,auto_join0.q,auto_join1.q,auto_join10.q,auto_join11.q,auto_join12.q,auto_join13.q,auto_join14.q,auto_join15.q,auto_join16.q,auto_join17.q,auto_join18.q,auto_join18_multi_distinct.q,auto_join19.q,auto_join2.q,auto_join20.q,auto_join21.q,auto_join22.q,auto_join23.q,auto_join24.q,auto_join26.q,auto_join27.q,auto_join28.q,auto_join29.q,auto_join3.q,auto_join30.q,auto_join31.q,auto_join32.q,auto_join9.q,auto_join_reordering_values.q
>  
> auto_join_without_localtask.q,auto_smb_mapjoin_14.q,auto_sortmerge_join_1.q,auto_sortmerge_join_10.q,auto_sortmerge_join_11.q,auto_sortmerge_join_12.q,auto_sortmerge_join_14.q,auto_sortmerge_join_15.q,auto_sortmerge_join_2.q,auto_sortmerge_join_3.q,auto_sortmerge_join_4.q,auto_sortmerge_join_5.q,auto_sortmerge_join_6.q,auto_sortmerge_join_7.q,auto_sortmerge_join_8.q,auto_sortmerge_join_9.q,bucket_map_join_1.q,bucket_map_join_2.q,bucket_map_join_tez1.q,bucket_map_join_tez2.q,bucketmapjoin1.q,bucketmapjoin10.q,bucketmapjoin11.q,bucketmapjoin12.q,bucketmapjoin13.q,bucketmapjoin2.q,bucketmapjoin3.q,bucketmapjoin4.q,bucketmapjoin5.q,bucketmapjoin7.q
>  
> bucketmapjoin8.q,bucketmapjoin9.q,bucketmapjoin_negative.q,bucketmapjoin_negative2.q,bucketmapjoin_negative3.q,column_access_stats.q,cross_join.q,ctas.q,custom_input_output_format.q,groupby4.q,groupby7_noskew_multi_single_reducer.q,groupby_complex_types.q,groupby_complex_types_multi_single_reducer.q,groupby_multi_single_reducer2.q,groupby_multi_single_reducer3.q,groupby_position.q,groupby_sort_1_23.q,groupby_sort_skew_1_23.q,having.q,index_auto_self_join.q,infer_bucket_sort_convert_join.q,innerjoin.q,input12.q,join0.q,join1.q,join11.q,join12.q,join13.q,join14.q,join15.q
>  
> join17.q,join18.q,join18_multi_distinct.q,join19.q,join2.q,join20.q,join21.q,join22.q,join23.q,join25.q,join26.q,join27.q,join28.q,join29.q,join3.q,join30.q,join31.q,join32.q,join32_lessSize.q,join33.q,join35.q,join36.q,join37.q,join38.q,join39.q,join40.q,join41.q,join9.q,join_alt_syntax.q,join_cond_pushdown_1.q
>  
> join_cond_pushdown_2.q,join_cond_pushdown_3.q,join_cond_pushdown_4.q,join_cond_pushdown_unqual1.q,join_cond_pushdown_unqual2.q,join_cond_pushdown_unqual3.q,join_cond_pushdown_unqual4.q,join_filters_overlap.q,join_hive_626.q,join_map_ppr.q,join_merge_multi_expressions.q,join_merging.q,join_nullsafe.q,join_rc.q,join_reorder.q,join_reorder2.q,join_reorder3.q,join_reorder4.q,join_star.q,join_thrift.q,join_vc.q,join_view.q,limit_pushdown.q,load_dyn_part13.q,load_dyn_part14.q,louter_join_ppr.q,mapjoin1.q,mapjoin_decimal.q,mapjoin_distinct.q,mapjoin_filter_on_outerjoin.q
>  
> mapjoin_hook.q,mapjoin_mapjoin.q,mapjoin_memcheck.q,mapjoin_subquery.q,mapjoin_subquery2.q,mapjoin_test_outer.q,mergejoins.q,mergejoins_mixed.q,multi_insert.q,multi_insert_gby.q,multi_insert_gby2.q,multi_insert_gby3.q,multi_insert_lateral_view.q,multi_insert_mixed.q,multi_insert_move_tasks_share_dependencies.q,multi_join_union.q,optimize_nullscan.q,outer_join_ppr.q,parallel.q,parallel_join0.q,parallel_join1.q,parquet_join.q,pcr.q,ppd_gby_join.q,ppd_join.q,ppd_join2.q,ppd_join3.q,ppd_join4.q,ppd_join5.q,ppd_join_filter.q
>  
> ppd_multi_insert.q,ppd_outer_join1.q,ppd_outer_join2.q,ppd_outer_join3.q,ppd_outer_join4.q,ppd_outer_join5.q,ppd_transform.q,reduce_deduplicate_exclude_join.q,router_join_ppr.q,sample10.q,sample8.q,script_pipe.q,semijoin.q,skewjoin.q,skewjoin_noskew.q,skewjoin_union_remove_1.q,skewjoin_union_remove_2.q,skewjoinopt1.q,skewjoinopt10.q,skewjoinopt11.q,skewjoinopt12.q,skewjoinopt13.q,skewjoinopt14.q,skewjoinopt15.q,skewjoinopt16.q,skewjoinopt17.q,skewjoinopt18.q,skewjoinopt19.q,skewjoinopt2.q,skewjoinopt20.q
>  
> skewjoinopt3.q,skewjoinopt4.q,skewjoinopt5.q,skewjoinopt6.q,skewjoinopt7.q,skewjoinopt8.q,skewjoinopt9.q,smb_mapjoin9.q,smb_mapjoin_1.q,smb_mapjoin_10.q,smb_mapjoin_13.q,smb_mapjoin_14.q,smb_mapjoin_15.q,smb_mapjoin_16.q,smb_mapjoin_17.q,smb_mapjoin_2.q,smb_mapjoin_25.q,smb_mapjoin_3.q,smb_mapjo

[jira] [Commented] (HIVE-9001) Ship with log4j.properties file that has a reliable time based rolling policy

2014-12-01 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230871#comment-14230871
 ] 

Sushanth Sowmyan commented on HIVE-9001:


Looks good to me except for one minor fix - in the change to 
data/conf/hive-log4j.properties, you seem to have a comment section that is 
space-aligned to go to the next line rather than having newlines, which makes 
it one long comment line in the patch. I can fix this myself before committing 
if that's okay with you.



> Ship with log4j.properties file that has a reliable time based rolling policy
> -
>
> Key: HIVE-9001
> URL: https://issues.apache.org/jira/browse/HIVE-9001
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-9001.1.patch
>
>
> The hive log gets locked by the hive process and cannot be rolled on Windows.
> Install Hive on Windows, start Hive, and try to rename the hive log while Hive 
> is running.
> Wait until log4j tries to rename it; it will throw the same error because the 
> file is locked by the process.
> The changes in https://issues.apache.org/bugzilla/show_bug.cgi?id=29726 
> should be integrated to Hive for a reliable rollover.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8848) data loading from text files or text file processing doesn't handle nulls correctly

2014-12-01 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230862#comment-14230862
 ] 

Sergey Shelukhin commented on HIVE-8848:


Are HBase failures real?

> data loading from text files or text file processing doesn't handle nulls 
> correctly
> ---
>
> Key: HIVE-8848
> URL: https://issues.apache.org/jira/browse/HIVE-8848
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-8848.01.patch, HIVE-8848.2.patch.txt, 
> HIVE-8848.3.patch.txt, HIVE-8848.patch
>
>
> I am not sure how nulls are supposed to be stored in text tables, but after 
> loading some data with "null" or "NULL" strings, or x00 characters, we get a 
> bunch of annoying logging from LazyPrimitive saying that the data is not in INT 
> format and was converted to null, with the data being "null" (the string 
> "null", I assume from the code).
> Either load should load them as nulls, or there should be some defined way to 
> load nulls.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8848) data loading from text files or text file processing doesn't handle nulls correctly

2014-12-01 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-8848:
---
Assignee: Navis  (was: Sergey Shelukhin)

> data loading from text files or text file processing doesn't handle nulls 
> correctly
> ---
>
> Key: HIVE-8848
> URL: https://issues.apache.org/jira/browse/HIVE-8848
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Navis
> Attachments: HIVE-8848.01.patch, HIVE-8848.2.patch.txt, 
> HIVE-8848.3.patch.txt, HIVE-8848.patch
>
>
> I am not sure how nulls are supposed to be stored in text tables, but after 
> loading some data with "null" or "NULL" strings, or x00 characters, we get a 
> bunch of annoying logging from LazyPrimitive saying that the data is not in INT 
> format and was converted to null, with the data being "null" (the string 
> "null", I assume from the code).
> Either load should load them as nulls, or there should be some defined way to 
> load nulls.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8968) Changing hive metastore to sql server causes an error

2014-12-01 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230855#comment-14230855
 ] 

Sergey Shelukhin commented on HIVE-8968:


This might be an Ambari issue. Let me check...

> Changing hive metastore to sql server causes an error
> -
>
> Key: HIVE-8968
> URL: https://issues.apache.org/jira/browse/HIVE-8968
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 0.13.0
> Environment: HDP 2.1, On CentOS
>Reporter: Colman Madden
>Priority: Minor
> Fix For: 0.14.0, 0.13.1
>
>
> The following script does not cater for the Microsoft JDBC driver:
> /var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/HIVE/package/scripts/params.py
> It only caters for mysql, postgres and oracle drivers as below:
> hive_jdbc_driver = 
> config['configurations']['hive-site']['javax.jdo.option.ConnectionDriverName']
> if hive_jdbc_driver == "com.mysql.jdbc.Driver":
>   jdbc_jar_name = "mysql-connector-java.jar"
> elif hive_jdbc_driver == "org.postgresql.Driver":
>   jdbc_jar_name = "postgresql-jdbc.jar"
> elif hive_jdbc_driver == "oracle.jdbc.driver.OracleDriver":
>   jdbc_jar_name = "ojdbc6.jar"
> We needed to add the following two lines in order to get it to work with 
> the SQL Server JDBC driver:
> elif hive_jdbc_driver == "com.microsoft.sqlserver.jdbc.SQLServerDriver":
>   jdbc_jar_name = "sqljdbc4.jar"
> We have only implemented this on our own cluster and it appears to work 
> without issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: 0.15 release

2014-12-01 Thread Brock Noland
Yes, I'm thinking the same: quarterly releases and branching for 0.15 in
a month or two.

On Mon, Dec 1, 2014 at 2:58 PM, Thejas Nair  wrote:
> Brock,
> When you say more frequent releases, what schedule do you have in mind?
> I think an (approximately) quarterly release cycle would be good.
> We branched for Hive 0.14 on Sept 25, which means we have been adding
> new features not in 0.14 for more than 2 months.
> How about branching for the 0.15 equivalent in another month or two?
> Sometime in Jan?
>
>
>
> On Mon, Dec 1, 2014 at 2:19 PM, Thejas Nair  wrote:
>> +1 .
>> Regarding the next version being 0.15 - I have some thoughts on the
>> versioning of hive. I will start a different thread on that.
>>
>>
>> On Mon, Dec 1, 2014 at 11:43 AM, Brock Noland  wrote:
>>> Hi,
>>>
>>> In 2014 we did two large releases. Thank you very much to the RMs for
>>> pushing those out! I've found that Apache projects gain traction
>>> through releasing often, so I think we should aim to increase the
>>> rate of releases in 2015. (Not that I can complain, since I did not
>>> volunteer to RM any release.)
>>>
>>> As such I'd like to volunteer as RM for the 0.15 release.
>>>
>>> Cheers,
>>> Brock
>
> --
> CONFIDENTIALITY NOTICE
> NOTICE: This message is intended for the use of the individual or entity to
> which it is addressed and may contain information that is confidential,
> privileged and exempt from disclosure under applicable law. If the reader
> of this message is not the intended recipient, you are hereby notified that
> any printing, copying, dissemination, distribution, disclosure or
> forwarding of this communication is strictly prohibited. If you have
> received this communication in error, please contact the sender immediately
> and delete it from your system. Thank You.


[jira] [Commented] (HIVE-8916) Handle user@domain username under LDAP authentication

2014-12-01 Thread Mohit Sabharwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230846#comment-14230846
 ] 

Mohit Sabharwal commented on HIVE-8916:
---

We could add documentation to [Configuration Properties -- 
hive.server2.authentication.ldap.Domain | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.server2.authentication.ldap.Domain]

In case of LDAP authentication, {{hive.server2.authentication.ldap.Domain}}, if 
configured, is appended to the LDAP username passed in the client connection. 
This is because LDAP providers like Active Directory expect a fully qualified 
username that includes the domain.

Starting in 0.15.0 (HIVE-8916), if the username passed in the client connection 
already includes the domain, any value configured in 
{{hive.server2.authentication.ldap.Domain}} is not appended to the username.
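
(A hedged sketch of the "append only when no domain is present" rule is below; it is 
illustrative only and not the actual LdapAuthenticationProviderImpl code.)
{code}
// Illustrative sketch: append the configured domain only if the incoming
// username does not already carry one.
public final class LdapUserNameSketch {
  public static String qualify(String user, String configuredDomain) {
    if (user == null || user.contains("@") || configuredDomain == null
        || configuredDomain.isEmpty()) {
      return user;                     // already qualified, or nothing to append
    }
    return user + "@" + configuredDomain;
  }

  // The "short name" used by the rest of Hive for privilege lookups.
  public static String shortName(String user) {
    int at = user.indexOf('@');
    return at < 0 ? user : user.substring(0, at);
  }

  public static void main(String[] args) {
    System.out.println(qualify("alice", "domain.foo.com"));         // alice@domain.foo.com
    System.out.println(qualify("bob@domain.foo.com", "other.com"));  // unchanged
    System.out.println(shortName("bob@domain.foo.com"));             // bob
  }
}
{code}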

> Handle user@domain username under LDAP authentication
> -
>
> Key: HIVE-8916
> URL: https://issues.apache.org/jira/browse/HIVE-8916
> Project: Hive
>  Issue Type: Bug
>  Components: Authentication
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Fix For: 0.15.0
>
> Attachments: HIVE-8916.2.patch, HIVE-8916.3.patch, HIVE-8916.patch
>
>
> If LDAP is configured with multiple domains for authentication, users can be 
> in different domains.
> Currently, LdapAuthenticationProviderImpl blindly appends the domain 
> configured in "hive.server2.authentication.ldap.Domain" to the username, which 
> limits users to that domain. However, under multi-domain authentication, the 
> username may already include the domain (ex:  u...@domain.foo.com). We should 
> not append a domain if one is already present.
> Also, if the username already includes the domain, the rest of Hive and the 
> authorization providers still expect the "short name" ("user" and not 
> "u...@domain.foo.com") for looking up privilege rules, etc.  As such, any 
> domain info in the username should be stripped off.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8886) Some Vectorized String CONCAT expressions result in runtime error Vectorization: Unsuported vector output type: StringGroup

2014-12-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230839#comment-14230839
 ] 

Hive QA commented on HIVE-8886:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12684488/HIVE-8886.02.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6695 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_aggregate
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mapjoin_mapjoin
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1942/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1942/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1942/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12684488 - PreCommit-HIVE-TRUNK-Build

> Some Vectorized String CONCAT expressions result in runtime error 
> Vectorization: Unsuported vector output type: StringGroup
> ---
>
> Key: HIVE-8886
> URL: https://issues.apache.org/jira/browse/HIVE-8886
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 0.14.1
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 0.14.1
>
> Attachments: HIVE-8886.01.patch, HIVE-8886.02.patch
>
>
> {noformat}
> SELECT CONCAT(CONCAT(CONCAT('Quarter ',CAST(CAST((MONTH(dt) - 1) / 3 + 1 AS 
> INT) AS STRING)),'-'),CAST(YEAR(dt) AS STRING)) AS `field`
> FROM vectortab2korc 
> GROUP BY CONCAT(CONCAT(CONCAT('Quarter ',CAST(CAST((MONTH(dt) - 1) / 3 + 
> 1 AS INT) AS STRING)),'-'),CAST(YEAR(dt) AS STRING))
> LIMIT 50;
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 28283: HIVE-8900:Create encryption testing framework

2014-12-01 Thread cheng xu


> On Dec. 1, 2014, 9:35 p.m., Sergio Pena wrote:
> >

Thanks for your review. Please see my inline comments.


> On Dec. 1, 2014, 9:35 p.m., Sergio Pena wrote:
> > itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java, line 268
> > 
> >
> > Do we need to set this value? For what I know, AES/CTR/NoPadding is the 
> > only cipher mode that HDFS supports.

Yes, you are right. We can remove it at this point. I added the setter here just 
in case more ciphers are supported later.


> On Dec. 1, 2014, 9:35 p.m., Sergio Pena wrote:
> > itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java, line 365
> > 
> >
> > I think this method 'in itEncryptionRelatedConfIfNeeded()' can be 
> > called inside the block line 370
> > as it is only called when clusterType is encrypted. Also, we may rename 
> > the method for a shorter name as IfNeeded won't be used.

I am afraid not, since the initialization of dfs needs the security-related 
properties. To clean up the code, I made a change in this snippet.


> On Dec. 1, 2014, 9:35 p.m., Sergio Pena wrote:
> > itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java, line 372
> > 
> >
> > What if we move this line inside initEncryptionConf()? It is part of 
> > encryption initialization.

What initEncryptionConf does is set the security-related properties. Another, 
bigger consideration is that the fs needs the security-related configuration, so 
we have to complete the configuration setup before initializing dfs or hes.


> On Dec. 1, 2014, 9:35 p.m., Sergio Pena wrote:
> > itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java, line 754
> > 
> >
> > - May we rename this method so that it starts with the 'init' verb? This 
> > is just a good practice I've learned in order
> >   to read code much better. Also, IfNeeded() is the correct syntax.
> > - We could also get rid of the IfNeeded() word (making the name 
> > shorter) if we add the validation when this method
> >   is called instead of inside the method. It is just an opinion.

Thanks for your suggestion. FIXED it.


> On Dec. 1, 2014, 9:35 p.m., Sergio Pena wrote:
> > itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java, line 785
> > 
> >
> > Just to comment that AES-256 can be used only if JCE is installed in 
> > your environment. Otherwise, any encryption
> >   with this key will fail. Keys can be created, but when you try to 
> > encrypt something, it fails. We should put a 
> >   comment here so that other developers know this.

FIXED
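
(For reference, a minimal check for whether the unlimited-strength JCE policy needed 
for AES-256 is present could look like the sketch below; this is only an illustration 
using the standard javax.crypto API, not code from the patch.)
{code}
import java.security.NoSuchAlgorithmException;
import javax.crypto.Cipher;

public class JcePolicyCheck {
  public static void main(String[] args) throws NoSuchAlgorithmException {
    // With the default JDK policy this typically returns 128; with the
    // unlimited-strength policy installed it returns Integer.MAX_VALUE.
    int maxKeyLen = Cipher.getMaxAllowedKeyLength("AES");
    if (maxKeyLen < 256) {
      System.out.println("AES-256 not usable: unlimited JCE policy not installed");
    } else {
      System.out.println("AES-256 usable (max key length: " + maxKeyLen + ")");
    }
  }
}
{code}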


> On Dec. 1, 2014, 9:35 p.m., Sergio Pena wrote:
> > shims/0.20/src/main/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java, 
> > line 872
> > 
> >
> > I think we should leave the 'hadoop.encryption.is.not.supported' key 
> > name on unsupported hadoop versions. This was left only as a comment for 
> > developers. Nobody will use this configuration key anyways.

FIXED


> On Dec. 1, 2014, 9:35 p.m., Sergio Pena wrote:
> > shims/0.20S/src/main/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java, 
> > line 497
> > 
> >
> > I think we should leave the 'hadoop.encryption.is.not.supported' key 
> > name on unsupported hadoop versions. This was left only as a comment for 
> > developers. Nobody will use this configuration key anyways.

FIXED


> On Dec. 1, 2014, 9:35 p.m., Sergio Pena wrote:
> > shims/0.20S/src/main/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java, 
> > lines 498-499
> > 
> >
> > Do we need these two configuration values in the configuration 
> > environment? These are used only for test purposes on QTestUtil. The user 
> > won't use these fields on hive-site.xml ever. Or not yet.

FIXED


> On Dec. 1, 2014, 9:35 p.m., Sergio Pena wrote:
> > shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java, 
> > lines 960-970
> > 
> >
> > Why was this block removed? I see the keyProvider variable is 
> > initialized inside getMiniDfs() method (testing). But what will happen with 
> > production code?

We should get the key provider via the NameNode, which has already created a key 
provider. It makes no difference for the KMS, but it does for the Java key 
provider, which stores the key in a file. I dug into the 
code and found that two key pr

[jira] [Commented] (HIVE-8374) schematool fails on Postgres versions < 9.2

2014-12-01 Thread Mohit Sabharwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230826#comment-14230826
 ] 

Mohit Sabharwal commented on HIVE-8374:
---

Thanks, [~leftylev]. Yes, this is a bug fix, so no documentation is needed for it.

The "-dbOpts" option was added for future use. However, the option does appear 
in the schematool usage output:
{code}
 -dbOpts  Backend DB specific options
{code}
I'd just document it when we add options for this arg. What do you think?

> schematool fails on Postgres versions < 9.2
> ---
>
> Key: HIVE-8374
> URL: https://issues.apache.org/jira/browse/HIVE-8374
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Fix For: 0.15.0, 0.14.1
>
> Attachments: HIVE-8374.1.patch, HIVE-8374.2.patch, HIVE-8374.3.patch, 
> HIVE-8374.patch
>
>
> The upgrade script for HIVE-5700 creates an UDF with language 'plpgsql',
> which is available by default only for Postgres 9.2+.
> For older Postgres versions, the language must be explicitly created,
> otherwise schematool fails with the error:
> {code}
> Error: ERROR: language "plpgsql" does not exist
>   Hint: Use CREATE LANGUAGE to load the language into the database. 
> (state=42704,code=0)
> {code}
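
(For older Postgres servers, a workaround is to create the language before running 
the upgrade script; the JDBC sketch below is illustrative only, with placeholder 
connection details and no handling of the "language already exists" case.)
{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

// Illustrative sketch: on Postgres < 9.2, create the plpgsql language before
// running the metastore upgrade script. The URL and credentials are placeholders.
public class CreatePlpgsqlIfMissing {
  public static void main(String[] args) throws Exception {
    String url = "jdbc:postgresql://localhost:5432/metastore"; // placeholder
    try (Connection c = DriverManager.getConnection(url, "hiveuser", "hivepass");
         Statement s = c.createStatement()) {
      int major = c.getMetaData().getDatabaseMajorVersion();
      int minor = c.getMetaData().getDatabaseMinorVersion();
      if (major < 9 || (major == 9 && minor < 2)) {
        s.execute("CREATE LANGUAGE plpgsql");
      }
    }
  }
}
{code}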



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8990) mapjoin_mapjoin.q is failing on Tez (missed golden file update)

2014-12-01 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230825#comment-14230825
 ] 

Szehon Ho commented on HIVE-8990:
-

+1

> mapjoin_mapjoin.q is failing on Tez (missed golden file update)
> ---
>
> Key: HIVE-8990
> URL: https://issues.apache.org/jira/browse/HIVE-8990
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Attachments: HIVE-8990.1.patch
>
>
> mapjoin_mapjoin.q was updated (SORT_BEFORE_DIFF). However, since the Tez tests 
> were stuck, the accompanying update to the golden file was missed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8774) CBO: enable groupBy index

2014-12-01 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-8774:
--
Attachment: HIVE-8774.11.patch

Address [~jpullokkaran]'s comments: remove support for constants and functions 
inside the parameters of count.

> CBO: enable groupBy index
> -
>
> Key: HIVE-8774
> URL: https://issues.apache.org/jira/browse/HIVE-8774
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-8774.1.patch, HIVE-8774.10.patch, 
> HIVE-8774.11.patch, HIVE-8774.2.patch, HIVE-8774.3.patch, HIVE-8774.4.patch, 
> HIVE-8774.5.patch, HIVE-8774.6.patch, HIVE-8774.7.patch, HIVE-8774.8.patch, 
> HIVE-8774.9.patch
>
>
> Right now, even when a groupby index is built, CBO is not able to use it. In 
> this patch, we are trying to make it use the groupby index that we build.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8774) CBO: enable groupBy index

2014-12-01 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-8774:
--
Status: Patch Available  (was: Open)

> CBO: enable groupBy index
> -
>
> Key: HIVE-8774
> URL: https://issues.apache.org/jira/browse/HIVE-8774
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-8774.1.patch, HIVE-8774.10.patch, 
> HIVE-8774.11.patch, HIVE-8774.2.patch, HIVE-8774.3.patch, HIVE-8774.4.patch, 
> HIVE-8774.5.patch, HIVE-8774.6.patch, HIVE-8774.7.patch, HIVE-8774.8.patch, 
> HIVE-8774.9.patch
>
>
> Right now, even when a groupby index is built, CBO is not able to use it. In 
> this patch, we are trying to make it use the groupby index that we build.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8774) CBO: enable groupBy index

2014-12-01 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-8774:
--
Status: Open  (was: Patch Available)

> CBO: enable groupBy index
> -
>
> Key: HIVE-8774
> URL: https://issues.apache.org/jira/browse/HIVE-8774
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-8774.1.patch, HIVE-8774.10.patch, 
> HIVE-8774.2.patch, HIVE-8774.3.patch, HIVE-8774.4.patch, HIVE-8774.5.patch, 
> HIVE-8774.6.patch, HIVE-8774.7.patch, HIVE-8774.8.patch, HIVE-8774.9.patch
>
>
> Right now, even when a groupby index is built, CBO is not able to use it. In 
> this patch, we are trying to make it use the groupby index that we build.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8995) Find thread leak in RSC Tests

2014-12-01 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230812#comment-14230812
 ] 

Rui Li commented on HIVE-8995:
--

OK I'll have a look.

> Find thread leak in RSC Tests
> -
>
> Key: HIVE-8995
> URL: https://issues.apache.org/jira/browse/HIVE-8995
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Brock Noland
>Assignee: Rui Li
>
> I was regenerating output as part of the merge:
> {noformat}
> mvn test -Dtest=TestSparkCliDriver -Phadoop-2 -Dtest.output.overwrite=true 
> -Dqfile=annotate_stats_join.q,auto_join0.q,auto_join1.q,auto_join10.q,auto_join11.q,auto_join12.q,auto_join13.q,auto_join14.q,auto_join15.q,auto_join16.q,auto_join17.q,auto_join18.q,auto_join18_multi_distinct.q,auto_join19.q,auto_join2.q,auto_join20.q,auto_join21.q,auto_join22.q,auto_join23.q,auto_join24.q,auto_join26.q,auto_join27.q,auto_join28.q,auto_join29.q,auto_join3.q,auto_join30.q,auto_join31.q,auto_join32.q,auto_join9.q,auto_join_reordering_values.q
>  
> auto_join_without_localtask.q,auto_smb_mapjoin_14.q,auto_sortmerge_join_1.q,auto_sortmerge_join_10.q,auto_sortmerge_join_11.q,auto_sortmerge_join_12.q,auto_sortmerge_join_14.q,auto_sortmerge_join_15.q,auto_sortmerge_join_2.q,auto_sortmerge_join_3.q,auto_sortmerge_join_4.q,auto_sortmerge_join_5.q,auto_sortmerge_join_6.q,auto_sortmerge_join_7.q,auto_sortmerge_join_8.q,auto_sortmerge_join_9.q,bucket_map_join_1.q,bucket_map_join_2.q,bucket_map_join_tez1.q,bucket_map_join_tez2.q,bucketmapjoin1.q,bucketmapjoin10.q,bucketmapjoin11.q,bucketmapjoin12.q,bucketmapjoin13.q,bucketmapjoin2.q,bucketmapjoin3.q,bucketmapjoin4.q,bucketmapjoin5.q,bucketmapjoin7.q
>  
> bucketmapjoin8.q,bucketmapjoin9.q,bucketmapjoin_negative.q,bucketmapjoin_negative2.q,bucketmapjoin_negative3.q,column_access_stats.q,cross_join.q,ctas.q,custom_input_output_format.q,groupby4.q,groupby7_noskew_multi_single_reducer.q,groupby_complex_types.q,groupby_complex_types_multi_single_reducer.q,groupby_multi_single_reducer2.q,groupby_multi_single_reducer3.q,groupby_position.q,groupby_sort_1_23.q,groupby_sort_skew_1_23.q,having.q,index_auto_self_join.q,infer_bucket_sort_convert_join.q,innerjoin.q,input12.q,join0.q,join1.q,join11.q,join12.q,join13.q,join14.q,join15.q
>  
> join17.q,join18.q,join18_multi_distinct.q,join19.q,join2.q,join20.q,join21.q,join22.q,join23.q,join25.q,join26.q,join27.q,join28.q,join29.q,join3.q,join30.q,join31.q,join32.q,join32_lessSize.q,join33.q,join35.q,join36.q,join37.q,join38.q,join39.q,join40.q,join41.q,join9.q,join_alt_syntax.q,join_cond_pushdown_1.q
>  
> join_cond_pushdown_2.q,join_cond_pushdown_3.q,join_cond_pushdown_4.q,join_cond_pushdown_unqual1.q,join_cond_pushdown_unqual2.q,join_cond_pushdown_unqual3.q,join_cond_pushdown_unqual4.q,join_filters_overlap.q,join_hive_626.q,join_map_ppr.q,join_merge_multi_expressions.q,join_merging.q,join_nullsafe.q,join_rc.q,join_reorder.q,join_reorder2.q,join_reorder3.q,join_reorder4.q,join_star.q,join_thrift.q,join_vc.q,join_view.q,limit_pushdown.q,load_dyn_part13.q,load_dyn_part14.q,louter_join_ppr.q,mapjoin1.q,mapjoin_decimal.q,mapjoin_distinct.q,mapjoin_filter_on_outerjoin.q
>  
> mapjoin_hook.q,mapjoin_mapjoin.q,mapjoin_memcheck.q,mapjoin_subquery.q,mapjoin_subquery2.q,mapjoin_test_outer.q,mergejoins.q,mergejoins_mixed.q,multi_insert.q,multi_insert_gby.q,multi_insert_gby2.q,multi_insert_gby3.q,multi_insert_lateral_view.q,multi_insert_mixed.q,multi_insert_move_tasks_share_dependencies.q,multi_join_union.q,optimize_nullscan.q,outer_join_ppr.q,parallel.q,parallel_join0.q,parallel_join1.q,parquet_join.q,pcr.q,ppd_gby_join.q,ppd_join.q,ppd_join2.q,ppd_join3.q,ppd_join4.q,ppd_join5.q,ppd_join_filter.q
>  
> ppd_multi_insert.q,ppd_outer_join1.q,ppd_outer_join2.q,ppd_outer_join3.q,ppd_outer_join4.q,ppd_outer_join5.q,ppd_transform.q,reduce_deduplicate_exclude_join.q,router_join_ppr.q,sample10.q,sample8.q,script_pipe.q,semijoin.q,skewjoin.q,skewjoin_noskew.q,skewjoin_union_remove_1.q,skewjoin_union_remove_2.q,skewjoinopt1.q,skewjoinopt10.q,skewjoinopt11.q,skewjoinopt12.q,skewjoinopt13.q,skewjoinopt14.q,skewjoinopt15.q,skewjoinopt16.q,skewjoinopt17.q,skewjoinopt18.q,skewjoinopt19.q,skewjoinopt2.q,skewjoinopt20.q
>  
> skewjoinopt3.q,skewjoinopt4.q,skewjoinopt5.q,skewjoinopt6.q,skewjoinopt7.q,skewjoinopt8.q,skewjoinopt9.q,smb_mapjoin9.q,smb_mapjoin_1.q,smb_mapjoin_10.q,smb_mapjoin_13.q,smb_mapjoin_14.q,smb_mapjoin_15.q,smb_mapjoin_16.q,smb_mapjoin_17.q,smb_mapjoin_2.q,smb_mapjoin_25.q,smb_mapjoin_3.q,smb_mapjoin_4.q,smb_mapjoin_5.q,smb_mapjoin_6.q,smb_mapjoin_7.q,sort_merge_join_desc_1.q,sort_merge_join_desc_2.q,sort_merge_join_desc_3.q,sort_merge_join_desc_4.q,sort_merge_join_desc_5.q,sort_merge_join_desc_6.q,sort_merge_join_desc_7.q,sort_merge_join_desc_8.q
>  
> stats1.q,subquery_in.q,subquery_multiinsert.q,table_access

[jira] [Commented] (HIVE-8991) Fix custom_input_output_format [Spark Branch]

2014-12-01 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230811#comment-14230811
 ] 

Rui Li commented on HIVE-8991:
--

Hi [~vanzin], just as [~xuefuz] said, this JIRA is only meant to fix the test 
{{custom_input_output_format.q}} after we enable unit tests with remote spark 
context. Please feel free to take it if you think of a better solution. Thanks!

> Fix custom_input_output_format [Spark Branch]
> -
>
> Key: HIVE-8991
> URL: https://issues.apache.org/jira/browse/HIVE-8991
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-8991.1-spark.patch
>
>
> After HIVE-8836, {{custom_input_output_format}} fails because of missing 
> hive-it-util in remote driver's class path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8943) Fix memory limit check for combine nested mapjoins [Spark Branch]

2014-12-01 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-8943:

Attachment: (was: HIVE-8943-4.spark.branch)

> Fix memory limit check for combine nested mapjoins [Spark Branch]
> -
>
> Key: HIVE-8943
> URL: https://issues.apache.org/jira/browse/HIVE-8943
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-8943-4.spark.patch, HIVE-8943.1-spark.patch, 
> HIVE-8943.1-spark.patch, HIVE-8943.2-spark.patch, HIVE-8943.3-spark.patch
>
>
> It's the opposite problem of what we thought in HIVE-8701.
> SparkMapJoinOptimizer does combine nested mapjoins into one work due to the 
> removal of the RS for the big table.  So we need to enhance the check to 
> calculate whether all the MapJoins in that work (Spark stage) will fit into 
> memory; otherwise it might overwhelm memory for that particular Spark executor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8943) Fix memory limit check for combine nested mapjoins [Spark Branch]

2014-12-01 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-8943:

Attachment: HIVE-8943-4.spark.patch

> Fix memory limit check for combine nested mapjoins [Spark Branch]
> -
>
> Key: HIVE-8943
> URL: https://issues.apache.org/jira/browse/HIVE-8943
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-8943-4.spark.patch, HIVE-8943.1-spark.patch, 
> HIVE-8943.1-spark.patch, HIVE-8943.2-spark.patch, HIVE-8943.3-spark.patch
>
>
> It's the opposite problem of what we thought in HIVE-8701.
> SparkMapJoinOptimizer does combine nested mapjoins into one work due to the 
> removal of the RS for the big table.  So we need to enhance the check to 
> calculate whether all the MapJoins in that work (Spark stage) will fit into 
> memory; otherwise it might overwhelm memory for that particular Spark executor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8943) Fix memory limit check for combine nested mapjoins [Spark Branch]

2014-12-01 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-8943:

Attachment: HIVE-8943-4.spark.branch

Fix the algorithm and clean up after discussion with Xuefu.  The original code 
too aggressively incorporated connected mapjoins into its size calculation; the 
new code only looks at the big table's connected mapjoins.
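
(A simplified, hypothetical sketch of that kind of size check is below; the names and 
thresholds are illustrative and do not mirror the actual SparkMapJoinOptimizer code.)
{code}
import java.util.List;

// Illustrative only: when several mapjoins are combined into one work, the small
// tables of all of them must fit into executor memory together, so their
// estimated sizes are summed before deciding to convert to a mapjoin.
public class MapJoinMemoryCheckSketch {
  public static boolean fitsInMemory(List<Long> smallTableSizes, long connectedSizes,
                                     long memoryThreshold) {
    long total = connectedSizes;           // sizes already planned into this work
    for (long size : smallTableSizes) {
      total += size;
      if (total > memoryThreshold) {
        return false;                      // combining would overwhelm the executor
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // e.g. two small tables of 300MB and 200MB, 100MB already planned, 512MB limit
    System.out.println(fitsInMemory(java.util.Arrays.asList(300L << 20, 200L << 20),
        100L << 20, 512L << 20));          // false
  }
}
{code}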

> Fix memory limit check for combine nested mapjoins [Spark Branch]
> -
>
> Key: HIVE-8943
> URL: https://issues.apache.org/jira/browse/HIVE-8943
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-8943-4.spark.patch, HIVE-8943.1-spark.patch, 
> HIVE-8943.1-spark.patch, HIVE-8943.2-spark.patch, HIVE-8943.3-spark.patch
>
>
> It's the opposite problem of what we thought in HIVE-8701.
> SparkMapJoinOptimizer does combine nested mapjoins into one work due to the 
> removal of the RS for the big table.  So we need to enhance the check to 
> calculate whether all the MapJoins in that work (Spark stage) will fit into 
> memory; otherwise it might overwhelm memory for that particular Spark executor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 28500: HIVE-8943 : Fix memory limit check for combine nested mapjoins [Spark Branch]

2014-12-01 Thread Szehon Ho

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28500/
---

(Updated Dec. 2, 2014, 1:34 a.m.)


Review request for hive, Chao Sun, Suhas Satish, and Xuefu Zhang.


Changes
---

Fix the algorithm and clean up after discussion with Xuefu.  The original code 
too aggressively incorporated connected mapjoins into its size calculation; the 
new code only looks at the big table's connected mapjoins.


Bugs: HIVE-8943
https://issues.apache.org/jira/browse/HIVE-8943


Repository: hive-git


Description
---

SparkMapJoinOptimizer by default combines nested mapjoins into one work due to 
the removal of the RS for the big table. So we need to enhance the mapjoin check 
to calculate whether all the MapJoins in that work (Spark stage) will fit into 
memory; otherwise it might overwhelm memory for that particular Spark executor.


Diffs (updated)
-

  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java
 819eef1 
  
ql/src/java/org/apache/hadoop/hive/ql/parse/spark/OptimizeSparkProcContext.java 
0c339a5 
  ql/src/test/queries/clientpositive/auto_join_stats.q PRE-CREATION 
  ql/src/test/queries/clientpositive/auto_join_stats2.q PRE-CREATION 
  ql/src/test/results/clientpositive/auto_join_stats.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/auto_join_stats2.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/spark/auto_join_stats.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/spark/auto_join_stats2.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/28500/diff/


Testing
---

Added two unit tests:

1.  auto_join_stats, which sets a memory limit and checks that the algorithm 
does not put more than one mapjoin in one BaseWork
2.  auto_join_stats2, which runs the same query without the memory limit and 
checks that the algorithm puts all mapjoins in one BaseWork because it can.


Thanks,

Szehon Ho



[jira] [Commented] (HIVE-9001) Ship with log4j.properties file that has a reliable time based rolling policy

2014-12-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230774#comment-14230774
 ] 

Hive QA commented on HIVE-9001:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12684492/HIVE-9001.1.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 6695 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_aggregate
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mapjoin_mapjoin
org.apache.hive.hcatalog.streaming.TestStreaming.testEndpointConnection
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Delimited
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1941/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1941/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1941/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12684492 - PreCommit-HIVE-TRUNK-Build

> Ship with log4j.properties file that has a reliable time based rolling policy
> -
>
> Key: HIVE-9001
> URL: https://issues.apache.org/jira/browse/HIVE-9001
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-9001.1.patch
>
>
> The hive log gets locked by the hive process and cannot be rolled on Windows.
> Install Hive on Windows, start Hive, and try to rename the hive log while Hive 
> is running.
> Wait until log4j tries to rename it; it will throw the same error because the 
> file is locked by the process.
> The changes in https://issues.apache.org/bugzilla/show_bug.cgi?id=29726 
> should be integrated to Hive for a reliable rollover.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8135) Pool zookeeper connections

2014-12-01 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230762#comment-14230762
 ] 

Ferdinand Xu commented on HIVE-8135:


I think so; see http://curator.apache.org/curator-recipes/index.html. I will 
take a look into the details.

> Pool zookeeper connections
> --
>
> Key: HIVE-8135
> URL: https://issues.apache.org/jira/browse/HIVE-8135
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Brock Noland
>Assignee: Ferdinand Xu
>
> Today we create a ZK connection per client. We should instead have a 
> connection pool.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-860) Persistent distributed cache

2014-12-01 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-860:
--
Attachment: HIVE-860.4.patch

Reattaching the patch, since the previous run failed with the error "No space left on device (28)".

> Persistent distributed cache
> 
>
> Key: HIVE-860
> URL: https://issues.apache.org/jira/browse/HIVE-860
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.12.0
>Reporter: Zheng Shao
>Assignee: Ferdinand Xu
> Fix For: 0.15.0
>
> Attachments: HIVE-860.1.patch, HIVE-860.2.patch, HIVE-860.2.patch, 
> HIVE-860.3.patch, HIVE-860.4.patch, HIVE-860.4.patch, HIVE-860.patch, 
> HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, 
> HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, 
> HIVE-860.patch, HIVE-860.patch
>
>
> DistributedCache is shared across multiple jobs if the hdfs file name is the 
> same.
> We need to make sure Hive puts the same file into the same location every time 
> and does not overwrite it if the file content is the same.
> We can achieve 2 different results:
> A1. Files added with the same name, timestamp, and md5 in the same session 
> will have a single copy in the distributed cache.
> A2. Files added with the same name, timestamp, and md5 will have a single 
> copy in the distributed cache.
> A2 has a bigger benefit in sharing but may raise the question of when Hive 
> should clean them up in hdfs.
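
(For illustration only, a content-addressed cache path could be derived from the 
file's md5 as in the sketch below; the base directory and naming scheme are 
assumptions, not what the attached patch does.)
{code}
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative only: derive a stable cache location from the file content's md5,
// so identical content always maps to the same path and is not re-uploaded or
// overwritten.
public class PersistentCachePathSketch {
  public static String cachePath(String baseDir, Path localFile)
      throws IOException, NoSuchAlgorithmException {
    MessageDigest md5 = MessageDigest.getInstance("MD5");
    try (InputStream in = Files.newInputStream(localFile)) {
      byte[] buf = new byte[8192];
      int n;
      while ((n = in.read(buf)) > 0) {
        md5.update(buf, 0, n);
      }
    }
    StringBuilder hex = new StringBuilder();
    for (byte b : md5.digest()) {
      hex.append(String.format("%02x", b & 0xff));
    }
    return baseDir + "/" + hex + "/" + localFile.getFileName();
  }

  public static void main(String[] args) throws Exception {
    Path f = Files.createTempFile("udf", ".jar");
    System.out.println(cachePath("/user/hive/.cache", f));
  }
}
{code}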



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9003) Vectorized IF expr broken for the scalar and scalar case

2014-12-01 Thread Matt McCline (JIRA)
Matt McCline created HIVE-9003:
--

 Summary: Vectorized IF expr broken for the scalar and scalar case
 Key: HIVE-9003
 URL: https://issues.apache.org/jira/browse/HIVE-9003
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Affects Versions: 0.14.0
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Fix For: 0.14.1


SELECT IF (bool_col, 'first', 'second') FROM ...

is broken for Vectorization.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9002) union all does not generate correct result for order by and limit

2014-12-01 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230734#comment-14230734
 ] 

Pengcheng Xiong commented on HIVE-9002:
---

Three candidate ways to fix it:
(1) fix it within HiveParser.g
(2) fix it in QB by rewriting
(3) partially revert the patch of 
https://issues.apache.org/jira/browse/HIVE-6189 and use subqueries for union all

[~jpullokkaran], could you please take a look? 

> union all does not generate correct result for order by and limit
> -
>
> Key: HIVE-9002
> URL: https://issues.apache.org/jira/browse/HIVE-9002
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>
> Right now if we have
> select col from A
> union all
> select col from B [Operator]
> it is treated as
> (select col from A)
> union all
> (select col from B [Operator])
> Although it is correct for where, group by (having) join operators, it is not 
> correct for order by and limit operators. They should be
> (select col from A
> union all
> select col from B) [order by, limit]
> For order by, we can refer to MySQL, Oracle, DB2
> mysql
> http://dev.mysql.com/doc/refman/5.1/en/union.html
> oracle
> https://docs.oracle.com/cd/E17952_01/refman-5.0-en/union.html
> ibm
> http://www-01.ibm.com/support/knowledgecenter/ssw_i5_54/sqlp/rbafykeyu.htm



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9002) union all does not generate correct result for order by and limit

2014-12-01 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-9002:
--
Description: 
Right now if we have
select col from A
union all
select col from B [Operator]

it is treated as

(select col from A)
union all
(select col from B [Operator])

Although it is correct for where, group by (having) join operators, it is not 
correct for order by and limit operators. They should be

(select col from A
union all
select col from B) [order by, limit]

For order by, we can refer to MySQL, Oracle, DB2

mysql

http://dev.mysql.com/doc/refman/5.1/en/union.html

oracle

https://docs.oracle.com/cd/E17952_01/refman-5.0-en/union.html

ibm

http://www-01.ibm.com/support/knowledgecenter/ssw_i5_54/sqlp/rbafykeyu.htm


  was:
Right now if we have
select col from A
union all
select col from B [Operator]

it is treated as

(select col from A)
union all
(select col from B [Operator])

Although it is correct for where, group by (having) join operators, it is not 
correct for order by and limit operators. They should be

(select col from A
union all
select col from B) [order by, limit]



> union all does not generate correct result for order by and limit
> -
>
> Key: HIVE-9002
> URL: https://issues.apache.org/jira/browse/HIVE-9002
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>
> Right now if we have
> select col from A
> union all
> select col from B [Operator]
> it is treated as
> (select col from A)
> union all
> (select col from B [Operator])
> Although it is correct for where, group by (having) join operators, it is not 
> correct for order by and limit operators. They should be
> (select col from A
> union all
> select col from B) [order by, limit]
> For order by, we can refer to MySQL, Oracle, DB2
> mysql
> http://dev.mysql.com/doc/refman/5.1/en/union.html
> oracle
> https://docs.oracle.com/cd/E17952_01/refman-5.0-en/union.html
> ibm
> http://www-01.ibm.com/support/knowledgecenter/ssw_i5_54/sqlp/rbafykeyu.htm



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9002) union all does not generate correct result for order by and limit

2014-12-01 Thread Pengcheng Xiong (JIRA)
Pengcheng Xiong created HIVE-9002:
-

 Summary: union all does not generate correct result for order by 
and limit
 Key: HIVE-9002
 URL: https://issues.apache.org/jira/browse/HIVE-9002
 Project: Hive
  Issue Type: Bug
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong


Right now if we have
select col from A
union all
select col from B [Operator]

it is treated as

(select col from A)
union all
(select col from B [Operator])

Although it is correct for where, group by (having) join operators, it is not 
correct for order by and limit operators. They should be

(select col from A
union all
select col from B) [order by, limit]




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8947) HIVE-8876 also affects Postgres < 9.2

2014-12-01 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230726#comment-14230726
 ] 

Sergey Shelukhin commented on HIVE-8947:


[~vikram.dixit] ok for 14.1?

> HIVE-8876 also affects Postgres < 9.2
> -
>
> Key: HIVE-8947
> URL: https://issues.apache.org/jira/browse/HIVE-8947
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.15.0
>
> Attachments: HIVE-8947.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8974) Upgrade to Calcite 1.0.0-SNAPSHOT (with lots of renames)

2014-12-01 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230718#comment-14230718
 ] 

Laljo John Pullokkaran commented on HIVE-8974:
--

The failure may be due to QA not clearing the local mvn repo.
I have updated your bug description (which should prompt the QA run to clear the cache).

> Upgrade to Calcite 1.0.0-SNAPSHOT (with lots of renames)
> 
>
> Key: HIVE-8974
> URL: https://issues.apache.org/jira/browse/HIVE-8974
> Project: Hive
>  Issue Type: Task
>Affects Versions: 0.15.0
>Reporter: Julian Hyde
>Assignee: Jesus Camacho Rodriguez
> Fix For: 0.15.0
>
> Attachments: HIVE-8974.01.patch, HIVE-8974.patch
>
>
> CLEAR LIBRARY CACHE
> Calcite recently (after 0.9.2, before 1.0.0) re-organized its package 
> structure and renamed a lot of classes. CALCITE-296 has the details, 
> including a description of the before:after mapping.
> This task is to upgrade to the version of Calcite that has the renamed 
> packages. There is a 1.0.0-SNAPSHOT in Apache nexus.
> Calcite functionality has not changed significantly, so it should be 
> straightforward to rename. This task should be completed ASAP, before Calcite 
> moves on.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8974) Upgrade to Calcite 1.0.0-SNAPSHOT (with lots of renames)

2014-12-01 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-8974:
-
Description: 
CLEAR LIBRARY CACHE

Calcite recently (after 0.9.2, before 1.0.0) re-organized its package structure 
and renamed a lot of classes. CALCITE-296 has the details, including a 
description of the before:after mapping.

This task is to upgrade to the version of Calcite that has the renamed 
packages. There is a 1.0.0-SNAPSHOT in Apache nexus.

Calcite functionality has not changed significantly, so it should be 
straightforward to rename. This task should be completed ASAP, before Calcite 
moves on.

  was:
Calcite recently (after 0.9.2, before 1.0.0) re-organized its package structure 
and renamed a lot of classes. CALCITE-296 has the details, including a 
description of the before:after mapping.

This task is to upgrade to the version of Calcite that has the renamed 
packages. There is a 1.0.0-SNAPSHOT in Apache nexus.

Calcite functionality has not changed significantly, so it should be 
straightforward to rename. This task should be completed ASAP, before Calcite 
moves on.


> Upgrade to Calcite 1.0.0-SNAPSHOT (with lots of renames)
> 
>
> Key: HIVE-8974
> URL: https://issues.apache.org/jira/browse/HIVE-8974
> Project: Hive
>  Issue Type: Task
>Affects Versions: 0.15.0
>Reporter: Julian Hyde
>Assignee: Jesus Camacho Rodriguez
> Fix For: 0.15.0
>
> Attachments: HIVE-8974.01.patch, HIVE-8974.patch
>
>
> CLEAR LIBRARY CACHE
> Calcite recently (after 0.9.2, before 1.0.0) re-organized its package 
> structure and renamed a lot of classes. CALCITE-296 has the details, 
> including a description of the before:after mapping.
> This task is to upgrade to the version of Calcite that has the renamed 
> packages. There is a 1.0.0-SNAPSHOT in Apache nexus.
> Calcite functionality has not changed significantly, so it should be 
> straightforward to rename. This task should be completed ASAP, before Calcite 
> moves on.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8948) TestStreaming is flaky

2014-12-01 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-8948:
-
   Resolution: Fixed
Fix Version/s: 0.15.0
   Status: Resolved  (was: Patch Available)

Patch checked in.  Thanks Eugene for the review.

> TestStreaming is flaky
> --
>
> Key: HIVE-8948
> URL: https://issues.apache.org/jira/browse/HIVE-8948
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Alan Gates
>Assignee: Alan Gates
> Fix For: 0.15.0
>
> Attachments: HIVE-8948.patch
>
>
> TestStreaming seems to fail in one of its tests or another about 1 in 50 
> times.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 28510: HIVE-8974

2014-12-01 Thread John Pullokkaran

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28510/#review63461
---



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveAggregate.java


Why can't we reuse HiveAggregateRel?


- John Pullokkaran


On Nov. 27, 2014, 2:37 p.m., Jesús Camacho Rodríguez wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28510/
> ---
> 
> (Updated Nov. 27, 2014, 2:37 p.m.)
> 
> 
> Review request for hive, John Pullokkaran and Julian Hyde.
> 
> 
> Bugs: HIVE-8974
> https://issues.apache.org/jira/browse/HIVE-8974
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Upgrade to Calcite 1.0.0-SNAPSHOT
> 
> 
> Diffs
> -
> 
>   pom.xml 630b10ce35032e4b2dee50ef3dfe5feb58223b78 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveAggregate.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/HiveDefaultRelMetadataProvider.java
>  e9e052ffe8759fa9c49377c58d41450feee0b126 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/HiveOptiqUtil.java 
> 80f657e9b1e7e9e965e6814ae76de78316367135 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/HiveTypeSystemImpl.java 
> 1bc5a2cfca071ea02a446ae517481f927193f23c 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/OptiqSemanticException.java
>  d2b08fa64b868942b7636df171ed89f0081f7253 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/RelOptHiveTable.java 
> 080d27fa873f071fb2e0f7932ad26819b79d0477 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/TraitsUtil.java 
> 4b44a28ca77540fd643fc03b89dcb4b2155d081a 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/cost/HiveCost.java 
> 72fe5d6f26d0fd9a34c8e89be3040cce4593fd4a 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/cost/HiveCostUtil.java 
> 7436f12f662542c41e71a7fee37179e35e4e2553 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/cost/HiveVolcanoPlanner.java
>  5deb801649f47e0629b3583ef57c62d4a4699f78 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/reloperators/HiveAggregateRel.java
>  fc198958735e12cb3503a0b4c486d8328a10a2fa 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/reloperators/HiveFilterRel.java
>  8b850463ac1c3270163725f876404449ef8dc5f9 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/reloperators/HiveJoinRel.java
>  3d6aa848cd4c83ec8eb22f7df449911d67a53b9b 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/reloperators/HiveLimitRel.java
>  f8755d0175c10e5b5461649773bf44abe998b44e 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/reloperators/HiveProjectRel.java
>  7b434ea58451bef6a6566eb241933843ee855606 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/reloperators/HiveRel.java
>  4738c4ac2d33cd15d2db7fe4b8336e1f59dd5212 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/reloperators/HiveSortRel.java
>  f85363d50c1c3eb9cef39072106057669454d4da 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/reloperators/HiveTableScanRel.java
>  bd66459def099df6432f344a9d8439deef09daa6 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/reloperators/HiveUnionRel.java
>  d34fe9540e239c13f6bd23894056305c0c402e0d 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/rules/HiveMergeProjectRule.java
>  d6581e64fc8ea183666ea6c91397378456461088 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/rules/HivePartitionPrunerRule.java
>  ee19a6cbab0597242214e915745631f76214f70f 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/rules/HivePushFilterPastJoinRule.java
>  1c483eabcc1aa43cc80d7b71e21a4ae4d30a7e12 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/rules/PartitionPruner.java
>  bdc8373877c1684855d256c9d45743f383fc7615 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/stats/FilterSelectivityEstimator.java
>  28bf2ad506656b78894467c30364d751b180676e 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/stats/HiveRelMdDistinctRowCount.java
>  4be57b110c1a45819467d55e8a69e5529989c8f6 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/stats/HiveRelMdRowCount.java
>  8c7f643940b74dd7743635c3eaa046d52d41346f 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/stats/HiveRelMdSelectivity.java
>  49d2ee5a67b72fbf6134ce71de1d7260069cd16f 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/stats/HiveRelMdUniqueKeys.java
>  c3c8bdd2466b0f46d49437fcf8d49dbb689cfcda 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/translator/ASTBuilder.java
>  58320c73aafbfeec025f52ee813b3cfd06fa0821 
>   
> ql/src/java/or

Re: Review Request 27713: CBO: enable groupBy index

2014-12-01 Thread John Pullokkaran

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27713/#review63459
---



ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyCtx.java


I don't think you can allow a function wrapping the index key, since we don't know 
whether the UDF is going to mutate the values (Non Null -> Null, Null -> Non Null).

Example:
select a, count(b) from (select a, (case when a is null then 1 else a end) as b from 
r1) r2 group by a;
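
To make the concern concrete (a hedged illustration, not part of the original 
review; assumes an aggregate index built on r1(a)):

-- count(a) per group can be answered from the index: NULL keys never contribute
select a, count(a) from r1 group by a;

-- wrapping the key in a NULL-mutating UDF changes the answer (rows where a is
-- NULL now get counted), so the index rewrite must not fire here
select a, count(nvl(a, 1)) from r1 group by a;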


- John Pullokkaran


On Dec. 1, 2014, 6:57 p.m., pengcheng xiong wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/27713/
> ---
> 
> (Updated Dec. 1, 2014, 6:57 p.m.)
> 
> 
> Review request for hive and John Pullokkaran.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Right now, even when a groupby index is built, CBO is not able to use it. In 
> this patch, we are trying to make it use the groupby index that we build. The 
> basic problem is that 
> for SEL1-SEL2-GRY-...-SEL3,
> the previous version only modified SEL2, which immediately precedes GRY.
> Now, with CBO, we have lots of SELs, e.g., SEL1.
> So, the solution is to modify all of them.
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyCtx.java 
> 9ffa708 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyProcFactory.java
>  02216de 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteGBUsingIndex.java
>  0f06ec9 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteQueryUsingAggregateIndex.java
>  74614f3 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteQueryUsingAggregateIndexCtx.java
>  d699308 
>   ql/src/test/queries/clientpositive/ql_rewrite_gbtoidx_cbo_1.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/ql_rewrite_gbtoidx_cbo_2.q PRE-CREATION 
>   ql/src/test/results/clientpositive/ql_rewrite_gbtoidx.q.out fdc1dc6 
>   ql/src/test/results/clientpositive/ql_rewrite_gbtoidx_cbo_1.q.out 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/ql_rewrite_gbtoidx_cbo_2.q.out 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/27713/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> pengcheng xiong
> 
>



Re: 0.15 release

2014-12-01 Thread Thejas Nair
Brock,
When you say more frequent releases, what schedule do you have in mind?
I think an (approximately) quarterly release cycle would be good.
We branched for Hive 0.14 on Sept 25, which means we have been adding
new features not in 0.14 for more than 2 months.
How about branching for the 0.15 equivalent in another month or two?
Sometime in Jan?



On Mon, Dec 1, 2014 at 2:19 PM, Thejas Nair  wrote:
> +1 .
> Regarding the next version being 0.15 - I have some thoughts on the
> versioning of hive. I will start a different thread on that.
>
>
> On Mon, Dec 1, 2014 at 11:43 AM, Brock Noland  wrote:
>> Hi,
>>
>> In 2014 we did two large releases. Thank you very much to the RM's for
>> pushing those out! I've found that Apache projects gain traction
>> through releasing often, thus I think we should aim to increase the
>> rate of releases in 2015. (Not that I can complain, since I did not
>> volunteer to RM any release.)
>>
>> As such I'd like to volunteer as RM for the 0.15 release.
>>
>> Cheers,
>> Brock



[jira] [Updated] (HIVE-6421) abs() should preserve precision/scale of decimal input

2014-12-01 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-6421:
-
   Resolution: Fixed
Fix Version/s: 0.15.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks for the review, Ashutosh.

> abs() should preserve precision/scale of decimal input
> --
>
> Key: HIVE-6421
> URL: https://issues.apache.org/jira/browse/HIVE-6421
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Jason Dere
>Assignee: Jason Dere
> Fix For: 0.15.0
>
> Attachments: HIVE-6421.1.txt, HIVE-6421.2.patch, HIVE-6421.3.patch
>
>
> {noformat}
> hive> describe dec1;
> OK
> c1      decimal(10,2)   None 
> hive> explain select c1, abs(c1) from dec1;
>  ...
> Select Operator
>   expressions: c1 (type: decimal(10,2)), abs(c1) (type: 
> decimal(38,18))
> {noformat}
> Given that abs() is a GenericUDF, it should be possible for the return type 
> precision/scale to match the input precision/scale.
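
As a hedged illustration of the intended behaviour (not output from the 
committed patch), the same plan would then be expected to report the input's 
type for the result:

-- expected after the fix: abs(c1) typed as decimal(10,2) rather than decimal(38,18)
explain select c1, abs(c1) from dec1;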



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8981) Not a directory error in mapjoin_hook.q [Spark Branch]

2014-12-01 Thread Chao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230600#comment-14230600
 ] 

Chao commented on HIVE-8981:


[~szehon] Is this issue also happening randomly? What is the failing test? Any 
suggestion on how to reproduce it?

> Not a directory error in mapjoin_hook.q [Spark Branch]
> --
>
> Key: HIVE-8981
> URL: https://issues.apache.org/jira/browse/HIVE-8981
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
> Environment: Using remote-spark context with 
> spark-master=local-cluster [2,2,1024]
>Reporter: Szehon Ho
>Assignee: Chao
>
> Hits the following exception:
> {noformat}
> 2014-11-26 15:17:11,728 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - 14/11/26 15:17:11 WARN TaskSetManager: Lost 
> task 0.0 in stage 8.0 (TID 18, 172.16.3.52): java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Error while trying to 
> create table container
> 2014-11-26 15:17:11,728 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.processRow(SparkMapRecordHandler.java:160)
> 2014-11-26 15:17:11,728 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:47)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:28)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:96)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> scala.collection.Iterator$class.foreach(Iterator.scala:727)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1390)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1390)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.spark.scheduler.Task.run(Task.scala:56)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at java.lang.Thread.run(Thread.java:744)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - Caused by: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Error while trying to 
> create table container
> 2014-11-26 15:17:11,729 I

[jira] [Reopened] (HIVE-8981) Not a directory error in mapjoin_hook.q [Spark Branch]

2014-12-01 Thread Chao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao reopened HIVE-8981:


> Not a directory error in mapjoin_hook.q [Spark Branch]
> --
>
> Key: HIVE-8981
> URL: https://issues.apache.org/jira/browse/HIVE-8981
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
> Environment: Using remote-spark context with 
> spark-master=local-cluster [2,2,1024]
>Reporter: Szehon Ho
>Assignee: Chao
>
> Hits the following exception:
> {noformat}
> 2014-11-26 15:17:11,728 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - 14/11/26 15:17:11 WARN TaskSetManager: Lost 
> task 0.0 in stage 8.0 (TID 18, 172.16.3.52): java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Error while trying to 
> create table container
> 2014-11-26 15:17:11,728 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.processRow(SparkMapRecordHandler.java:160)
> 2014-11-26 15:17:11,728 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:47)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:28)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:96)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> scala.collection.Iterator$class.foreach(Iterator.scala:727)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1390)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1390)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.spark.scheduler.Task.run(Task.scala:56)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at java.lang.Thread.run(Thread.java:744)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - Caused by: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Error while trying to 
> create table container
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.hadoop.hive.ql.exec.spark.HashTableLoader.load(HashTableLoader.java

[jira] [Commented] (HIVE-8982) IndexOutOfBounds exception in mapjoin [Spark Branch]

2014-12-01 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230588#comment-14230588
 ] 

Szehon Ho commented on HIVE-8982:
-

Sorry, this is the "Not a directory" exception that was closed in the other 
JIRA.

> IndexOutOfBounds exception in mapjoin [Spark Branch]
> 
>
> Key: HIVE-8982
> URL: https://issues.apache.org/jira/browse/HIVE-8982
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Szehon Ho
>Assignee: Chao
>
> There are sometimes random failures in spark mapjoin during unit tests like:
> {noformat}
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.closeOp(SparkHashTableSinkOperator.java:83)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:185)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:57)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:108)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1365)
>   at 
> org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1365)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>   at org.apache.spark.scheduler.Task.run(Task.scala:56)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at java.util.ArrayList.rangeCheck(ArrayList.java:635)
>   at java.util.ArrayList.get(ArrayList.java:411)
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinEagerRowContainer.first(MapJoinEagerRowContainer.java:70)
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinEagerRowContainer.write(MapJoinEagerRowContainer.java:150)
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinTableContainerSerDe.persist(MapJoinTableContainerSerDe.java:167)
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.flushToFile(SparkHashTableSinkOperator.java:128)
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.closeOp(SparkHashTableSinkOperator.java:77)
>   ... 20 more
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.closeOp(SparkHashTableSinkOperator.java:83)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:185)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:57)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:108)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115

[GitHub] hive pull request: How to calculate the Kendall coefficient of cor...

2014-12-01 Thread MarcinKosinski
GitHub user MarcinKosinski opened a pull request:

https://github.com/apache/hive/pull/24

How to calculate the Kendall coefficient of correlation of a pair of 
numeric columns within a group?

In this [wiki 
page](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF) 
there is a function `corr()` that calculates the Pearson coefficient of 
correlation, but my question is: is there any function in Hive that can 
calculate the Kendall coefficient of correlation for a pair of numeric 
columns within a group?
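
For context, a minimal HiveQL sketch of the kind of computation I would like to 
avoid writing by hand (hypothetical table t with a group column g, numeric 
columns x and y, and a unique id; assumes no ties in x or y, and is quadratic 
per group, so only practical for modest group sizes):

-- Kendall tau-a per group: (concordant pairs - discordant pairs) / total pairs
select a.g,
       sum(case when sign(a.x - b.x) = sign(a.y - b.y) then 1 else -1 end)
         / count(*) as kendall_tau
from t a join t b on (a.g = b.g)
where a.id < b.id
group by a.g;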

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/apache/hive HIVE-8065

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/24.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #24


commit 1628cb08e0bf1c6b168a9aa7b6f978a943cdc105
Author: Brock Noland 
Date:   2014-11-05T23:38:20Z

Creating branch for HIVE-8065

git-svn-id: 
https://svn.apache.org/repos/asf/hive/branches/HIVE-8065@1637006 
13f79535-47bb-0310-9956-ffa450edef68

commit a9a413d6f4bd7273caf3d26bd4dd2b0d9672d56d
Author: Brock Noland 
Date:   2014-11-14T00:04:08Z

HIVE-8749 - Change Hadoop version on HIVE-8065 to 2.6-SNAPSHOT (Sergio Pena 
via Brock)

git-svn-id: 
https://svn.apache.org/repos/asf/hive/branches/HIVE-8065@1639558 
13f79535-47bb-0310-9956-ffa450edef68

commit b45941d8b64e3b2553034cc6ae212a31084a694d
Author: Brock Noland 
Date:   2014-11-17T22:36:47Z

HIVE-8750 - Commit initial encryption work (Sergio Pena via Brock)

git-svn-id: 
https://svn.apache.org/repos/asf/hive/branches/HIVE-8065@1640247 
13f79535-47bb-0310-9956-ffa450edef68

commit 184cf1ef21d7f9e8ce6b9d39044708d6daf1ffab
Author: Brock Noland 
Date:   2014-11-18T22:51:55Z

HIVE-8904 - Hive should support multiple Key provider modes (Ferdinand Xu 
via Brock)

git-svn-id: 
https://svn.apache.org/repos/asf/hive/branches/HIVE-8065@1640446 
13f79535-47bb-0310-9956-ffa450edef68

commit 61c468250512d7242aa343d59f2a81e3174ea112
Author: Brock Noland 
Date:   2014-11-20T06:10:44Z

HIVE-8919 - Fix FileUtils.copy() method to call distcp only for HDFS files 
(not local files) (Sergio Pena via Brock)

git-svn-id: 
https://svn.apache.org/repos/asf/hive/branches/HIVE-8065@1640684 
13f79535-47bb-0310-9956-ffa450edef68

commit 018b67cadc0dbad64df05819d92b87f2dc5bdaf8
Author: Brock Noland 
Date:   2014-11-21T21:57:23Z

HIVE-8945 - Allow user to read encrypted read-only tables only if the 
scratch directory is encrypted (Sergio Pena via Brock)

git-svn-id: 
https://svn.apache.org/repos/asf/hive/branches/HIVE-8065@1641007 
13f79535-47bb-0310-9956-ffa450edef68




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (HIVE-8982) IndexOutOfBounds exception in mapjoin [Spark Branch]

2014-12-01 Thread Chao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao resolved HIVE-8982.

Resolution: Cannot Reproduce

> IndexOutOfBounds exception in mapjoin [Spark Branch]
> 
>
> Key: HIVE-8982
> URL: https://issues.apache.org/jira/browse/HIVE-8982
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Szehon Ho
>Assignee: Chao
>
> There are sometimes random failures in spark mapjoin during unit tests like:
> {noformat}
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.closeOp(SparkHashTableSinkOperator.java:83)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:185)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:57)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:108)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1365)
>   at 
> org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1365)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>   at org.apache.spark.scheduler.Task.run(Task.scala:56)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at java.util.ArrayList.rangeCheck(ArrayList.java:635)
>   at java.util.ArrayList.get(ArrayList.java:411)
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinEagerRowContainer.first(MapJoinEagerRowContainer.java:70)
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinEagerRowContainer.write(MapJoinEagerRowContainer.java:150)
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinTableContainerSerDe.persist(MapJoinTableContainerSerDe.java:167)
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.flushToFile(SparkHashTableSinkOperator.java:128)
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.closeOp(SparkHashTableSinkOperator.java:77)
>   ... 20 more
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.closeOp(SparkHashTableSinkOperator.java:83)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:185)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:57)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:108)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1365)
>   at 
> org.apache.sp

[jira] [Reopened] (HIVE-8982) IndexOutOfBounds exception in mapjoin [Spark Branch]

2014-12-01 Thread Chao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao reopened HIVE-8982:


> IndexOutOfBounds exception in mapjoin [Spark Branch]
> 
>
> Key: HIVE-8982
> URL: https://issues.apache.org/jira/browse/HIVE-8982
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Szehon Ho
>Assignee: Chao
>
> There are sometimes random failures in spark mapjoin during unit tests like:
> {noformat}
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.closeOp(SparkHashTableSinkOperator.java:83)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:185)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:57)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:108)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1365)
>   at 
> org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1365)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>   at org.apache.spark.scheduler.Task.run(Task.scala:56)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at java.util.ArrayList.rangeCheck(ArrayList.java:635)
>   at java.util.ArrayList.get(ArrayList.java:411)
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinEagerRowContainer.first(MapJoinEagerRowContainer.java:70)
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinEagerRowContainer.write(MapJoinEagerRowContainer.java:150)
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinTableContainerSerDe.persist(MapJoinTableContainerSerDe.java:167)
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.flushToFile(SparkHashTableSinkOperator.java:128)
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.closeOp(SparkHashTableSinkOperator.java:77)
>   ... 20 more
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.closeOp(SparkHashTableSinkOperator.java:83)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:185)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:57)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:108)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1365)
>   at 
> org.apache.spark.SparkContext$$anonfun$30.appl

[jira] [Commented] (HIVE-8982) IndexOutOfBounds exception in mapjoin [Spark Branch]

2014-12-01 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230583#comment-14230583
 ] 

Szehon Ho commented on HIVE-8982:
-

I dug a little and found the exception again here as part of run 464. See 
[http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-464/failed/TestSparkCliDriver-groupby_complex_types.q-auto_join9.q-groupby_map_ppr.q-and-12-more/spark.log|http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-464/failed/TestSparkCliDriver-groupby_complex_types.q-auto_join9.q-groupby_map_ppr.q-and-12-more/spark.log].
I think it's still unresolved.

> IndexOutOfBounds exception in mapjoin [Spark Branch]
> 
>
> Key: HIVE-8982
> URL: https://issues.apache.org/jira/browse/HIVE-8982
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Szehon Ho
>Assignee: Chao
>
> There are sometimes random failures in spark mapjoin during unit tests like:
> {noformat}
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.closeOp(SparkHashTableSinkOperator.java:83)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:185)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:57)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:108)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1365)
>   at 
> org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1365)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>   at org.apache.spark.scheduler.Task.run(Task.scala:56)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at java.util.ArrayList.rangeCheck(ArrayList.java:635)
>   at java.util.ArrayList.get(ArrayList.java:411)
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinEagerRowContainer.first(MapJoinEagerRowContainer.java:70)
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinEagerRowContainer.write(MapJoinEagerRowContainer.java:150)
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinTableContainerSerDe.persist(MapJoinTableContainerSerDe.java:167)
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.flushToFile(SparkHashTableSinkOperator.java:128)
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.closeOp(SparkHashTableSinkOperator.java:77)
>   ... 20 more
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.closeOp(SparkHashTableSinkOperator.java:83)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:185)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:57)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:108)
>   at 
> scala.collection.conve

[jira] [Resolved] (HIVE-8982) IndexOutOfBounds exception in mapjoin [Spark Branch]

2014-12-01 Thread Chao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao resolved HIVE-8982.

Resolution: Cannot Reproduce

> IndexOutOfBounds exception in mapjoin [Spark Branch]
> 
>
> Key: HIVE-8982
> URL: https://issues.apache.org/jira/browse/HIVE-8982
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Szehon Ho
>Assignee: Chao
>
> There are sometimes random failures in spark mapjoin during unit tests like:
> {noformat}
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.closeOp(SparkHashTableSinkOperator.java:83)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:185)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:57)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:108)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1365)
>   at 
> org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1365)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>   at org.apache.spark.scheduler.Task.run(Task.scala:56)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at java.util.ArrayList.rangeCheck(ArrayList.java:635)
>   at java.util.ArrayList.get(ArrayList.java:411)
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinEagerRowContainer.first(MapJoinEagerRowContainer.java:70)
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinEagerRowContainer.write(MapJoinEagerRowContainer.java:150)
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinTableContainerSerDe.persist(MapJoinTableContainerSerDe.java:167)
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.flushToFile(SparkHashTableSinkOperator.java:128)
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.closeOp(SparkHashTableSinkOperator.java:77)
>   ... 20 more
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.closeOp(SparkHashTableSinkOperator.java:83)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:185)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:57)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:108)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1365)
>   at 
> org.apache.sp

[jira] [Updated] (HIVE-9001) Ship with log4j.properties file that has a reliable time based rolling policy

2014-12-01 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-9001:

Attachment: HIVE-9001.1.patch

> Ship with log4j.properties file that has a reliable time based rolling policy
> -
>
> Key: HIVE-9001
> URL: https://issues.apache.org/jira/browse/HIVE-9001
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-9001.1.patch
>
>
> The Hive log gets locked by the Hive process and cannot be rolled on Windows.
> Install Hive on Windows, start Hive, and try to rename the Hive log while Hive 
> is running.
> Wait until log4j tries to rename it; it will throw the same error because the 
> file is locked by the process.
> The changes in https://issues.apache.org/bugzilla/show_bug.cgi?id=29726 
> should be integrated into Hive for a reliable rollover.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9001) Ship with log4j.properties file that has a reliable time based rolling policy

2014-12-01 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-9001:

Attachment: (was: HIVE-9001.1.patch)

> Ship with log4j.properties file that has a reliable time based rolling policy
> -
>
> Key: HIVE-9001
> URL: https://issues.apache.org/jira/browse/HIVE-9001
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-9001.1.patch
>
>
> The Hive log gets locked by the Hive process and cannot be rolled on Windows.
> Install Hive on Windows, start Hive, and try to rename the Hive log while Hive 
> is running.
> Wait until log4j tries to rename it; it will throw the same error because the 
> file is locked by the process.
> The changes in https://issues.apache.org/bugzilla/show_bug.cgi?id=29726 
> should be integrated into Hive for a reliable rollover.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-8982) IndexOutOfBounds exception in mapjoin [Spark Branch]

2014-12-01 Thread Chao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao reassigned HIVE-8982:
--

Assignee: Chao

> IndexOutOfBounds exception in mapjoin [Spark Branch]
> 
>
> Key: HIVE-8982
> URL: https://issues.apache.org/jira/browse/HIVE-8982
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Szehon Ho
>Assignee: Chao
>
> There are sometimes random failures in spark mapjoin during unit tests like:
> {noformat}
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.closeOp(SparkHashTableSinkOperator.java:83)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:185)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:57)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:108)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1365)
>   at 
> org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1365)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>   at org.apache.spark.scheduler.Task.run(Task.scala:56)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at java.util.ArrayList.rangeCheck(ArrayList.java:635)
>   at java.util.ArrayList.get(ArrayList.java:411)
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinEagerRowContainer.first(MapJoinEagerRowContainer.java:70)
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinEagerRowContainer.write(MapJoinEagerRowContainer.java:150)
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinTableContainerSerDe.persist(MapJoinTableContainerSerDe.java:167)
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.flushToFile(SparkHashTableSinkOperator.java:128)
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.closeOp(SparkHashTableSinkOperator.java:77)
>   ... 20 more
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.closeOp(SparkHashTableSinkOperator.java:83)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:185)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:57)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:108)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1365)
>   at 
> org.apache.spark.Spark

[jira] [Commented] (HIVE-8982) IndexOutOfBounds exception in mapjoin [Spark Branch]

2014-12-01 Thread Chao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230572#comment-14230572
 ] 

Chao commented on HIVE-8982:


OK, closing for now.

> IndexOutOfBounds exception in mapjoin [Spark Branch]
> 
>
> Key: HIVE-8982
> URL: https://issues.apache.org/jira/browse/HIVE-8982
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Szehon Ho
>
> There are sometimes random failures in spark mapjoin during unit tests like:
> {noformat}
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.closeOp(SparkHashTableSinkOperator.java:83)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:185)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:57)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:108)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1365)
>   at 
> org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1365)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>   at org.apache.spark.scheduler.Task.run(Task.scala:56)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at java.util.ArrayList.rangeCheck(ArrayList.java:635)
>   at java.util.ArrayList.get(ArrayList.java:411)
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinEagerRowContainer.first(MapJoinEagerRowContainer.java:70)
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinEagerRowContainer.write(MapJoinEagerRowContainer.java:150)
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinTableContainerSerDe.persist(MapJoinTableContainerSerDe.java:167)
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.flushToFile(SparkHashTableSinkOperator.java:128)
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.closeOp(SparkHashTableSinkOperator.java:77)
>   ... 20 more
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.closeOp(SparkHashTableSinkOperator.java:83)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:185)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:57)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:108)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1365)
>   at 

[jira] [Updated] (HIVE-9001) Ship with log4j.properties file that has a reliable time based rolling policy

2014-12-01 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-9001:

Description: 
The hive log gets locked by the hive process and cannot be rolled on Windows OS.
Install Hive on Windows, start hive, and try to rename the hive log while Hive is 
running.
When log4j tries to rename it, it throws the same error because the file is 
locked by the process.

The changes in https://issues.apache.org/bugzilla/show_bug.cgi?id=29726 should 
be integrated into Hive for a reliable rollover.

  was:
The hive log gets locked by the hive process and cannot be rolled on Windows OS.
Install Hive on Windows, start hive, and try to rename the hive log while Hive is 
running.
When log4j tries to rename it, it throws the same error because the file is 
locked by the process.

The changes in https://issues.apache.org/bugzilla/show_bug.cgi?id=29726 should 
be integrated into Hive (Internal as well as trunk) for a reliable rollover.


> Ship with log4j.properties file that has a reliable time based rolling policy
> -
>
> Key: HIVE-9001
> URL: https://issues.apache.org/jira/browse/HIVE-9001
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-9001.1.patch
>
>
> The hive log gets locked by the hive process and cannot be rolled on Windows 
> OS.
> Install Hive on Windows, start hive, and try to rename the hive log while 
> Hive is running.
> When log4j tries to rename it, it throws the same error because the file is 
> locked by the process.
> The changes in https://issues.apache.org/bugzilla/show_bug.cgi?id=29726 
> should be integrated into Hive for a reliable rollover.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9001) Ship with log4j.properties file that has a reliable time based rolling policy

2014-12-01 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-9001:

Status: Patch Available  (was: Open)

> Ship with log4j.properties file that has a reliable time based rolling policy
> -
>
> Key: HIVE-9001
> URL: https://issues.apache.org/jira/browse/HIVE-9001
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-9001.1.patch
>
>
> The hive log gets locked by the hive process and cannot be rolled on Windows 
> OS.
> Install Hive on Windows, start hive, and try to rename the hive log while 
> Hive is running.
> When log4j tries to rename it, it throws the same error because the file is 
> locked by the process.
> The changes in https://issues.apache.org/bugzilla/show_bug.cgi?id=29726 
> should be integrated into Hive for a reliable rollover.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8886) Some Vectorized String CONCAT expressions result in runtime error Vectorization: Unsuported vector output type: StringGroup

2014-12-01 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8886:
---
Status: Patch Available  (was: In Progress)

> Some Vectorized String CONCAT expressions result in runtime error 
> Vectorization: Unsuported vector output type: StringGroup
> ---
>
> Key: HIVE-8886
> URL: https://issues.apache.org/jira/browse/HIVE-8886
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 0.14.1
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 0.14.1
>
> Attachments: HIVE-8886.01.patch, HIVE-8886.02.patch
>
>
> {noformat}
> SELECT CONCAT(CONCAT(CONCAT('Quarter ',CAST(CAST((MONTH(dt) - 1) / 3 + 1 AS 
> INT) AS STRING)),'-'),CAST(YEAR(dt) AS STRING)) AS `field`
> FROM vectortab2korc 
> GROUP BY CONCAT(CONCAT(CONCAT('Quarter ',CAST(CAST((MONTH(dt) - 1) / 3 + 
> 1 AS INT) AS STRING)),'-'),CAST(YEAR(dt) AS STRING))
> LIMIT 50;
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8982) IndexOutOfBounds exception in mapjoin [Spark Branch]

2014-12-01 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230568#comment-14230568
 ] 

Xuefu Zhang commented on HIVE-8982:
---

It doesn't seem they are happening any more. Feel free to close this.

> IndexOutOfBounds exception in mapjoin [Spark Branch]
> 
>
> Key: HIVE-8982
> URL: https://issues.apache.org/jira/browse/HIVE-8982
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Szehon Ho
>
> There are sometimes random failures in spark mapjoin during unit tests like:
> {noformat}
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.closeOp(SparkHashTableSinkOperator.java:83)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:185)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:57)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:108)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1365)
>   at 
> org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1365)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>   at org.apache.spark.scheduler.Task.run(Task.scala:56)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at java.util.ArrayList.rangeCheck(ArrayList.java:635)
>   at java.util.ArrayList.get(ArrayList.java:411)
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinEagerRowContainer.first(MapJoinEagerRowContainer.java:70)
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinEagerRowContainer.write(MapJoinEagerRowContainer.java:150)
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinTableContainerSerDe.persist(MapJoinTableContainerSerDe.java:167)
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.flushToFile(SparkHashTableSinkOperator.java:128)
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.closeOp(SparkHashTableSinkOperator.java:77)
>   ... 20 more
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.closeOp(SparkHashTableSinkOperator.java:83)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:185)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:57)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:108)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.Spar

[jira] [Updated] (HIVE-9001) Ship with log4j.properties file that has a reliable time based rolling policy

2014-12-01 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-9001:

Attachment: HIVE-9001.1.patch

cc-ing [~sushanth] for reviewing this change.

> Ship with log4j.properties file that has a reliable time based rolling policy
> -
>
> Key: HIVE-9001
> URL: https://issues.apache.org/jira/browse/HIVE-9001
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-9001.1.patch
>
>
> The hive log gets locked by the hive process and cannot be rolled on Windows 
> OS.
> Install Hive on Windows, start hive, and try to rename the hive log while 
> Hive is running.
> When log4j tries to rename it, it throws the same error because the file is 
> locked by the process.
> The changes in https://issues.apache.org/bugzilla/show_bug.cgi?id=29726 
> should be integrated into Hive (Internal as well as trunk) for a reliable 
> rollover.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8982) IndexOutOfBounds exception in mapjoin [Spark Branch]

2014-12-01 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230567#comment-14230567
 ] 

Szehon Ho commented on HIVE-8982:
-

Yea.  I still see some random failures in mapjoin tests like:

[http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/464/testReport/junit/org.apache.hadoop.hive.cli/TestSparkCliDriver/testCliDriver_mapjoin_hook/|http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/464/testReport/junit/org.apache.hadoop.hive.cli/TestSparkCliDriver/testCliDriver_mapjoin_hook/]

Usually when I get those, I see this exception.  I didn't dig too deep into the 
latest random failure logs to confirm again, though.
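
For readers unfamiliar with the message format: the "Index: 1, Size: 1" text in the 
trace below comes from java.util.ArrayList's generic bounds check. A minimal, 
self-contained sketch (not Hive code) that reproduces the same exception message:

{code}
// Minimal reproduction of the ArrayList bounds check behind the
// "Index: 1, Size: 1" message: get(1) on a one-element list.
import java.util.ArrayList;
import java.util.List;

public class IndexOutOfBoundsSketch {
  public static void main(String[] args) {
    List<String> rows = new ArrayList<String>();
    rows.add("only-row");  // size() == 1, so the only valid index is 0
    rows.get(1);           // throws IndexOutOfBoundsException: Index: 1, Size: 1
  }
}
{code}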

> IndexOutOfBounds exception in mapjoin [Spark Branch]
> 
>
> Key: HIVE-8982
> URL: https://issues.apache.org/jira/browse/HIVE-8982
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Szehon Ho
>
> There are sometimes random failures in spark mapjoin during unit tests like:
> {noformat}
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.closeOp(SparkHashTableSinkOperator.java:83)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:185)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:57)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:108)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1365)
>   at 
> org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1365)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>   at org.apache.spark.scheduler.Task.run(Task.scala:56)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at java.util.ArrayList.rangeCheck(ArrayList.java:635)
>   at java.util.ArrayList.get(ArrayList.java:411)
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinEagerRowContainer.first(MapJoinEagerRowContainer.java:70)
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinEagerRowContainer.write(MapJoinEagerRowContainer.java:150)
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinTableContainerSerDe.persist(MapJoinTableContainerSerDe.java:167)
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.flushToFile(SparkHashTableSinkOperator.java:128)
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.closeOp(SparkHashTableSinkOperator.java:77)
>   ... 20 more
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.closeOp(SparkHashTableSinkOperator.java:83)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:185)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:57)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:1

[jira] [Updated] (HIVE-8886) Some Vectorized String CONCAT expressions result in runtime error Vectorization: Unsuported vector output type: StringGroup

2014-12-01 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8886:
---
Attachment: HIVE-8886.02.patch

> Some Vectorized String CONCAT expressions result in runtime error 
> Vectorization: Unsuported vector output type: StringGroup
> ---
>
> Key: HIVE-8886
> URL: https://issues.apache.org/jira/browse/HIVE-8886
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 0.14.1
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 0.14.1
>
> Attachments: HIVE-8886.01.patch, HIVE-8886.02.patch
>
>
> {noformat}
> SELECT CONCAT(CONCAT(CONCAT('Quarter ',CAST(CAST((MONTH(dt) - 1) / 3 + 1 AS 
> INT) AS STRING)),'-'),CAST(YEAR(dt) AS STRING)) AS `field`
> FROM vectortab2korc 
> GROUP BY CONCAT(CONCAT(CONCAT('Quarter ',CAST(CAST((MONTH(dt) - 1) / 3 + 
> 1 AS INT) AS STRING)),'-'),CAST(YEAR(dt) AS STRING))
> LIMIT 50;
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8886) Some Vectorized String CONCAT expressions result in runtime error Vectorization: Unsuported vector output type: StringGroup

2014-12-01 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8886:
---
Status: In Progress  (was: Patch Available)

> Some Vectorized String CONCAT expressions result in runtime error 
> Vectorization: Unsuported vector output type: StringGroup
> ---
>
> Key: HIVE-8886
> URL: https://issues.apache.org/jira/browse/HIVE-8886
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 0.14.1
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 0.14.1
>
> Attachments: HIVE-8886.01.patch, HIVE-8886.02.patch
>
>
> {noformat}
> SELECT CONCAT(CONCAT(CONCAT('Quarter ',CAST(CAST((MONTH(dt) - 1) / 3 + 1 AS 
> INT) AS STRING)),'-'),CAST(YEAR(dt) AS STRING)) AS `field`
> FROM vectortab2korc 
> GROUP BY CONCAT(CONCAT(CONCAT('Quarter ',CAST(CAST((MONTH(dt) - 1) / 3 + 
> 1 AS INT) AS STRING)),'-'),CAST(YEAR(dt) AS STRING))
> LIMIT 50;
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9001) Ship with log4j.properties file that has a reliable time based rolling policy

2014-12-01 Thread Hari Sankar Sivarama Subramaniyan (JIRA)
Hari Sankar Sivarama Subramaniyan created HIVE-9001:
---

 Summary: Ship with log4j.properties file that has a reliable time 
based rolling policy
 Key: HIVE-9001
 URL: https://issues.apache.org/jira/browse/HIVE-9001
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan


The hive log gets locked by the hive process and cannot be rolled on Windows OS.
Install Hive on Windows, start hive, and try to rename the hive log while Hive is 
running.
When log4j tries to rename it, it throws the same error because the file is 
locked by the process.

The changes in https://issues.apache.org/bugzilla/show_bug.cgi?id=29726 should 
be integrated into Hive (Internal as well as trunk) for a reliable rollover.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: 0.15 release

2014-12-01 Thread Thejas Nair
+1 .
Regarding the next version being 0.15 - I have some thoughts on the
versioning of hive. I will start a different thread on that.


On Mon, Dec 1, 2014 at 11:43 AM, Brock Noland  wrote:
> Hi,
>
> In 2014 we did two large releases. Thank you very much to the RM's for
> pushing those out! I've found that Apache projects gain traction
> through releasing often, thus I think we should aim to increase the
> rate of releases in 2015. (Not that I can complain, since I did not
> volunteer to RM any release.)
>
> As such I'd like to volunteer as RM for the 0.15 release.
>
> Cheers,
> Brock



[jira] [Commented] (HIVE-8374) schematool fails on Postgres versions < 9.2

2014-12-01 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230555#comment-14230555
 ] 

Sergey Shelukhin commented on HIVE-8374:


backported to 14

> schematool fails on Postgres versions < 9.2
> ---
>
> Key: HIVE-8374
> URL: https://issues.apache.org/jira/browse/HIVE-8374
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Fix For: 0.15.0, 0.14.1
>
> Attachments: HIVE-8374.1.patch, HIVE-8374.2.patch, HIVE-8374.3.patch, 
> HIVE-8374.patch
>
>
> The upgrade script for HIVE-5700 creates a UDF with language 'plpgsql',
> which is available by default only for Postgres 9.2+.
> For older Postgres versions, the language must be explicitly created,
> otherwise schematool fails with the error:
> {code}
> Error: ERROR: language "plpgsql" does not exist
>   Hint: Use CREATE LANGUAGE to load the language into the database. 
> (state=42704,code=0)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
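
For reference, a workaround on pre-9.2 Postgres is to create the language before 
running schematool. A minimal JDBC sketch under that assumption (the connection 
URL, database name, and credentials are placeholders, and this is not part of the 
HIVE-8374 patch itself):

{code}
// Hypothetical workaround sketch: create the plpgsql language if it is missing,
// so the HIVE-5700 upgrade script can define its plpgsql UDF.
// Requires the Postgres JDBC driver on the classpath; connection details are placeholders.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class EnsurePlpgsql {
  public static void main(String[] args) throws Exception {
    try (Connection conn = DriverManager.getConnection(
             "jdbc:postgresql://localhost:5432/metastore", "hive", "password");
         Statement stmt = conn.createStatement()) {
      // pg_language lists the procedural languages installed in this database.
      try (ResultSet rs = stmt.executeQuery(
               "SELECT 1 FROM pg_language WHERE lanname = 'plpgsql'")) {
        if (!rs.next()) {
          // May require superuser privileges on older Postgres versions.
          stmt.execute("CREATE LANGUAGE plpgsql");
        }
      }
    }
  }
}
{code}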


[jira] [Updated] (HIVE-8374) schematool fails on Postgres versions < 9.2

2014-12-01 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-8374:
---
Fix Version/s: 0.14.1

> schematool fails on Postgres versions < 9.2
> ---
>
> Key: HIVE-8374
> URL: https://issues.apache.org/jira/browse/HIVE-8374
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Fix For: 0.15.0, 0.14.1
>
> Attachments: HIVE-8374.1.patch, HIVE-8374.2.patch, HIVE-8374.3.patch, 
> HIVE-8374.patch
>
>
> The upgrade script for HIVE-5700 creates a UDF with language 'plpgsql',
> which is available by default only for Postgres 9.2+.
> For older Postgres versions, the language must be explicitly created,
> otherwise schematool fails with the error:
> {code}
> Error: ERROR: language "plpgsql" does not exist
>   Hint: Use CREATE LANGUAGE to load the language into the database. 
> (state=42704,code=0)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-8992) Fix two bucket related test failures, infer_bucket_sort_convert_join.q and parquet_join.q [Spark Branch]

2014-12-01 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang reassigned HIVE-8992:
-

Assignee: Jimmy Xiang

> Fix two bucket related test failures, infer_bucket_sort_convert_join.q and 
> parquet_join.q [Spark Branch]
> 
>
> Key: HIVE-8992
> URL: https://issues.apache.org/jira/browse/HIVE-8992
> Project: Hive
>  Issue Type: Sub-task
>  Components: spark-branch
>Reporter: Xuefu Zhang
>Assignee: Jimmy Xiang
>
> Failures shown in HIVE-8836. They seemed related to wrong reducer numbers in 
> terms of bucket joins.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8982) IndexOutOfBounds exception in mapjoin [Spark Branch]

2014-12-01 Thread Chao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230548#comment-14230548
 ] 

Chao commented on HIVE-8982:


I ran mapjoin_mapjoin and auto_join31 each 10 times on the latest spark branch, 
but couldn't reproduce the issue. Is this still occurring on Jenkins?

> IndexOutOfBounds exception in mapjoin [Spark Branch]
> 
>
> Key: HIVE-8982
> URL: https://issues.apache.org/jira/browse/HIVE-8982
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Szehon Ho
>
> There are sometimes random failures in spark mapjoin during unit tests like:
> {noformat}
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.closeOp(SparkHashTableSinkOperator.java:83)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:185)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:57)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:108)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1365)
>   at 
> org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1365)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>   at org.apache.spark.scheduler.Task.run(Task.scala:56)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at java.util.ArrayList.rangeCheck(ArrayList.java:635)
>   at java.util.ArrayList.get(ArrayList.java:411)
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinEagerRowContainer.first(MapJoinEagerRowContainer.java:70)
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinEagerRowContainer.write(MapJoinEagerRowContainer.java:150)
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinTableContainerSerDe.persist(MapJoinTableContainerSerDe.java:167)
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.flushToFile(SparkHashTableSinkOperator.java:128)
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.closeOp(SparkHashTableSinkOperator.java:77)
>   ... 20 more
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>   at 
> org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.closeOp(SparkHashTableSinkOperator.java:83)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:185)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:57)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:108)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.ap

Re: Review Request 28283: HIVE-8900:Create encryption testing framework

2014-12-01 Thread Sergio Pena

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28283/#review63437
---



itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java


Do we need to set this value? From what I know, AES/CTR/NoPadding is the 
only cipher mode that HDFS supports.



itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java


I think this method 'initEncryptionRelatedConfIfNeeded()' can be called 
inside the block at line 370,
as it is only called when clusterType is encrypted. Also, we may rename the 
method to a shorter name, as the IfNeeded part won't be needed.



itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java


What if we move this line inside initEncryptionConf()? It is part of 
encryption initialization.



itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java


- May we rename this method so that it starts with the 'init' verb? This is 
just a good practice I've learned in order
  to read code much better. Also, IfNeeded() is the correct syntax.
- We could also get rid of the IfNeeded() word (making the name shorter) if 
we add the validation when this method
  is called instead of inside the method. It is just an opinion.



itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java


Just to comment that AES-256 can be used only if the JCE unlimited-strength 
policy files are installed in your environment. Otherwise, any encryption
  with this key size will fail. Keys can be created, but when you try to encrypt 
something, it fails. We should put a 
  comment here so that another developer knows this.
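
A small self-contained check along those lines, using the standard javax.crypto 
API (a sketch only, not part of the patch under review):

{code}
// Sketch: detect whether AES-256 is usable before attempting encryption.
// Without the JCE unlimited-strength policy files, the maximum allowed
// AES key length is typically capped at 128 bits.
import javax.crypto.Cipher;

public class Aes256Check {
  public static void main(String[] args) throws Exception {
    int maxKeyLen = Cipher.getMaxAllowedKeyLength("AES");
    if (maxKeyLen < 256) {
      System.out.println("AES-256 unavailable: max allowed AES key length is "
          + maxKeyLen + " bits; install the JCE unlimited-strength policy files.");
    } else {
      System.out.println("AES-256 is available (max key length: " + maxKeyLen + " bits).");
    }
  }
}
{code}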



ql/src/test/templates/TestEncrytedHDFSCliDriver.vm


Why do we need this new class instead of TestCliDriver.vm?



shims/0.20/src/main/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java


I think we should leave the 'hadoop.encryption.is.not.supported' key name 
on unsupported hadoop versions. This was left only as a comment for developers. 
Nobody will use this configuration key anyways.



shims/0.20/src/main/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java


Do we need these two configuration values in the configuration environment? 
These are used only for test purposes on QTestUtil. The user won't use these 
fields on hive-site.xml ever. Or not yet.



shims/0.20S/src/main/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java


I think we should leave the 'hadoop.encryption.is.not.supported' key name 
on unsupported hadoop versions. This was left only as a comment for developers. 
Nobody will use this configuration key anyways.



shims/0.20S/src/main/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java


Do we need these two configuration values in the configuration environment? 
These are used only for test purposes on QTestUtil. The user won't use these 
fields on hive-site.xml ever. Or not yet.



shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java


Let's import the necessary modules only. I think the IDE did this 
replacement.



shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java


Why was this block removed? I see the keyProvider variable is initialized 
inside getMiniDfs() method (testing). But what will happen with production code?


- Sergio Pena


On Nov. 28, 2014, 1:45 a.m., cheng xu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28283/
> ---
> 
> (Updated Nov. 28, 2014, 1:45 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The patch includes:
> 1. enable security properties for hive security cluster
> 
> 
> Diffs
> -
> 
>   .gitignore c5decaf 
>   data/scripts/q_test_cleanup_for_encryption.sql PRE-CREATION 
>   data/scripts/q_test_init_for_encryption.sql PRE-CREATION 
>   itests/qtest/pom.xml 376f4a9 
>   itests/src/test/resources/testconfiguration.properties 3ae001d 
>   itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 31d5c29 
>   ql/src/test/queries/clientpositive/create_encrypted_table.q PRE-CREATION 
>   ql/src/test/templates/TestEncrytedHDFSCliDriver.vm PRE-CREATION 
>   shims/

[jira] [Commented] (HIVE-8957) Remote spark context needs to clean up itself in case of connection timeout [Spark Branch]

2014-12-01 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230502#comment-14230502
 ] 

Xuefu Zhang commented on HIVE-8957:
---

That's all right. I think I can bug you on this when you have cycles.

> Remote spark context needs to clean up itself in case of connection timeout 
> [Spark Branch]
> --
>
> Key: HIVE-8957
> URL: https://issues.apache.org/jira/browse/HIVE-8957
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-8957.1-spark.patch
>
>
> In the current SparkClient implementation (class SparkClientImpl), the 
> constructor does some initialization and in the end waits for the remote 
> driver to connect. In case of timeout, it just throws an exception without 
> cleaning up after itself. The cleanup is necessary to release system resources.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
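
For context, the cleanup being requested is the usual "release what the 
constructor already acquired before rethrowing" shape. A minimal sketch under 
that assumption (the class, fields, and helper names are illustrative, not the 
actual SparkClientImpl code):

{code}
// Illustrative sketch only: if the constructor times out waiting for the remote
// driver, release the resources it already acquired before propagating the error.
import java.util.concurrent.TimeoutException;

public class RemoteClientSketch {
  private Thread driverThread;      // stand-ins for whatever the real client acquires
  private AutoCloseable rpcServer;

  public RemoteClientSketch() throws Exception {
    try {
      rpcServer = startRpcServer();
      driverThread = launchDriverProcess();
      waitForDriverToConnect();     // may throw TimeoutException
    } catch (Exception e) {
      cleanup();                    // release resources instead of leaking them
      throw e;
    }
  }

  private void cleanup() {
    if (driverThread != null) {
      driverThread.interrupt();
    }
    if (rpcServer != null) {
      try { rpcServer.close(); } catch (Exception ignored) { }
    }
  }

  // Placeholders for the real initialization steps.
  private AutoCloseable startRpcServer() { return () -> { }; }
  private Thread launchDriverProcess() { return new Thread(() -> { }); }
  private void waitForDriverToConnect() throws TimeoutException { }
}
{code}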


[jira] [Commented] (HIVE-8970) Enable map join optimization only when hive.auto.convert.join is true [Spark Branch]

2014-12-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230489#comment-14230489
 ] 

Hive QA commented on HIVE-8970:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12684466/HIVE-8970.3-spark.patch

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 7223 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_cast_constant
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_custom_input_output_format
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_parquet_join
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropPartition
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/469/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/469/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-469/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12684466 - PreCommit-HIVE-SPARK-Build

> Enable map join optimization only when hive.auto.convert.join is true [Spark 
> Branch]
> 
>
> Key: HIVE-8970
> URL: https://issues.apache.org/jira/browse/HIVE-8970
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Chao
>Assignee: Chao
> Fix For: spark-branch
>
> Attachments: HIVE-8970.1-spark.patch, HIVE-8970.2-spark.patch, 
> HIVE-8970.3-spark.patch
>
>
> Right now, in the Spark branch we enable MJ without looking at this 
> configuration. The related code in {{SparkMapJoinOptimizer}} is commented 
> out. We should only enable MJ when the flag is true.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
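
For reference, the guard being discussed boils down to consulting the flag before 
applying the conversion. A small illustrative sketch (not the actual 
SparkMapJoinOptimizer code; the flag name is the one quoted in the issue title):

{code}
// Illustrative guard: only attempt the map-join conversion when
// hive.auto.convert.join is enabled in the job configuration.
import org.apache.hadoop.conf.Configuration;

public class MapJoinGuardSketch {
  public static boolean shouldConvertToMapJoin(Configuration conf) {
    // The fallback value here is arbitrary; Hive defines its own default for this flag.
    return conf.getBoolean("hive.auto.convert.join", false);
  }

  public static void main(String[] args) {
    Configuration conf = new Configuration();
    conf.setBoolean("hive.auto.convert.join", true);
    System.out.println("convert to map join? " + shouldConvertToMapJoin(conf));
  }
}
{code}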


[jira] [Commented] (HIVE-8774) CBO: enable groupBy index

2014-12-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230484#comment-14230484
 ] 

Hive QA commented on HIVE-8774:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12684451/HIVE-8774.10.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 6697 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_aggregate
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mapjoin_mapjoin
org.apache.hive.hcatalog.streaming.TestStreaming.testInterleavedTransactionBatchCommits
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1940/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1940/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1940/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12684451 - PreCommit-HIVE-TRUNK-Build

> CBO: enable groupBy index
> -
>
> Key: HIVE-8774
> URL: https://issues.apache.org/jira/browse/HIVE-8774
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-8774.1.patch, HIVE-8774.10.patch, 
> HIVE-8774.2.patch, HIVE-8774.3.patch, HIVE-8774.4.patch, HIVE-8774.5.patch, 
> HIVE-8774.6.patch, HIVE-8774.7.patch, HIVE-8774.8.patch, HIVE-8774.9.patch
>
>
> Right now, even when the groupby index is built, CBO is not able to use it. In 
> this patch, we are trying to make it use the groupby index that we build.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-8981) Not a directory error in mapjoin_hook.q [Spark Branch]

2014-12-01 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang resolved HIVE-8981.
---
Resolution: Cannot Reproduce

> Not a directory error in mapjoin_hook.q [Spark Branch]
> --
>
> Key: HIVE-8981
> URL: https://issues.apache.org/jira/browse/HIVE-8981
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
> Environment: Using remote-spark context with 
> spark-master=local-cluster [2,2,1024]
>Reporter: Szehon Ho
>Assignee: Chao
>
> Hits the following exception:
> {noformat}
> 2014-11-26 15:17:11,728 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - 14/11/26 15:17:11 WARN TaskSetManager: Lost 
> task 0.0 in stage 8.0 (TID 18, 172.16.3.52): java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Error while trying to 
> create table container
> 2014-11-26 15:17:11,728 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.processRow(SparkMapRecordHandler.java:160)
> 2014-11-26 15:17:11,728 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:47)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:28)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:96)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> scala.collection.Iterator$class.foreach(Iterator.scala:727)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1390)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1390)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.spark.scheduler.Task.run(Task.scala:56)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at java.lang.Thread.run(Thread.java:744)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - Caused by: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Error while trying to 
> create table container
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.hadoop.hive.ql.exec.

[jira] [Commented] (HIVE-8981) Not a directory error in mapjoin_hook.q [Spark Branch]

2014-12-01 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230486#comment-14230486
 ] 

Xuefu Zhang commented on HIVE-8981:
---

Yeah, the test seems to have passed in the latest test run: 
https://issues.apache.org/jira/browse/HIVE-8998?focusedCommentId=14229321&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14229321

Closing this for now.

> Not a directory error in mapjoin_hook.q [Spark Branch]
> --
>
> Key: HIVE-8981
> URL: https://issues.apache.org/jira/browse/HIVE-8981
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
> Environment: Using remote-spark context with 
> spark-master=local-cluster [2,2,1024]
>Reporter: Szehon Ho
>Assignee: Chao
>
> Hits the following exception:
> {noformat}
> 2014-11-26 15:17:11,728 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - 14/11/26 15:17:11 WARN TaskSetManager: Lost 
> task 0.0 in stage 8.0 (TID 18, 172.16.3.52): java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Error while trying to 
> create table container
> 2014-11-26 15:17:11,728 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.processRow(SparkMapRecordHandler.java:160)
> 2014-11-26 15:17:11,728 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:47)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:28)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:96)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> scala.collection.Iterator$class.foreach(Iterator.scala:727)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1390)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1390)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.spark.scheduler.Task.run(Task.scala:56)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - at java.lang.Thread.run(Thread.java:744)
> 2014-11-26 15:17:11,729 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(364)) - Caused by: 
> org.apache.hadoop.hive.ql.metadata.HiveExc

[jira] [Commented] (HIVE-8957) Remote spark context needs to clean up itself in case of connection timeout [Spark Branch]

2014-12-01 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230478#comment-14230478
 ] 

Marcelo Vanzin commented on HIVE-8957:
--

If you don't mind the bug remaining unattended for several days, sure. I have 
my hands full with all sorts of other things at the moment.

> Remote spark context needs to clean up itself in case of connection timeout 
> [Spark Branch]
> --
>
> Key: HIVE-8957
> URL: https://issues.apache.org/jira/browse/HIVE-8957
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-8957.1-spark.patch
>
>
> In the current SparkClient implementation (class SparkClientImpl), the 
> constructor does some initialization and in the end waits for the remote 
> driver to connect. In case of timeout, it just throws an exception without 
> cleaning up after itself. The cleanup is necessary to release system resources.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8957) Remote spark context needs to clean up itself in case of connection timeout [Spark Branch]

2014-12-01 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230473#comment-14230473
 ] 

Xuefu Zhang commented on HIVE-8957:
---

[~vanzin], would you mind owning the JIRA for now until you figure out a 
solution? 

> Remote spark context needs to clean up itself in case of connection timeout 
> [Spark Branch]
> --
>
> Key: HIVE-8957
> URL: https://issues.apache.org/jira/browse/HIVE-8957
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-8957.1-spark.patch
>
>
> In the current SparkClient implementation (class SparkClientImpl), the 
> constructor does some initialization and in the end waits for the remote 
> driver to connect. In case of timeout, it just throws an exception without 
> cleaning up after itself. The cleanup is necessary to release system resources.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8957) Remote spark context needs to clean up itself in case of connection timeout [Spark Branch]

2014-12-01 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-8957:
--
Status: Open  (was: Patch Available)

> Remote spark context needs to clean up itself in case of connection timeout 
> [Spark Branch]
> --
>
> Key: HIVE-8957
> URL: https://issues.apache.org/jira/browse/HIVE-8957
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-8957.1-spark.patch
>
>
> In the current SparkClient implementation (class SparkClientImpl), the 
> constructor does some initialization and in the end waits for the remote 
> driver to connect. In case of timeout, it just throws an exception without 
> cleaning up after itself. The cleanup is necessary to release system resources.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8970) Enable map join optimization only when hive.auto.convert.join is true [Spark Branch]

2014-12-01 Thread Chao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230470#comment-14230470
 ] 

Chao commented on HIVE-8970:


Yes, I believe so. When I enabled mapjoin, I compared the unit test results 
against the previous results in the spark branch, which had previously been 
compared against the MR results. 

> Enable map join optimization only when hive.auto.convert.join is true [Spark 
> Branch]
> 
>
> Key: HIVE-8970
> URL: https://issues.apache.org/jira/browse/HIVE-8970
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Chao
>Assignee: Chao
> Fix For: spark-branch
>
> Attachments: HIVE-8970.1-spark.patch, HIVE-8970.2-spark.patch, 
> HIVE-8970.3-spark.patch
>
>
> Right now, in the Spark branch we enable MJ without looking at this 
> configuration. The related code in {{SparkMapJoinOptimizer}} is commented 
> out. We should only enable MJ when the flag is true.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8991) Fix custom_input_output_format [Spark Branch]

2014-12-01 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230468#comment-14230468
 ] 

Xuefu Zhang commented on HIVE-8991:
---

[~vanzin], this doesn't block anything, and so let's do it in the right way. In 
the meantime, does it make sense for you to take this JIRA while you're doing 
the research? Thanks.

> Fix custom_input_output_format [Spark Branch]
> -
>
> Key: HIVE-8991
> URL: https://issues.apache.org/jira/browse/HIVE-8991
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-8991.1-spark.patch
>
>
> After HIVE-8836, {{custom_input_output_format}} fails because hive-it-util is 
> missing from the remote driver's class path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

