[jira] [Commented] (HIVE-14993) make WriteEntity distinguish writeType

2016-10-20 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15594136#comment-15594136
 ] 

Eugene Koifman commented on HIVE-14993:
---

from 
https://builds.apache.org/view/H-L/view/Hive/job/PreCommit-HIVE-Build/1701/testReport/
  for HIVE-13589

||Test Name||Duration||Age||
|org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic]|16 sec|3|
|org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[current_date_timestamp]|4 sec|3|
|org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[0]|0.11 sec|148|
|org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[0]|27 ms|148|
|org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[1]|34 ms|148|


> make WriteEntity distinguish writeType
> --
>
> Key: HIVE-14993
> URL: https://issues.apache.org/jira/browse/HIVE-14993
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-14993.2.patch, HIVE-14993.3.patch, 
> HIVE-14993.4.patch, HIVE-14993.5.patch, HIVE-14993.6.patch, 
> HIVE-14993.7.patch, HIVE-14993.8.patch, HIVE-14993.patch, 
> debug.not2checkin.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15027) make sure export takes MM information into account

2016-10-20 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593832#comment-15593832
 ] 

Eugene Koifman commented on HIVE-15027:
---

Nothing concrete.

> make sure export takes MM information into account
> --
>
> Key: HIVE-15027
> URL: https://issues.apache.org/jira/browse/HIVE-15027
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>
> Export currently copies the directories blindly; it should take MM 
> information into account.
> It's relatively easy to do in the CopyWork created from 
> ExportSemanticAnalyzer, but I will leave it undone for now in case there's 
> some better way to do it after ACID integration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-14953) don't use globStatus on S3 in MM tables

2016-10-20 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593700#comment-15593700
 ] 

Rajesh Balamohan edited comment on HIVE-14953 at 10/21/16 2:16 AM:
---

[~sershe] - It should be listFiles(path, recursive). I accidentally wrote it as 
listStatus(recursive) in my earlier comment.

Default FS: 
https://github.com/apache/hadoop/blob/branch-2.8/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L1814
S3A FS which optimizes for bulk listing: 
https://github.com/apache/hadoop/blob/branch-2.8/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L2025

So instead of thousands of calls to S3 with globStatus, it would end up making 
very few calls to S3 with listFiles(path, recursive), and any additional path 
filtering can be done on the client side on an as-needed basis.
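For illustration, a minimal sketch of the suggested pattern (the class name and 
the suffix filter are made up for this example; FileSystem.listFiles(Path, 
boolean) is the API linked above):

{code:java}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

class BulkListingSketch {
  // One recursive listing call instead of glob expansion; any pattern
  // filtering happens on the client as the iterator is consumed.
  static List<Path> listMatching(FileSystem fs, Path root, String suffix)
      throws IOException {
    List<Path> matches = new ArrayList<>();
    RemoteIterator<LocatedFileStatus> it = fs.listFiles(root, true /* recursive */);
    while (it.hasNext()) {
      LocatedFileStatus status = it.next();
      if (status.isFile() && status.getPath().getName().endsWith(suffix)) {
        matches.add(status.getPath());
      }
    }
    return matches;
  }
}
{code}

Per the linked S3AFileSystem code, on S3A the returned iterator is backed by 
paged bulk prefix listings, so the request count tracks the number of result 
pages rather than the number of files or directories.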

 


was (Author: rajesh.balamohan):
[~sershe] - It should be listFiles(path, recursive). I accidentally added as 
listStatus recursive in my earlier comment.

Default FS: 
https://github.com/apache/hadoop/blob/branch-2.8/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L1814
S3A FS which optimizes for bulk listing: 
https://github.com/apache/hadoop/blob/branch-2.8/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L2025


 

> don't use globStatus on S3 in MM tables
> ---
>
> Key: HIVE-14953
> URL: https://issues.apache.org/jira/browse/HIVE-14953
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Rajesh Balamohan
>Assignee: Sergey Shelukhin
> Fix For: hive-14535
>
> Attachments: HIVE-14953.patch
>
>
> Need to investigate if recursive get is faster. Also, normal listStatus might 
> suffice because MM code handles directory structure in a more definite manner 
> than old code; so it knows where the files of interest are to be found.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14953) don't use globStatus on S3 in MM tables

2016-10-20 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593700#comment-15593700
 ] 

Rajesh Balamohan commented on HIVE-14953:
-

[~sershe] - It should be listFiles(path, recursive). I accidentally wrote it as 
listStatus(recursive) in my earlier comment.

Default FS: 
https://github.com/apache/hadoop/blob/branch-2.8/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L1814
S3A FS which optimizes for bulk listing: 
https://github.com/apache/hadoop/blob/branch-2.8/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L2025


 

> don't use globStatus on S3 in MM tables
> ---
>
> Key: HIVE-14953
> URL: https://issues.apache.org/jira/browse/HIVE-14953
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Rajesh Balamohan
>Assignee: Sergey Shelukhin
> Fix For: hive-14535
>
> Attachments: HIVE-14953.patch
>
>
> Need to investigate if recursive get is faster. Also, normal listStatus might 
> suffice because MM code handles directory structure in a more definite manner 
> than old code; so it knows where the files of interest are to be found.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-14482) Add and drop table partition is not audit logged in HMS

2016-10-20 Thread Eric Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593695#comment-15593695
 ] 

Eric Lin edited comment on HIVE-14482 at 10/21/16 2:12 AM:
---

The error seems to be unrelated (please see the attached screenshot), and there 
is no existing test coverage for HMS audit logs; it is also quite hard to test.

Please advise if the patch can be accepted.

Thanks


was (Author: ericlin):
The error seems to be unrelated and there is no existing test coverage for HMS 
audit logs, also it is pretty hard to test it.

Please advise if the patch can be accepted.

Thanks

> Add and drop table partition is not audit logged in HMS
> ---
>
> Key: HIVE-14482
> URL: https://issues.apache.org/jira/browse/HIVE-14482
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.1.0
>Reporter: Eric Lin
>Assignee: Eric Lin
>Priority: Minor
> Attachments: HIVE-14482.2.patch, HIVE-14482.patch, Screen Shot 
> 2016-10-21 at 1.06.33 pm.png
>
>
> When running:
> {code}
> ALTER TABLE test DROP PARTITION (b=140);
> {code}
> I only see the following in the HMS log:
> {code}
> 2016-08-08 23:12:34,081 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: 
> [pool-4-thread-2]:  from=org.apache.hadoop.hive.metastore.RetryingHMSHandler>
> 2016-08-08 23:12:34,082 INFO  org.apache.hadoop.hive.metastore.HiveMetaStore: 
> [pool-4-thread-2]: 2: source:xx.xx.xxx.xxx get_table : db=default tbl=test
> 2016-08-08 23:12:34,082 INFO  
> org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-4-thread-2]: 
> ugi=hive ip=xx.xx.xxx.xxxcmd=source:xx.xx.xxx.xxx get_table : db=default 
> tbl=test
> 2016-08-08 23:12:34,094 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: 
> [pool-4-thread-2]:  end=1470723154094 duration=13 
> from=org.apache.hadoop.hive.metastore.RetryingHMSHandler threadId=2 
> retryCount=0 error=false>
> 2016-08-08 23:12:34,095 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: 
> [pool-4-thread-2]:  from=org.apache.hadoop.hive.metastore.RetryingHMSHandler>
> 2016-08-08 23:12:34,095 INFO  org.apache.hadoop.hive.metastore.HiveMetaStore: 
> [pool-4-thread-2]: 2: source:xx.xx.xxx.xxx get_partitions_by_expr : 
> db=default tbl=test
> 2016-08-08 23:12:34,096 INFO  
> org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-4-thread-2]: 
> ugi=hive ip=xx.xx.xxx.xxxcmd=source:xx.xx.xxx.xxx get_partitions_by_expr 
> : db=default tbl=test
> 2016-08-08 23:12:34,112 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: 
> [pool-4-thread-2]:  start=1470723154095 end=1470723154112 duration=17 
> from=org.apache.hadoop.hive.metastore.RetryingHMSHandler threadId=2 
> retryCount=0 error=false>
> 2016-08-08 23:12:34,172 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: 
> [pool-4-thread-2]:  from=org.apache.hadoop.hive.metastore.RetryingHMSHandler>
> 2016-08-08 23:12:34,173 INFO  org.apache.hadoop.hive.metastore.HiveMetaStore: 
> [pool-4-thread-2]: 2: source:xx.xx.xxx.xxx get_table : db=default tbl=test
> 2016-08-08 23:12:34,173 INFO  
> org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-4-thread-2]: 
> ugi=hive ip=xx.xx.xxx.xxxcmd=source:xx.xx.xxx.xxx get_table : db=default 
> tbl=test
> 2016-08-08 23:12:34,186 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: 
> [pool-4-thread-2]:  end=1470723154186 duration=14 
> from=org.apache.hadoop.hive.metastore.RetryingHMSHandler threadId=2 
> retryCount=0 error=false>
> 2016-08-08 23:12:34,186 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: 
> [pool-4-thread-2]:  from=org.apache.hadoop.hive.metastore.RetryingHMSHandler>
> 2016-08-08 23:12:34,187 INFO  org.apache.hadoop.hive.metastore.HiveMetaStore: 
> [pool-4-thread-2]: 2: source:xx.xx.xxx.xxx get_table : db=default tbl=test
> 2016-08-08 23:12:34,187 INFO  
> org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-4-thread-2]: 
> ugi=hive ip=xx.xx.xxx.xxxcmd=source:xx.xx.xxx.xxx get_table : db=default 
> tbl=test
> 2016-08-08 23:12:34,199 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: 
> [pool-4-thread-2]:  end=1470723154199 duration=13 
> from=org.apache.hadoop.hive.metastore.RetryingHMSHandler threadId=2 
> retryCount=0 error=false>
> 2016-08-08 23:12:34,203 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: 
> [pool-4-thread-2]:  from=org.apache.hadoop.hive.metastore.RetryingHMSHandler>
> 2016-08-08 23:12:34,215 INFO  org.apache.hadoop.hive.metastore.ObjectStore: 
> [pool-4-thread-2]: JDO filter pushdown cannot be used: Filtering is supported 
> only on partition keys of type string
> 2016-08-08 23:12:34,226 ERROR org.apache.hadoop.hdfs.KeyProviderCache: 
> [pool-4-thread-2]: Could not find uri with key 
> [dfs.encryption.key.provider.uri] to create a keyProvider !!
> 2016-08-08 23:12:34,239 INFO  org.apache.hadoop.hive.metastore.HiveMetaStore: 
> 

[jira] [Updated] (HIVE-14482) Add and drop table partition is not audit logged in HMS

2016-10-20 Thread Eric Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Lin updated HIVE-14482:

Attachment: Screen Shot 2016-10-21 at 1.06.33 pm.png

The error seems to be unrelated, and there is no existing test coverage for HMS 
audit logs; it is also quite hard to test.

Please advise if the patch can be accepted.

Thanks

> Add and drop table partition is not audit logged in HMS
> ---
>
> Key: HIVE-14482
> URL: https://issues.apache.org/jira/browse/HIVE-14482
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.1.0
>Reporter: Eric Lin
>Assignee: Eric Lin
>Priority: Minor
> Attachments: HIVE-14482.2.patch, HIVE-14482.patch, Screen Shot 
> 2016-10-21 at 1.06.33 pm.png
>
>
> When running:
> {code}
> ALTER TABLE test DROP PARTITION (b=140);
> {code}
> I only see the following in the HMS log:
> {code}
> 2016-08-08 23:12:34,081 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: 
> [pool-4-thread-2]:  from=org.apache.hadoop.hive.metastore.RetryingHMSHandler>
> 2016-08-08 23:12:34,082 INFO  org.apache.hadoop.hive.metastore.HiveMetaStore: 
> [pool-4-thread-2]: 2: source:xx.xx.xxx.xxx get_table : db=default tbl=test
> 2016-08-08 23:12:34,082 INFO  
> org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-4-thread-2]: 
> ugi=hive ip=xx.xx.xxx.xxxcmd=source:xx.xx.xxx.xxx get_table : db=default 
> tbl=test
> 2016-08-08 23:12:34,094 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: 
> [pool-4-thread-2]:  end=1470723154094 duration=13 
> from=org.apache.hadoop.hive.metastore.RetryingHMSHandler threadId=2 
> retryCount=0 error=false>
> 2016-08-08 23:12:34,095 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: 
> [pool-4-thread-2]:  from=org.apache.hadoop.hive.metastore.RetryingHMSHandler>
> 2016-08-08 23:12:34,095 INFO  org.apache.hadoop.hive.metastore.HiveMetaStore: 
> [pool-4-thread-2]: 2: source:xx.xx.xxx.xxx get_partitions_by_expr : 
> db=default tbl=test
> 2016-08-08 23:12:34,096 INFO  
> org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-4-thread-2]: 
> ugi=hive ip=xx.xx.xxx.xxxcmd=source:xx.xx.xxx.xxx get_partitions_by_expr 
> : db=default tbl=test
> 2016-08-08 23:12:34,112 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: 
> [pool-4-thread-2]:  start=1470723154095 end=1470723154112 duration=17 
> from=org.apache.hadoop.hive.metastore.RetryingHMSHandler threadId=2 
> retryCount=0 error=false>
> 2016-08-08 23:12:34,172 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: 
> [pool-4-thread-2]:  from=org.apache.hadoop.hive.metastore.RetryingHMSHandler>
> 2016-08-08 23:12:34,173 INFO  org.apache.hadoop.hive.metastore.HiveMetaStore: 
> [pool-4-thread-2]: 2: source:xx.xx.xxx.xxx get_table : db=default tbl=test
> 2016-08-08 23:12:34,173 INFO  
> org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-4-thread-2]: 
> ugi=hive ip=xx.xx.xxx.xxxcmd=source:xx.xx.xxx.xxx get_table : db=default 
> tbl=test
> 2016-08-08 23:12:34,186 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: 
> [pool-4-thread-2]:  end=1470723154186 duration=14 
> from=org.apache.hadoop.hive.metastore.RetryingHMSHandler threadId=2 
> retryCount=0 error=false>
> 2016-08-08 23:12:34,186 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: 
> [pool-4-thread-2]:  from=org.apache.hadoop.hive.metastore.RetryingHMSHandler>
> 2016-08-08 23:12:34,187 INFO  org.apache.hadoop.hive.metastore.HiveMetaStore: 
> [pool-4-thread-2]: 2: source:xx.xx.xxx.xxx get_table : db=default tbl=test
> 2016-08-08 23:12:34,187 INFO  
> org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-4-thread-2]: 
> ugi=hive ip=xx.xx.xxx.xxxcmd=source:xx.xx.xxx.xxx get_table : db=default 
> tbl=test
> 2016-08-08 23:12:34,199 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: 
> [pool-4-thread-2]:  end=1470723154199 duration=13 
> from=org.apache.hadoop.hive.metastore.RetryingHMSHandler threadId=2 
> retryCount=0 error=false>
> 2016-08-08 23:12:34,203 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: 
> [pool-4-thread-2]:  from=org.apache.hadoop.hive.metastore.RetryingHMSHandler>
> 2016-08-08 23:12:34,215 INFO  org.apache.hadoop.hive.metastore.ObjectStore: 
> [pool-4-thread-2]: JDO filter pushdown cannot be used: Filtering is supported 
> only on partition keys of type string
> 2016-08-08 23:12:34,226 ERROR org.apache.hadoop.hdfs.KeyProviderCache: 
> [pool-4-thread-2]: Could not find uri with key 
> [dfs.encryption.key.provider.uri] to create a keyProvider !!
> 2016-08-08 23:12:34,239 INFO  org.apache.hadoop.hive.metastore.HiveMetaStore: 
> [pool-4-thread-2]: dropPartition() will move partition-directories to 
> trash-directory.
> 2016-08-08 23:12:34,239 INFO  hive.metastore.hivemetastoressimpl: 
> [pool-4-thread-2]: deleting  
> hdfs://:8020/user/hive/warehouse/default/test/b=140
> 2016-08-08 23:12:34,247 INFO  

[jira] [Commented] (HIVE-14953) don't use globStatus on S3 in MM tables

2016-10-20 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593670#comment-15593670
 ] 

Sergey Shelukhin commented on HIVE-14953:
-

[~rajesh.balamohan] but does it actually do that? As far as I can see, the 
implementation of listFiles(path, recursive) is a bunch of local code that uses 
listLocatedStatus for each encountered directory. listStatus doesn't have a 
recursive overload that I can see.
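For reference, the generic FileSystem implementation behaves roughly like the 
sketch below (a paraphrase for illustration, not the actual Hadoop code; 
fs.listLocatedStatus is the real call):

{code:java}
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

class DefaultListFilesSketch {
  // Client-side recursion: one listLocatedStatus() call per directory, which
  // is what makes the generic implementation expensive on object stores.
  static List<LocatedFileStatus> listFilesRecursively(FileSystem fs, Path root)
      throws IOException {
    List<LocatedFileStatus> results = new ArrayList<>();
    Deque<Path> dirs = new ArrayDeque<>();
    dirs.push(root);
    while (!dirs.isEmpty()) {
      RemoteIterator<LocatedFileStatus> it = fs.listLocatedStatus(dirs.pop());
      while (it.hasNext()) {
        LocatedFileStatus s = it.next();
        if (s.isDirectory()) {
          dirs.push(s.getPath()); // descend later: another listing call
        } else {
          results.add(s);
        }
      }
    }
    return results;
  }
}
{code}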

> don't use globStatus on S3 in MM tables
> ---
>
> Key: HIVE-14953
> URL: https://issues.apache.org/jira/browse/HIVE-14953
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Rajesh Balamohan
>Assignee: Sergey Shelukhin
> Fix For: hive-14535
>
> Attachments: HIVE-14953.patch
>
>
> Need to investigate if recursive get is faster. Also, normal listStatus might 
> suffice because MM code handles directory structure in a more definite manner 
> than old code; so it knows where the files of interest are to be found.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13589) beeline support prompt for password with '-p' option

2016-10-20 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593629#comment-15593629
 ] 

Ferdinand Xu commented on HIVE-13589:
-

Thank you for the update. I left a few more comments on RB. 

> beeline support prompt for password with '-p' option
> 
>
> Key: HIVE-13589
> URL: https://issues.apache.org/jira/browse/HIVE-13589
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Thejas M Nair
>Assignee: Vihang Karajgaonkar
> Fix For: 2.2.0
>
> Attachments: HIVE-13589.1.patch, HIVE-13589.10.patch, 
> HIVE-13589.2.patch, HIVE-13589.3.patch, HIVE-13589.4.patch, 
> HIVE-13589.5.patch, HIVE-13589.6.patch, HIVE-13589.7.patch, 
> HIVE-13589.8.patch, HIVE-13589.9.patch
>
>
> Specifying the connection string using command-line options in Beeline is 
> convenient, as it gets saved in the shell command history and is easy to 
> retrieve from there.
> However, specifying the password on the command line is not secure, as it gets 
> displayed on screen and saved in the history.
> It should be possible to specify '-p' without an argument to make Beeline 
> prompt for the password.
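Illustrative usage once that is supported (-u and -n are standard Beeline 
options; the prompt text is a guess for illustration, not taken from the patch):

{noformat}
$ beeline -u jdbc:hive2://localhost:10000 -n hiveuser -p
Enter password:
{noformat}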



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14953) don't use globStatus on S3 in MM tables

2016-10-20 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593627#comment-15593627
 ] 

Rajesh Balamohan commented on HIVE-14953:
-

[~sershe] - It was in FileSinkOperator.handleMMTable (getMmDirectoryCandidates) 
specifically. I do not see that codepath in the latest codebase in the branch 
now. globStatus with a pattern has to be replaced with {{listStatus(path, boolean 
recursive)}}, and any additional filtering pattern has to be applied on the 
client side. In cloud storage systems, this would make it possible to do prefix 
listing and reduce the number of calls significantly compared to globStatus, 
which iterates through the files one at a time on the client side.

> don't use globStatus on S3 in MM tables
> ---
>
> Key: HIVE-14953
> URL: https://issues.apache.org/jira/browse/HIVE-14953
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Rajesh Balamohan
>Assignee: Sergey Shelukhin
> Fix For: hive-14535
>
> Attachments: HIVE-14953.patch
>
>
> Need to investigate if recursive get is faster. Also, normal listStatus might 
> suffice because MM code handles directory structure in a more definite manner 
> than old code; so it knows where the files of interest are to be found.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14953) don't use globStatus on S3 in MM tables

2016-10-20 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593624#comment-15593624
 ] 

Hive QA commented on HIVE-14953:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12834579/HIVE-14953.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1709/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1709/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1709/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2016-10-21 01:29:29.983
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-1709/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2016-10-21 01:29:29.988
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 1da HIVE-14985 : Remove UDF-s created during test runs 
(Peter Vary, reviewed by Sergey Shelukhin)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 1da HIVE-14985 : Remove UDF-s created during test runs 
(Peter Vary, reviewed by Sergey Shelukhin)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2016-10-21 01:29:31.144
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java:3816
error: ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java: patch does 
not apply
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java:1705
error: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java: patch does not 
apply
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12834579 - PreCommit-HIVE-Build

> don't use globStatus on S3 in MM tables
> ---
>
> Key: HIVE-14953
> URL: https://issues.apache.org/jira/browse/HIVE-14953
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Rajesh Balamohan
>Assignee: Sergey Shelukhin
> Fix For: hive-14535
>
> Attachments: HIVE-14953.patch
>
>
> Need to investigate if recursive get is faster. Also, normal listStatus might 
> suffice because MM code handles directory structure in a more definite manner 
> than old code; so it knows where the files of interest are to be found.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14954) put FSOP manifests for the instances of the same vertex into a directory

2016-10-20 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593622#comment-15593622
 ] 

Hive QA commented on HIVE-14954:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12834578/HIVE-14954.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1708/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1708/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1708/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2016-10-21 01:28:51.218
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-1708/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2016-10-21 01:28:51.221
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 1da HIVE-14985 : Remove UDF-s created during test runs 
(Peter Vary, reviewed by Sergey Shelukhin)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 1da HIVE-14985 : Remove UDF-s created during test runs 
(Peter Vary, reviewed by Sergey Shelukhin)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2016-10-21 01:28:52.176
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: common/src/java/org/apache/hadoop/hive/common/ValidWriteIds.java: No 
such file or directory
error: 
metastore/src/java/org/apache/hadoop/hive/metastore/MmCleanerThread.java: No 
such file or directory
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java:149
error: ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java: patch does 
not apply
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/truncate/ColumnTruncateMapper.java:234
error: 
ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/truncate/ColumnTruncateMapper.java:
 patch does not apply
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java:1853
error: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java: patch does not 
apply
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/unionproc/UnionProcFactory.java:218
error: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/unionproc/UnionProcFactory.java:
 patch does not apply
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezProcContext.java:65
error: ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezProcContext.java: 
patch does not apply
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java:208
error: ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java: patch 
does not apply
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12834578 - PreCommit-HIVE-Build

> put FSOP manifests for the instances of the same vertex into a directory
> 
>
> Key: HIVE-14954
> URL: https://issues.apache.org/jira/browse/HIVE-14954
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Rajesh Balamohan
>Assignee: Sergey Shelukhin
> Fix For: hive-14535
>
> Attachments: HIVE-14954.patch
>
>
> Deleting 100s of manifests can be expensive.



--
This message was sent by Atlassian 

[jira] [Commented] (HIVE-14993) make WriteEntity distinguish writeType

2016-10-20 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593617#comment-15593617
 ] 

Hive QA commented on HIVE-14993:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12834576/HIVE-14993.8.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10564 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=132)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[current_date_timestamp]
 (batchId=144)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[0]
 (batchId=164)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[0] (batchId=164)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[1] (batchId=164)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1707/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1707/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1707/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12834576 - PreCommit-HIVE-Build

> make WriteEntity distinguish writeType
> --
>
> Key: HIVE-14993
> URL: https://issues.apache.org/jira/browse/HIVE-14993
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-14993.2.patch, HIVE-14993.3.patch, 
> HIVE-14993.4.patch, HIVE-14993.5.patch, HIVE-14993.6.patch, 
> HIVE-14993.7.patch, HIVE-14993.8.patch, HIVE-14993.patch, 
> debug.not2checkin.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14913) Add new unit tests

2016-10-20 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593607#comment-15593607
 ] 

Vineet Garg commented on HIVE-14913:


cc [~ashutoshc] - the latest patch is to fix pre-commit test failures

> Add new unit tests
> --
>
> Key: HIVE-14913
> URL: https://issues.apache.org/jira/browse/HIVE-14913
> Project: Hive
>  Issue Type: Task
>  Components: Tests
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Fix For: 2.2.0
>
> Attachments: HIVE-14913.1.patch, HIVE-14913.2.patch, 
> HIVE-14913.3.patch, HIVE-14913.4.patch, HIVE-14913.5.patch, 
> HIVE-14913.6.patch, HIVE-14913.7.patch, HIVE-14913.8.patch, HIVE-14913.9.patch
>
>
> Moving a bunch of tests from system tests to Hive unit tests to reduce testing 
> overhead



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14913) Add new unit tests

2016-10-20 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-14913:
---
Attachment: HIVE-14913.9.patch

> Add new unit tests
> --
>
> Key: HIVE-14913
> URL: https://issues.apache.org/jira/browse/HIVE-14913
> Project: Hive
>  Issue Type: Task
>  Components: Tests
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Fix For: 2.2.0
>
> Attachments: HIVE-14913.1.patch, HIVE-14913.2.patch, 
> HIVE-14913.3.patch, HIVE-14913.4.patch, HIVE-14913.5.patch, 
> HIVE-14913.6.patch, HIVE-14913.7.patch, HIVE-14913.8.patch, HIVE-14913.9.patch
>
>
> Moving a bunch of tests from system tests to Hive unit tests to reduce testing 
> overhead



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15027) make sure export takes MM information into account

2016-10-20 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593591#comment-15593591
 ] 

Sergey Shelukhin commented on HIVE-15027:
-

[~ekoifman] [~alangates] [~wzheng] Are there plans to support export table for 
full ACID?

> make sure export takes MM information into account
> --
>
> Key: HIVE-15027
> URL: https://issues.apache.org/jira/browse/HIVE-15027
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>
> Export currently copies the directories blindly; it should take MM 
> information into account.
> It's relatively easy to do in the CopyWork created from 
> ExportSemanticAnalyzer, but I will leave it undone for now in case there's 
> some better way to do it after ACID integration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14881) integrate MM tables into ACID: merge cleaner into ACID threads

2016-10-20 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-14881:
--
Component/s: Transactions

> integrate MM tables into ACID: merge cleaner into ACID threads 
> ---
>
> Key: HIVE-14881
> URL: https://issues.apache.org/jira/browse/HIVE-14881
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Sergey Shelukhin
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-14679) csv2/tsv2 output format disables quoting by default and it's difficult to enable

2016-10-20 Thread Jianguo Tian (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593556#comment-15593556
 ] 

Jianguo Tian edited comment on HIVE-14679 at 10/21/16 1:05 AM:
---

I have updated the latest patch on the [Review 
Board|https://reviews.apache.org/r/52981/]. [~brocknoland], [~kennethmac2000], 
[~ngangam], could you please help review this latest patch? Looking forward 
to your feedback. Thanks a lot!


was (Author: jonnyr):
I have updated latest patch on the [Review 
Board|https://reviews.apache.org/r/52981/], [~brocknoland], [~kennethmac2000], 
[~ngangam], could you please help me review this latest patch? Looking forward 
to your precious opinion. Thanks a lot!

> csv2/tsv2 output format disables quoting by default and it's difficult to 
> enable
> 
>
> Key: HIVE-14679
> URL: https://issues.apache.org/jira/browse/HIVE-14679
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Jianguo Tian
> Attachments: HIVE-14769.1.patch
>
>
> Over in HIVE-9788 we made quoting optional for csv2/tsv2.
> However I see the following issues:
> * JIRA doc doesn't mention it's disabled by default, this should be there an 
> in the output of beeline help.
> * The JIRA says the property is {{--disableQuotingForSV}} but it's actually a 
> system property. We should not use a system property as it's non-standard so 
> extremely hard for users to set. For example I must do: {{env 
> HADOOP_CLIENT_OPTS="-Ddisable.quoting.for.sv=false" beeline ...}}
> * The arg {{--disableQuotingForSV}} should be documented in beeline help.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-14679) csv2/tsv2 output format disables quoting by default and it's difficult to enable

2016-10-20 Thread Jianguo Tian (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593556#comment-15593556
 ] 

Jianguo Tian edited comment on HIVE-14679 at 10/21/16 1:05 AM:
---

I have updated the latest patch on the [Review 
Board|https://reviews.apache.org/r/52981/]. [~brocknoland], [~kennethmac2000], 
[~ngangam], could you please help review this latest patch? Looking forward 
to your feedback. Thanks a lot!


was (Author: jonnyr):
I have updated latest patch on the Review Board, [~brocknoland], 
[~kennethmac2000], [~ngangam], could you please help me review this latest 
patch? Looking forward to your precious opinion. Thanks a lot!

> csv2/tsv2 output format disables quoting by default and it's difficult to 
> enable
> 
>
> Key: HIVE-14679
> URL: https://issues.apache.org/jira/browse/HIVE-14679
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Jianguo Tian
> Attachments: HIVE-14769.1.patch
>
>
> Over in HIVE-9788 we made quoting optional for csv2/tsv2.
> However I see the following issues:
> * JIRA doc doesn't mention it's disabled by default, this should be there an 
> in the output of beeline help.
> * The JIRA says the property is {{--disableQuotingForSV}} but it's actually a 
> system property. We should not use a system property as it's non-standard so 
> extremely hard for users to set. For example I must do: {{env 
> HADOOP_CLIENT_OPTS="-Ddisable.quoting.for.sv=false" beeline ...}}
> * The arg {{--disableQuotingForSV}} should be documented in beeline help.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14979) Removing stale Zookeeper locks at HiveServer2 initialization

2016-10-20 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593555#comment-15593555
 ] 

Sergey Shelukhin commented on HIVE-14979:
-

The ZK session timeout does seem excessive... not sure why it's like that. 
[~thejas] [~vgumashta] can you comment?
In a default config, ZK probably won't even allow such a long timeout.

Reading the ZK docs, it does seem like the session timeout would allow the locks 
to ride over a disconnection. I wonder how e.g. LLAP registry updates have 
worked so far despite this timeout value.
I think we should reduce the timeout to ~3 mins and commit this patch unless 
[~thejas] [~vgumashta] object.
Also, I wonder if we need to account for multi-HS2 scenarios at all.

> Removing stale Zookeeper locks at HiveServer2 initialization
> 
>
> Key: HIVE-14979
> URL: https://issues.apache.org/jira/browse/HIVE-14979
> Project: Hive
>  Issue Type: Improvement
>  Components: Locking
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-14979.3.patch, HIVE-14979.4.patch, HIVE-14979.patch
>
>
> HiveServer2 can use Zookeeper to store tokens that indicate that particular 
> tables are locked, via the creation of persistent Zookeeper objects. 
> A problem can occur when a HiveServer2 instance creates a lock on a table and 
> then crashes ("Out of Memory", for example), so the locks 
> are not released in Zookeeper. Such a lock will then remain until it is 
> manually cleared by an admin.
> There should be a way to remove stale locks at HiveServer2 initialization, 
> making the admin's life easier.
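A rough sketch of the cleanup idea using the plain ZooKeeper client (the lock 
root path and the delete-everything policy are assumptions for illustration; 
the actual patch would have to decide which locks are provably stale):

{code:java}
import java.util.List;

import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class StaleLockCleaner {
  public static void main(String[] args) throws Exception {
    // Hypothetical lock parent; Hive's real lock namespace is configurable.
    String lockRoot = "/hive_locks";
    ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, new Watcher() {
      @Override public void process(WatchedEvent event) { /* no-op */ }
    });
    try {
      List<String> children = zk.getChildren(lockRoot, false);
      for (String child : children) {
        // In the real feature the server would only remove locks left behind
        // by a crashed HS2 instance; this sketch removes every flat lock node.
        zk.delete(lockRoot + "/" + child, -1); // -1 matches any node version
      }
    } finally {
      zk.close();
    }
  }
}
{code}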



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14679) csv2/tsv2 output format disables quoting by default and it's difficult to enable

2016-10-20 Thread Jianguo Tian (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593556#comment-15593556
 ] 

Jianguo Tian commented on HIVE-14679:
-

I have updated latest patch on the Review Board, [~brocknoland], 
[~kennethmac2000], [~ngangam], could you please help me review this latest 
patch? Looking forward to your precious opinion. Thanks a lot!

> csv2/tsv2 output format disables quoting by default and it's difficult to 
> enable
> 
>
> Key: HIVE-14679
> URL: https://issues.apache.org/jira/browse/HIVE-14679
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Jianguo Tian
> Attachments: HIVE-14769.1.patch
>
>
> Over in HIVE-9788 we made quoting optional for csv2/tsv2.
> However I see the following issues:
> * JIRA doc doesn't mention it's disabled by default, this should be there an 
> in the output of beeline help.
> * The JIRA says the property is {{--disableQuotingForSV}} but it's actually a 
> system property. We should not use a system property as it's non-standard so 
> extremely hard for users to set. For example I must do: {{env 
> HADOOP_CLIENT_OPTS="-Ddisable.quoting.for.sv=false" beeline ...}}
> * The arg {{--disableQuotingForSV}} should be documented in beeline help.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (HIVE-14679) csv2/tsv2 output format disables quoting by default and it's difficult to enable

2016-10-20 Thread Jianguo Tian (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jianguo Tian updated HIVE-14679:

Comment: was deleted

(was: I have fixed this issue, you can check the code as below:
{code:borderStyle=solid}
unquotedCsvPreference = new CsvPreference.Builder('\u0020', separator, 
"").surroundingSpacesNeedQuotes(true).build();
{code}
And according to the API of *CsvPreference.Builder*, method 
*surroundingSpacesNeedQuotes*'s parameter is "indicating whether spaces at the 
beginning or end of a cell should be ignored if they're not surrounded by 
quotes".
)

> csv2/tsv2 output format disables quoting by default and it's difficult to 
> enable
> 
>
> Key: HIVE-14679
> URL: https://issues.apache.org/jira/browse/HIVE-14679
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Jianguo Tian
> Attachments: HIVE-14769.1.patch
>
>
> Over in HIVE-9788 we made quoting optional for csv2/tsv2.
> However I see the following issues:
> * JIRA doc doesn't mention it's disabled by default, this should be there an 
> in the output of beeline help.
> * The JIRA says the property is {{--disableQuotingForSV}} but it's actually a 
> system property. We should not use a system property as it's non-standard so 
> extremely hard for users to set. For example I must do: {{env 
> HADOOP_CLIENT_OPTS="-Ddisable.quoting.for.sv=false" beeline ...}}
> * The arg {{--disableQuotingForSV}} should be documented in beeline help.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-7483) hive insert overwrite table select from self dead lock

2016-10-20 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-7483:


Assignee: Eugene Koifman

> hive insert overwrite table select from self dead lock
> --
>
> Key: HIVE-7483
> URL: https://issues.apache.org/jira/browse/HIVE-7483
> Project: Hive
>  Issue Type: Bug
>  Components: Locking
>Affects Versions: 0.13.1
>Reporter: Xiaoyu Wang
>Assignee: Eugene Koifman
>
> CREATE TABLE test(
>   id int, 
>   msg string)
> PARTITIONED BY ( 
>   continent string, 
>   country string)
> CLUSTERED BY (id) 
> INTO 10 BUCKETS
> STORED AS ORC;
> alter table test add partition(continent='Asia',country='India');
> in hive-site.xml:
> hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> hive.support.concurrency=true;
> in hive shell:
> set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
> Insert some records into the test table first.
> then execute sql:
> insert overwrite table test partition(continent='Asia',country='India') 
> select id,msg from test;
> the log stops at:
> INFO log.PerfLogger:  from=org.apache.hadoop.hive.ql.Driver>
> I think it deadlocks when doing insert overwrite on a table from itself.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-7483) hive insert overwrite table select from self dead lock

2016-10-20 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman resolved HIVE-7483.
--
Resolution: Duplicate

> hive insert overwrite table select from self dead lock
> --
>
> Key: HIVE-7483
> URL: https://issues.apache.org/jira/browse/HIVE-7483
> Project: Hive
>  Issue Type: Bug
>  Components: Locking
>Affects Versions: 0.13.1
>Reporter: Xiaoyu Wang
>Assignee: Eugene Koifman
>
> CREATE TABLE test(
>   id int, 
>   msg string)
> PARTITIONED BY ( 
>   continent string, 
>   country string)
> CLUSTERED BY (id) 
> INTO 10 BUCKETS
> STORED AS ORC;
> alter table test add partition(continent='Asia',country='India');
> in hive-site.xml:
> hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> hive.support.concurrency=true;
> in hive shell:
> set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
> Insert some records into the test table first.
> then execute sql:
> insert overwrite table test partition(continent='Asia',country='India') 
> select id,msg from test;
> the log stops at:
> INFO log.PerfLogger:  from=org.apache.hadoop.hive.ql.Driver>
> I think it deadlocks when doing insert overwrite on a table from itself.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14953) don't use globStatus on S3 in MM tables

2016-10-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14953:

Attachment: HIVE-14953.patch

Small patch.
[~rajesh.balamohan] is the insert path the one where you wanted to avoid 
globStatus? I added listStatus in the simple case where there's no recursion.
However, it seems like any recursion (DP or LB) would result in a large number 
of listStatus calls, one for each directory and then each subdirectory, etc. 
Are you sure it's better than globStatus?

> don't use globStatus on S3 in MM tables
> ---
>
> Key: HIVE-14953
> URL: https://issues.apache.org/jira/browse/HIVE-14953
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Rajesh Balamohan
>Assignee: Sergey Shelukhin
> Fix For: hive-14535
>
> Attachments: HIVE-14953.patch
>
>
> Need to investigate if recursive get is faster. Also, normal listStatus might 
> suffice because MM code handles directory structure in a more definite manner 
> than old code; so it knows where the files of interest are to be found.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14953) don't use globStatus on S3 in MM tables

2016-10-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14953:

Fix Version/s: hive-14535
   Status: Patch Available  (was: Open)

> don't use globStatus on S3 in MM tables
> ---
>
> Key: HIVE-14953
> URL: https://issues.apache.org/jira/browse/HIVE-14953
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Rajesh Balamohan
>Assignee: Sergey Shelukhin
> Fix For: hive-14535
>
> Attachments: HIVE-14953.patch
>
>
> Need to investigate if recursive get is faster. Also, normal listStatus might 
> suffice because MM code handles directory structure in a more definite manner 
> than old code; so it knows where the files of interest are to be found.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-14953) don't use globStatus on S3 in MM tables

2016-10-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-14953:
---

Assignee: Sergey Shelukhin

> don't use globStatus on S3 in MM tables
> ---
>
> Key: HIVE-14953
> URL: https://issues.apache.org/jira/browse/HIVE-14953
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Rajesh Balamohan
>Assignee: Sergey Shelukhin
>
> Need to investigate if recursive get is faster. Also, normal listStatus might 
> suffice because MM code handles directory structure in a more definite manner 
> than old code; so it knows where the files of interest are to be found.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7817) distinct/group by don't work on partition columns

2016-10-20 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7817:
-
Priority: Critical  (was: Major)

> distinct/group by don't work on partition columns
> -
>
> Key: HIVE-7817
> URL: https://issues.apache.org/jira/browse/HIVE-7817
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.14.0
>Reporter: Eugene Koifman
>Priority: Critical
>
> suppose you have a table like this:
> {code:sql}
> CREATE TABLE page_view(
>viewTime INT,
>userid BIGINT,
> page_url STRING,
> referrer_url STRING,
> ip STRING COMMENT 'IP Address of the User')
> COMMENT 'This is the page view table'
> PARTITIONED BY(dt STRING, country STRING)
> CLUSTERED BY(userid) INTO 4 BUCKETS
> {code}
> Then 
> {code:sql}
> select distinct dt from page_view;
> select distinct dt, country from page_view;
> select dt, country from page_view group by dt, country;
> {code}
> all fail with
> {noformat}
> Query ID = ekoifman_20140820172626_b03ba819-c111-433f-a3fc-453c7d5a3e86
> Total jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks not specified. Estimated from input data size: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=
> In order to set a constant number of reducers:
>   set mapreduce.job.reduces=
> Job running in-process (local Hadoop)
> Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
> 2014-08-20 17:26:13,018 Stage-1 map = 0%,  reduce = 0%
> Ended Job = job_local165359429_0013 with errors
> Error during job, obtaining debugging information...
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> MapReduce Jobs Launched: 
> Stage-Stage-1:  HDFS Read: 0 HDFS Write: 0 FAIL
> Total MapReduce CPU Time Spent: 0 msec
> {noformat}
> but 
> {code:sql}
> select dt, country, count(*) from page_view group by dt, country;
> {code}
> works fine.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11599) Add metastore command to dump it's configs

2016-10-20 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593512#comment-15593512
 ] 

Eugene Koifman commented on HIVE-11599:
---

HIVE-14038 made some progress in this direction

> Add metastore command to dump it's configs
> --
>
> Key: HIVE-11599
> URL: https://issues.apache.org/jira/browse/HIVE-11599
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Metastore
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>
> We should have an equivalent of the Hive CLI "set" command on the Metastore 
> (and likely HS2) which can dump out all the properties this particular process 
> is running with.
> cc [~thejas]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14954) put FSOP manifests for the instances of the same vertex into a directory

2016-10-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14954:

Attachment: HIVE-14954.patch

The patch.

> put FSOP manifests for the instances of the same vertex into a directory
> 
>
> Key: HIVE-14954
> URL: https://issues.apache.org/jira/browse/HIVE-14954
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Rajesh Balamohan
>Assignee: Sergey Shelukhin
> Fix For: hive-14535
>
> Attachments: HIVE-14954.patch
>
>
> Deleting 100s of manifests can be expensive.
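A sketch of why the directory layout helps (class and method names are invented 
for illustration; the FileSystem calls are real): deleting manifests one by one 
costs a round trip per file, while a per-vertex directory can be removed with a 
single recursive delete at the FileSystem API level.

{code:java}
import java.io.IOException;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class ManifestCleanupSketch {
  // Deleting N manifest files individually costs N delete round trips.
  static void deleteIndividually(FileSystem fs, Path manifestDir) throws IOException {
    for (FileStatus f : fs.listStatus(manifestDir)) {
      fs.delete(f.getPath(), false);
    }
  }

  // With all manifests of a vertex grouped under one directory, cleanup is a
  // single recursive delete call.
  static void deleteAsDirectory(FileSystem fs, Path manifestDir) throws IOException {
    fs.delete(manifestDir, true /* recursive */);
  }
}
{code}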



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14954) put FSOP manifests for the instances of the same vertex into a directory

2016-10-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14954:

Fix Version/s: hive-14535

> put FSOP manifests for the instances of the same vertex into a directory
> 
>
> Key: HIVE-14954
> URL: https://issues.apache.org/jira/browse/HIVE-14954
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Rajesh Balamohan
>Assignee: Sergey Shelukhin
> Fix For: hive-14535
>
> Attachments: HIVE-14954.patch
>
>
> Deleting 100s of manifests can be expensive.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14954) put FSOP manifests for the instances of the same vertex into a directory

2016-10-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14954:

Status: Patch Available  (was: Open)

> put FSOP manifests for the instances of the same vertex into a directory
> 
>
> Key: HIVE-14954
> URL: https://issues.apache.org/jira/browse/HIVE-14954
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Rajesh Balamohan
>Assignee: Sergey Shelukhin
> Fix For: hive-14535
>
> Attachments: HIVE-14954.patch
>
>
> Deleting 100s of manifests can be expensive.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13980) create table as select should acquire X lock on target table

2016-10-20 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593494#comment-15593494
 ] 

Eugene Koifman commented on HIVE-13980:
---

cc [~sershe] - maybe CTAS does work

> create table as select should acquire X lock on target table
> 
>
> Key: HIVE-13980
> URL: https://issues.apache.org/jira/browse/HIVE-13980
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>
> hive> create table test.dummy as select * from oraclehadoop.dummy;
> This acquires SHARED_READ on oraclehadoop.dummy table and SHARED_READ on 
> _test_ database.
> The effect is that you can drop _test.dummy_ from another session while the 
> insert is still in progress.
> This operation is a bit odd in that it combines a DDL operation which is not 
> transactional with a DML operation which is.
> If it were to fail in the middle, the target table would remain.  This can't 
> be fixed easily but we should get an X lock on _test.dummy_.
> The workaround is to split this into 2 commands
> 1. create table
> 2. perform insert
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14993) make WriteEntity distinguish writeType

2016-10-20 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-14993:
--
Attachment: HIVE-14993.8.patch

> make WriteEntity distinguish writeType
> --
>
> Key: HIVE-14993
> URL: https://issues.apache.org/jira/browse/HIVE-14993
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-14993.2.patch, HIVE-14993.3.patch, 
> HIVE-14993.4.patch, HIVE-14993.5.patch, HIVE-14993.6.patch, 
> HIVE-14993.7.patch, HIVE-14993.8.patch, HIVE-14993.patch, 
> debug.not2checkin.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14803) S3: Stats gathering for insert queries can be expensive for partitioned dataset

2016-10-20 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-14803:

Status: Open  (was: Patch Available)

> S3: Stats gathering for insert queries can be expensive for partitioned 
> dataset
> ---
>
> Key: HIVE-14803
> URL: https://issues.apache.org/jira/browse/HIVE-14803
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 2.1.0
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-14803.1.patch, HIVE-14803.2.patch
>
>
> StatsTask's aggregateStats populates stats details for all partitions by 
> checking the file sizes, which turns out to be expensive when a large number 
> of partitions are inserted. 
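Not the actual StatsTask code, but an illustration of the per-partition 
file-size pattern described above (getContentSummary is a real FileSystem call; 
everything else is made up for the sketch). On S3, each per-partition call 
expands into listing requests, so the cost grows with the partition count:

{code:java}
import java.io.IOException;
import java.util.List;

import org.apache.hadoop.fs.ContentSummary;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class PartitionStatsSketch {
  // One getContentSummary() call per partition directory.
  static long totalRawDataSize(FileSystem fs, List<Path> partitionDirs)
      throws IOException {
    long total = 0;
    for (Path dir : partitionDirs) {
      ContentSummary summary = fs.getContentSummary(dir);
      total += summary.getLength();
    }
    return total;
  }
}
{code}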



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14993) make WriteEntity distinguish writeType

2016-10-20 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593388#comment-15593388
 ] 

Hive QA commented on HIVE-14993:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12834571/debug.not2checkin.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1706/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1706/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1706/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2016-10-20 23:45:11.409
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-1706/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2016-10-20 23:45:11.412
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   a057e12..b60bbc2  hive-14535 -> origin/hive-14535
+ git reset --hard HEAD
HEAD is now at 50f7539 HIVE-11832: HIVE-11802 breaks compilation in JDK 8
+ git clean -f -d
Removing druid-handler/
Removing hplsql/
Removing itests/custom-udfs/
Removing itests/hive-blobstore/
Removing llap-client/
Removing llap-common/
Removing llap-ext-client/
Removing llap-server/
Removing llap-tez/
Removing orc/
Removing service-rpc/
Removing storage-api/
+ git checkout master
Switched to branch 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 1da HIVE-14985 : Remove UDF-s created during test runs 
(Peter Vary, reviewed by Sergey Shelukhin)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2016-10-20 23:45:15.285
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/hooks/Entity.java:159
error: ql/src/java/org/apache/hadoop/hive/ql/hooks/Entity.java: patch does not 
apply
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12834571 - PreCommit-HIVE-Build

> make WriteEntity distinguish writeType
> --
>
> Key: HIVE-14993
> URL: https://issues.apache.org/jira/browse/HIVE-14993
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-14993.2.patch, HIVE-14993.3.patch, 
> HIVE-14993.4.patch, HIVE-14993.5.patch, HIVE-14993.6.patch, 
> HIVE-14993.7.patch, HIVE-14993.patch, debug.not2checkin.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14993) make WriteEntity distinguish writeType

2016-10-20 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-14993:
--
Attachment: debug.not2checkin.patch
HIVE-14993.7.patch

debug.not2checkin.patch has some convenient hacks to find where duplicate 
WriteEntity objects are created



> make WriteEntity distinguish writeType
> --
>
> Key: HIVE-14993
> URL: https://issues.apache.org/jira/browse/HIVE-14993
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-14993.2.patch, HIVE-14993.3.patch, 
> HIVE-14993.4.patch, HIVE-14993.5.patch, HIVE-14993.6.patch, 
> HIVE-14993.7.patch, HIVE-14993.patch, debug.not2checkin.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14543) Create Druid table without specifying data source

2016-10-20 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593340#comment-15593340
 ] 

slim bouguerra commented on HIVE-14543:
---

How can we get the Hive table name?
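
One possible route, sketched below: the storage handler's metastore hook already 
receives the Table object, which carries the Hive table name. The property key 
"druid.datasource" and the defaulting rule here are assumptions for illustration, 
not the committed design.

{code:java}
import org.apache.hadoop.hive.metastore.HiveMetaHook;
import org.apache.hadoop.hive.metastore.api.MetaException;
import org.apache.hadoop.hive.metastore.api.Table;

// Sketch: default the Druid datasource to the Hive table name when
// TBLPROPERTIES does not specify one. The key name is an assumption.
public class DruidDefaultDatasourceHook implements HiveMetaHook {
  private static final String DATASOURCE = "druid.datasource";

  @Override
  public void preCreateTable(Table table) throws MetaException {
    if (table.getParameters().get(DATASOURCE) == null) {
      // Fall back to the Hive table name (could also be db-qualified).
      table.getParameters().put(DATASOURCE, table.getTableName());
    }
  }

  @Override public void rollbackCreateTable(Table table) throws MetaException {}
  @Override public void commitCreateTable(Table table) throws MetaException {}
  @Override public void preDropTable(Table table) throws MetaException {}
  @Override public void rollbackDropTable(Table table) throws MetaException {}
  @Override public void commitDropTable(Table table, boolean deleteData)
      throws MetaException {}
}
{code}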

> Create Druid table without specifying data source
> -
>
> Key: HIVE-14543
> URL: https://issues.apache.org/jira/browse/HIVE-14543
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>
> We should be able to omit the Druid datasource from the TBLPROPERTIES. In 
> that case, the Druid datasource name should match the Hive table name.
> {code:sql}
> CREATE TABLE druid_table_1
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler';
> TBLPROPERTIES ("druid.address" = "localhost");
> {code}
> For instance, the statement above creates a table that references the Druid 
> datasource "druid_table_1".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14881) integrate MM tables into ACID: merge cleaner into ACID threads

2016-10-20 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593329#comment-15593329
 ] 

Sergey Shelukhin commented on HIVE-14881:
-

This will also need to account for list bucketing (the current ACID cleaner 
probably doesn't, because LB doesn't work with bucketing, and thus not with ACID).
List bucketing looks like partition directories nested inside partition 
directories, for a subset of keys.
MM currently puts MM dirs inside the LB dirs (e.g. 
mytable/pkey1=535/pkey2=foo/lbkey=555/mm_0).

If it looks inside delta directories for MM (for whatever reason), it should 
also be aware that subdirectories can exist there in some cases (at least, 
union inserts produce a subdirectory for each side of the union; overall, it 
seems to be legal in Hive to have an arbitrary directory structure inside a 
table/partition).
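
For illustration, a minimal sketch of a nesting-aware scan using the Hadoop 
FileSystem API; the local FS and the paths stand in for the real cleaner code:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

// A single-level listing of the partition dir would miss the nested
// LB/MM files; a recursive listing visits them at any depth.
public class NestedDirListing {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.getLocal(new Configuration());
    Path part = new Path("/tmp/mytable/pkey1=535/pkey2=foo");
    fs.create(new Path(part, "lbkey=555/mm_0/000000_0")).close();
    RemoteIterator<LocatedFileStatus> it = fs.listFiles(part, true /* recursive */);
    while (it.hasNext()) {
      System.out.println(it.next().getPath()); // reaches .../lbkey=555/mm_0/000000_0
    }
  }
}
{code}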

> integrate MM tables into ACID: merge cleaner into ACID threads 
> ---
>
> Key: HIVE-14881
> URL: https://issues.apache.org/jira/browse/HIVE-14881
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15007) Hive 1.2.2 release planning

2016-10-20 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593273#comment-15593273
 ] 

Hive QA commented on HIVE-15007:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12834540/HIVE-15007-branch-1.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 197 failed/errored test(s), 7897 tests 
executed
*Failed tests:*
{noformat}
TestAdminUser - did not produce a TEST-*.xml file (likely timed out) 
(batchId=347)
TestAuthorizationPreEventListener - did not produce a TEST-*.xml file (likely 
timed out) (batchId=378)
TestAuthzApiEmbedAuthorizerInEmbed - did not produce a TEST-*.xml file (likely 
timed out) (batchId=357)
TestAuthzApiEmbedAuthorizerInRemote - did not produce a TEST-*.xml file (likely 
timed out) (batchId=363)
TestBeeLineWithArgs - did not produce a TEST-*.xml file (likely timed out) 
(batchId=385)
TestBitFieldReader - did not produce a TEST-*.xml file (likely timed out) 
(batchId=467)
TestBitPack - did not produce a TEST-*.xml file (likely timed out) (batchId=463)
TestBuddyAllocator - did not produce a TEST-*.xml file (likely timed out) 
(batchId=777)
TestCLIAuthzSessionContext - did not produce a TEST-*.xml file (likely timed 
out) (batchId=401)
TestClientSideAuthorizationProvider - did not produce a TEST-*.xml file (likely 
timed out) (batchId=377)
TestColumnStatistics - did not produce a TEST-*.xml file (likely timed out) 
(batchId=443)
TestColumnStatisticsImpl - did not produce a TEST-*.xml file (likely timed out) 
(batchId=457)
TestCompactor - did not produce a TEST-*.xml file (likely timed out) 
(batchId=367)
TestConverters - did not produce a TEST-*.xml file (likely timed out) 
(batchId=723)
TestCreateUdfEntities - did not produce a TEST-*.xml file (likely timed out) 
(batchId=366)
TestCustomAuthentication - did not produce a TEST-*.xml file (likely timed out) 
(batchId=386)
TestDBTokenStore - did not produce a TEST-*.xml file (likely timed out) 
(batchId=332)
TestDDLWithRemoteMetastoreSecondNamenode - did not produce a TEST-*.xml file 
(likely timed out) (batchId=365)
TestDataReaderProperties - did not produce a TEST-*.xml file (likely timed out) 
(batchId=461)
TestDruidSerDe - did not produce a TEST-*.xml file (likely timed out) 
(batchId=441)
TestDynamicArray - did not produce a TEST-*.xml file (likely timed out) 
(batchId=459)
TestDynamicSerDe - did not produce a TEST-*.xml file (likely timed out) 
(batchId=335)
TestEmbeddedHiveMetaStore - did not produce a TEST-*.xml file (likely timed 
out) (batchId=344)
TestEmbeddedThriftBinaryCLIService - did not produce a TEST-*.xml file (likely 
timed out) (batchId=389)
TestFileDump - did not produce a TEST-*.xml file (likely timed out) 
(batchId=445)
TestFilterHooks - did not produce a TEST-*.xml file (likely timed out) 
(batchId=339)
TestFirstInFirstOutComparator - did not produce a TEST-*.xml file (likely timed 
out) (batchId=786)
TestFolderPermissions - did not produce a TEST-*.xml file (likely timed out) 
(batchId=372)
TestHS2AuthzContext - did not produce a TEST-*.xml file (likely timed out) 
(batchId=404)
TestHS2AuthzSessionContext - did not produce a TEST-*.xml file (likely timed 
out) (batchId=405)
TestHS2ImpersonationWithRemoteMS - did not produce a TEST-*.xml file (likely 
timed out) (batchId=393)
TestHiveAuthorizerCheckInvocation - did not produce a TEST-*.xml file (likely 
timed out) (batchId=381)
TestHiveAuthorizerShowFilters - did not produce a TEST-*.xml file (likely timed 
out) (batchId=380)
TestHiveDruidQueryBasedInputFormat - did not produce a TEST-*.xml file (likely 
timed out) (batchId=440)
TestHiveHistory - did not produce a TEST-*.xml file (likely timed out) 
(batchId=383)
TestHiveMetaStoreTxns - did not produce a TEST-*.xml file (likely timed out) 
(batchId=359)
TestHiveMetaStoreWithEnvironmentContext - did not produce a TEST-*.xml file 
(likely timed out) (batchId=349)
TestHiveMetaTool - did not produce a TEST-*.xml file (likely timed out) 
(batchId=362)
TestHiveServer2 - did not produce a TEST-*.xml file (likely timed out) 
(batchId=407)
TestHiveServer2SessionTimeout - did not produce a TEST-*.xml file (likely timed 
out) (batchId=408)
TestHiveSessionImpl - did not produce a TEST-*.xml file (likely timed out) 
(batchId=390)
TestHplsqlLocal - did not produce a TEST-*.xml file (likely timed out) 
(batchId=475)
TestHplsqlOffline - did not produce a TEST-*.xml file (likely timed out) 
(batchId=474)
TestHs2Hooks - did not produce a TEST-*.xml file (likely timed out) 
(batchId=364)
TestHs2HooksWithMiniKdc - did not produce a TEST-*.xml file (likely timed out) 
(batchId=436)
TestInStream - did not produce a TEST-*.xml file (likely timed out) 
(batchId=453)
TestIncrementalObjectSizeEstimator - did not produce a TEST-*.xml file (likely 
timed out) (batchId=779)
TestIntegerCompressionReader - did not produce a TEST-*.xml file (likely timed 
out) 

[jira] [Assigned] (HIVE-14954) put FSOP manifests for the instances of the same vertex into a directory

2016-10-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-14954:
---

Assignee: Sergey Shelukhin

> put FSOP manifests for the instances of the same vertex into a directory
> 
>
> Key: HIVE-14954
> URL: https://issues.apache.org/jira/browse/HIVE-14954
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Rajesh Balamohan
>Assignee: Sergey Shelukhin
>
> Deleting 100s of manifests can be expensive.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15024) LLAP: ClassCastException: org.apache.hadoop.hive.common.io.DiskRangeList cannot be cast to org.apache.orc.impl.BufferChunk

2016-10-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15024:

Status: Patch Available  (was: Open)

> LLAP: ClassCastException: org.apache.hadoop.hive.common.io.DiskRangeList 
> cannot be cast to org.apache.orc.impl.BufferChunk
> --
>
> Key: HIVE-15024
> URL: https://issues.apache.org/jira/browse/HIVE-15024
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
>Priority: Critical
> Attachments: HIVE-15024.patch
>
>
> {code}
> Caused by: java.io.IOException: java.lang.ClassCastException: 
> org.apache.hadoop.hive.common.io.DiskRangeList cannot be cast to 
> org.apache.orc.impl.BufferChunk
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.rethrowErrorIfAny(LlapInputFormat.java:383)
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.nextCvb(LlapInputFormat.java:338)
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.next(LlapInputFormat.java:278)
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.next(LlapInputFormat.java:167)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:350)
> ... 23 more
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.common.io.DiskRangeList cannot be cast to 
> org.apache.orc.impl.BufferChunk
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.prepareRangesForCompressedRead(EncodedReaderImpl.java:728)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedStream(EncodedReaderImpl.java:616)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:397)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:424)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:227)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:224)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:224)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:93)
> ... 6 more
> 2016-10-20T00:48:45,354 WARN  [TezTaskRunner 
> (1475017598908_0410_15_00_20_0)] 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask: Ignoring exception when 
> closing input part(cleanup). Exception class
> =java.io.IOException, message=java.lang.ClassCastException: 
> org.apache.hadoop.hive.common.io.DiskRangeList cannot be cast to 
> org.apache.orc.impl.BufferChunk
> 2016-10-20T00:48:45,416 WARN  [TaskHeartbeatThread 
> (1475017598908_0410_15_00_20_0)] 
> org.apache.hadoop.hive.llap.daemon.impl.LlapTaskReporter: Exiting 
> TaskReporter thread with pending queue size=2
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15024) LLAP: ClassCastException: org.apache.hadoop.hive.common.io.DiskRangeList cannot be cast to org.apache.orc.impl.BufferChunk

2016-10-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15024:

Attachment: HIVE-15024.patch

Improved logging patch. Does this make sense?

> LLAP: ClassCastException: org.apache.hadoop.hive.common.io.DiskRangeList 
> cannot be cast to org.apache.orc.impl.BufferChunk
> --
>
> Key: HIVE-15024
> URL: https://issues.apache.org/jira/browse/HIVE-15024
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
>Priority: Critical
> Attachments: HIVE-15024.patch
>
>
> {code}
> Caused by: java.io.IOException: java.lang.ClassCastException: 
> org.apache.hadoop.hive.common.io.DiskRangeList cannot be cast to 
> org.apache.orc.impl.BufferChunk
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.rethrowErrorIfAny(LlapInputFormat.java:383)
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.nextCvb(LlapInputFormat.java:338)
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.next(LlapInputFormat.java:278)
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.next(LlapInputFormat.java:167)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:350)
> ... 23 more
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.common.io.DiskRangeList cannot be cast to 
> org.apache.orc.impl.BufferChunk
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.prepareRangesForCompressedRead(EncodedReaderImpl.java:728)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedStream(EncodedReaderImpl.java:616)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:397)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:424)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:227)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:224)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:224)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:93)
> ... 6 more
> 2016-10-20T00:48:45,354 WARN  [TezTaskRunner 
> (1475017598908_0410_15_00_20_0)] 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask: Ignoring exception when 
> closing input part(cleanup). Exception class
> =java.io.IOException, message=java.lang.ClassCastException: 
> org.apache.hadoop.hive.common.io.DiskRangeList cannot be cast to 
> org.apache.orc.impl.BufferChunk
> 2016-10-20T00:48:45,416 WARN  [TaskHeartbeatThread 
> (1475017598908_0410_15_00_20_0)] 
> org.apache.hadoop.hive.llap.daemon.impl.LlapTaskReporter: Exiting 
> TaskReporter thread with pending queue size=2
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-15024) LLAP: ClassCastException: org.apache.hadoop.hive.common.io.DiskRangeList cannot be cast to org.apache.orc.impl.BufferChunk

2016-10-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-15024:
---

Assignee: Sergey Shelukhin

> LLAP: ClassCastException: org.apache.hadoop.hive.common.io.DiskRangeList 
> cannot be cast to org.apache.orc.impl.BufferChunk
> --
>
> Key: HIVE-15024
> URL: https://issues.apache.org/jira/browse/HIVE-15024
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
>Priority: Critical
>
> {code}
> Caused by: java.io.IOException: java.lang.ClassCastException: 
> org.apache.hadoop.hive.common.io.DiskRangeList cannot be cast to 
> org.apache.orc.impl.BufferChunk
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.rethrowErrorIfAny(LlapInputFormat.java:383)
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.nextCvb(LlapInputFormat.java:338)
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.next(LlapInputFormat.java:278)
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.next(LlapInputFormat.java:167)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:350)
> ... 23 more
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.common.io.DiskRangeList cannot be cast to 
> org.apache.orc.impl.BufferChunk
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.prepareRangesForCompressedRead(EncodedReaderImpl.java:728)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedStream(EncodedReaderImpl.java:616)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:397)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:424)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:227)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:224)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:224)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:93)
> ... 6 more
> 2016-10-20T00:48:45,354 WARN  [TezTaskRunner 
> (1475017598908_0410_15_00_20_0)] 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask: Ignoring exception when 
> closing input part(cleanup). Exception class
> =java.io.IOException, message=java.lang.ClassCastException: 
> org.apache.hadoop.hive.common.io.DiskRangeList cannot be cast to 
> org.apache.orc.impl.BufferChunk
> 2016-10-20T00:48:45,416 WARN  [TaskHeartbeatThread 
> (1475017598908_0410_15_00_20_0)] 
> org.apache.hadoop.hive.llap.daemon.impl.LlapTaskReporter: Exiting 
> TaskReporter thread with pending queue size=2
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14990) run all tests for MM tables and fix the issues that are found

2016-10-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14990:

Attachment: HIVE-14990.patch

The aggregate patch of the current branch (mostly generated code) with 
isMmTable changed to return true.

I expect a lot of things won't work, even after I fix an NPE that will cause 
90% of the tests to fail on the first run (but not the tests I've run 
locally).

Some of these would be known unsupported cases, and some will be real issues.

> run all tests for MM tables and fix the issues that are found
> -
>
> Key: HIVE-14990
> URL: https://issues.apache.org/jira/browse/HIVE-14990
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
> Attachments: HIVE-14990.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14990) run all tests for MM tables and fix the issues that are found

2016-10-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14990:

Status: Patch Available  (was: Open)

> run all tests for MM tables and fix the issues that are found
> -
>
> Key: HIVE-14990
> URL: https://issues.apache.org/jira/browse/HIVE-14990
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14990.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-14990) run all tests for MM tables and fix the issues that are found

2016-10-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-14990:
---

Assignee: Sergey Shelukhin

> run all tests for MM tables and fix the issues that are found
> -
>
> Key: HIVE-14990
> URL: https://issues.apache.org/jira/browse/HIVE-14990
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14990.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15019) handle import for MM tables

2016-10-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15019:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

committed to the feature branch

> handle import for MM tables
> ---
>
> Key: HIVE-15019
> URL: https://issues.apache.org/jira/browse/HIVE-15019
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: hive-14535
>
> Attachments: HIVE-15019.WIP.patch, HIVE-15019.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15020) handle truncate for MM tables (not atomic yet)

2016-10-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15020:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

committed to the feature branch

> handle truncate for MM tables (not atomic yet)
> --
>
> Key: HIVE-15020
> URL: https://issues.apache.org/jira/browse/HIVE-15020
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: hive-14535
>
> Attachments: HIVE-15020.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15021) handle (or add a test for) multi-insert into MM tables

2016-10-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15021:

   Resolution: Fixed
Fix Version/s: hive-14535
   Status: Resolved  (was: Patch Available)

committed to the feature branch

> handle (or add a test for) multi-insert into MM tables
> --
>
> Key: HIVE-15021
> URL: https://issues.apache.org/jira/browse/HIVE-15021
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: hive-14535
>
> Attachments: HIVE-15021.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14535) add micromanaged tables to Hive (metastore keeps track of the files)

2016-10-20 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593182#comment-15593182
 ] 

Gopal V commented on HIVE-14535:


bq.  Was Hive modified to force each task attempt to write to the same file?

No, the file name choice was the product of Hive bucketing. Due to the 
write-once, rename-twice flow (_tmp -> task dir, task dir -> table dir), this 
was not a problem until someone tried to write directly.

bq.  In that case what was the exact issue with checksum-safety?

The writers can't "win" until they have consumed the last byte of their 
shuffle, which is the point where one of them finds out it had corrupted data 
(because the checksum does not match).
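
For illustration, a self-contained sketch of that write-once, rename-twice flow 
using the Hadoop FileSystem API (local FS stands in for HDFS; all paths are made 
up). The bucket-derived file name can safely collide across attempts because 
each attempt writes under its own directory and only the committed one is 
renamed into place:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RenameTwiceSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.getLocal(new Configuration());
    Path attemptDir = new Path("/tmp/staging/_tmp.attempt_0");
    Path taskDir = new Path("/tmp/staging/task_0");
    Path tableDir = new Path("/tmp/warehouse/t");
    fs.delete(new Path("/tmp/staging"), true);  // clean slate for the sketch
    fs.delete(tableDir, true);
    fs.mkdirs(attemptDir);
    fs.mkdirs(tableDir);
    fs.create(new Path(attemptDir, "000000_0")).close();  // write once
    fs.rename(attemptDir, taskDir);                       // rename 1: task commit
    fs.rename(new Path(taskDir, "000000_0"),
        new Path(tableDir, "000000_0"));                  // rename 2: job commit
  }
}
{code}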

> add micromanaged tables to Hive (metastore keeps track of the files)
> 
>
> Key: HIVE-14535
> URL: https://issues.apache.org/jira/browse/HIVE-14535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> Design doc: 
> https://docs.google.com/document/d/1b3t1RywfyRb73-cdvkEzJUyOiekWwkMHdiQ-42zCllY
> Feel free to comment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15021) handle (or add a test for) multi-insert into MM tables

2016-10-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15021:

Summary: handle (or add a test for) multi-insert into MM tables  (was: 
handle (or add a test) for multi-insert into MM tables)

> handle (or add a test for) multi-insert into MM tables
> --
>
> Key: HIVE-15021
> URL: https://issues.apache.org/jira/browse/HIVE-15021
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15021.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15021) handle (or add a test) for multi-insert into MM tables

2016-10-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15021:

Assignee: Sergey Shelukhin
  Status: Patch Available  (was: Open)

> handle (or add a test) for multi-insert into MM tables
> --
>
> Key: HIVE-15021
> URL: https://issues.apache.org/jira/browse/HIVE-15021
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15021.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15021) handle (or add a test) for multi-insert into MM tables

2016-10-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15021:

Attachment: HIVE-15021.patch

This already works. Adding tests.

> handle (or add a test) for multi-insert into MM tables
> --
>
> Key: HIVE-15021
> URL: https://issues.apache.org/jira/browse/HIVE-15021
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
> Attachments: HIVE-15021.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14271) FileSinkOperator should not rename files to final paths when S3 is the default destination

2016-10-20 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593138#comment-15593138
 ] 

Steve Loughran commented on HIVE-14271:
---

One quirk of last-writer-wins is the following scenario:

# executor 1 starts working on part-001
# executor 2 also starts working on it, and opens a stream to part-001
# executor 2 finishes; its work becomes visible
# whatever was waiting for part-001 to be ready sets off
# executor 1 finishes and overwrites the existing part-001

That needs to be avoided.
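
For illustration, a minimal simulation of that race in plain Java; a map stands 
in for the blob store, and all names are made up:

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// The store keeps whatever was written last, so a slow, stale attempt
// can overwrite a finished one after a consumer has already read it.
public class LastWriterWinsRace {
  static final Map<String, String> store = new ConcurrentHashMap<>();

  public static void main(String[] args) throws Exception {
    Thread executor2 = new Thread(() -> store.put("part-001", "executor-2 output"));
    executor2.start();
    executor2.join();                         // executor 2 finishes first...
    String consumed = store.get("part-001");  // ...and a consumer reads its output

    store.put("part-001", "executor-1 output (stale)"); // executor 1 finishes late

    System.out.println("consumer saw:  " + consumed);
    System.out.println("store now has: " + store.get("part-001")); // mismatch
  }
}
{code}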

> FileSinkOperator should not rename files to final paths when S3 is the 
> default destination
> --
>
> Key: HIVE-14271
> URL: https://issues.apache.org/jira/browse/HIVE-14271
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergio Peña
>Assignee: Sergio Peña
>
> FileSinkOperator does a rename of {{outPaths -> finalPaths}} when it finishes 
> writing all rows to a temporary path. The problem is that S3 does not support 
> renaming.
> Two options can be considered:
> a. Use a copy operation instead. After FileSinkOperator writes all rows to 
> outPaths, the commit method will do a copy() call instead of move().
> b. Write row by row directly to the S3 path (see HIVE-1620). This may perform 
> better, but we should take care of the cleanup part in case of writing 
> errors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14535) add micromanaged tables to Hive (metastore keeps track of the files)

2016-10-20 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593127#comment-15593127
 ] 

Sahil Takiar commented on HIVE-14535:
-

Thanks a ton [~gopalv]! That definitely helps a lot; we don't want to go down 
the direct-write approach if we have already seen users hit issues with it. 
Could you expand a little more on your comment about writing to the same file 
name? Was Hive modified to force each task attempt to write to the same file? 
In that case, what was the exact issue with checksum-safety? Was one of the 
writes rejected?

Also, I want to understand more about how this can lead to data loss. Wouldn't 
the query fail if something like this happens?

> add micromanaged tables to Hive (metastore keeps track of the files)
> 
>
> Key: HIVE-14535
> URL: https://issues.apache.org/jira/browse/HIVE-14535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> Design doc: 
> https://docs.google.com/document/d/1b3t1RywfyRb73-cdvkEzJUyOiekWwkMHdiQ-42zCllY
> Feel free to comment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14535) add micromanaged tables to Hive (metastore keeps track of the files)

2016-10-20 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593118#comment-15593118
 ] 

Sergey Shelukhin commented on HIVE-14535:
-

Also, as a side note, there's work under way to get rid of the new APIs and 
tables that I added and to reuse the existing ACID infrastructure (without 
requirements like ORC, buckets, etc.) for MM tables.

> add micromanaged tables to Hive (metastore keeps track of the files)
> 
>
> Key: HIVE-14535
> URL: https://issues.apache.org/jira/browse/HIVE-14535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> Design doc: 
> https://docs.google.com/document/d/1b3t1RywfyRb73-cdvkEzJUyOiekWwkMHdiQ-42zCllY
> Feel free to comment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14535) add micromanaged tables to Hive (metastore keeps track of the files)

2016-10-20 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593112#comment-15593112
 ] 

Sergey Shelukhin commented on HIVE-14535:
-

Just to add to [~gopalv]'s response - the "rest" of the MM table support, 
namely the commit mechanic in the metastore, is what makes it safe to write 
directly to the table without moves/copies, in the presence of task 
failures/retries/speculative execution and catastrophic query failures (when 
there's no one left to clean up), and also considering reads running in 
parallel with in-flight writes.
There has to be some way to tell the committed files apart from the 
uncommitted ones.
My initial plan was to store file names in the metastore for every file that 
MoveTask would have moved, but the ID approach is much more efficient for 
commit and DB storage requirements.
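
For illustration, a minimal sketch of the ID-based filtering in plain Java; the 
mm_<id> directory naming and the hard-coded committed set are assumptions for 
the example only:

{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Readers keep only files whose write ID the metastore marked committed;
// anything written by an in-flight or failed query is simply skipped.
public class CommittedIdFilter {
  static final Pattern MM_DIR = Pattern.compile(".*/mm_(\\d+)/.*");

  static List<String> committedOnly(List<String> files, Set<Long> committedIds) {
    List<String> result = new ArrayList<>();
    for (String f : files) {
      Matcher m = MM_DIR.matcher(f);
      if (m.matches() && committedIds.contains(Long.parseLong(m.group(1)))) {
        result.add(f);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> files = Arrays.asList(
        "mytable/pkey1=535/mm_5/000000_0",   // committed
        "mytable/pkey1=535/mm_6/000000_0");  // in-flight or aborted
    System.out.println(committedOnly(files, new HashSet<>(Arrays.asList(5L))));
  }
}
{code}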

> add micromanaged tables to Hive (metastore keeps track of the files)
> 
>
> Key: HIVE-14535
> URL: https://issues.apache.org/jira/browse/HIVE-14535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> Design doc: 
> https://docs.google.com/document/d/1b3t1RywfyRb73-cdvkEzJUyOiekWwkMHdiQ-42zCllY
> Feel free to comment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15020) handle truncate for MM tables (not atomic yet)

2016-10-20 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593100#comment-15593100
 ] 

Hive QA commented on HIVE-15020:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12834527/HIVE-15020.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1704/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1704/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1704/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2016-10-20 21:38:10.412
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-1704/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2016-10-20 21:38:10.414
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 1da HIVE-14985 : Remove UDF-s created during test runs 
(Peter Vary, reviewed by Sergey Shelukhin)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 1da HIVE-14985 : Remove UDF-s created during test runs 
(Peter Vary, reviewed by Sergey Shelukhin)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2016-10-20 21:38:11.329
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java:1060
error: ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java: 
patch does not apply
error: ql/src/test/queries/clientpositive/mm_all.q: No such file or directory
error: ql/src/test/results/clientpositive/llap/mm_all.q.out: No such file or 
directory
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12834527 - PreCommit-HIVE-Build

> handle truncate for MM tables (not atomic yet)
> --
>
> Key: HIVE-15020
> URL: https://issues.apache.org/jira/browse/HIVE-15020
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: hive-14535
>
> Attachments: HIVE-15020.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14535) add micromanaged tables to Hive (metastore keeps track of the files)

2016-10-20 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593084#comment-15593084
 ] 

Gopal V commented on HIVE-14535:


>  Do you think it would be reasonable to commit the changes to the 
> FileSinkOperator without the rest of the MM tables support?

No, a direct output committer approach without query isolation has lost data 
for production customers before, by accidentally forcing multiple tasks to 
write to the same file name - due to the way checksum-safety works, the first 
writer is not the winner in failure-tolerance scenarios.

We want to prevent users from making such expensive mistakes again, so this 
patch isolates different queries from each other - without that isolation, 
queries will stomp over each other's files.

>  I know there are some concerns that this "direct output committer" approach 
> could cause data corruption issues; is this something that was considered 
> explicitly in the design? If so, could you expand on why those data 
> corruption issues would occur?

Without the isolation fix, the other parts are dangerous to use. 

With the isolation in place, the system moves away from the move model to a 
cleanup model (the cleanup code already exists, it is just applied to the 
scratch dir today).

> add micromanaged tables to Hive (metastore keeps track of the files)
> 
>
> Key: HIVE-14535
> URL: https://issues.apache.org/jira/browse/HIVE-14535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> Design doc: 
> https://docs.google.com/document/d/1b3t1RywfyRb73-cdvkEzJUyOiekWwkMHdiQ-42zCllY
> Feel free to comment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14993) make WriteEntity distinguish writeType

2016-10-20 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593085#comment-15593085
 ] 

Hive QA commented on HIVE-14993:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12834477/HIVE-14993.6.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10564 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbase_bulk] 
(batchId=89)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=132)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype]
 (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[current_date_timestamp]
 (batchId=144)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[0]
 (batchId=164)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[0] (batchId=164)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[1] (batchId=164)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1703/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1703/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1703/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12834477 - PreCommit-HIVE-Build

> make WriteEntity distinguish writeType
> --
>
> Key: HIVE-14993
> URL: https://issues.apache.org/jira/browse/HIVE-14993
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-14993.2.patch, HIVE-14993.3.patch, 
> HIVE-14993.4.patch, HIVE-14993.5.patch, HIVE-14993.6.patch, HIVE-14993.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14535) add micromanaged tables to Hive (metastore keeps track of the files)

2016-10-20 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593030#comment-15593030
 ] 

Sahil Takiar commented on HIVE-14535:
-

Hey [~sershe],

Very interesting feature. I think this could have some benefits for Hive-on-S3 
write performance also (ref: HIVE-14269). Particularly the changes to the 
{{FileSinkOperator}}. If I understand correctly, the changes cause the 
{{FileSinkOperator}} to directly write to the final Hive table location rather 
than to a staging directory. On Blobstores (like S3), this should significantly 
improve performance since data doesn't need to be copied from a staging 
directory to the final directory. We were thinking of implementing something 
similar in HIVE-14271. Do you think it would be reasonable to commit the 
changes to the {{FileSinkOperator}} without the rest of the MM tables support? 
I know there are some concerns that this "direct output committer" approach 
could cause data corruption issues; is this something that was considered 
explicitly in the design? If so, could you expand on why those data corruption 
issues would occur?

> add micromanaged tables to Hive (metastore keeps track of the files)
> 
>
> Key: HIVE-14535
> URL: https://issues.apache.org/jira/browse/HIVE-14535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> Design doc: 
> https://docs.google.com/document/d/1b3t1RywfyRb73-cdvkEzJUyOiekWwkMHdiQ-42zCllY
> Feel free to comment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14271) FileSinkOperator should not rename files to final paths when S3 is the default destination

2016-10-20 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593020#comment-15593020
 ] 

Sahil Takiar commented on HIVE-14271:
-

[~cnauroth], we were actually thinking of implementing a "direct output 
committer" strategy for Hive (it would be optional of course). Any chance you 
could expand some more on what the drawbacks of this approach would be?

For the issue reported in SPARK-10063, I think you should be able to add a 
config option that says the file is only closed if the Task was successful.

I know there are other concerns with things like speculative execution and task 
retries, but Hive may be able to overcome those by making sure each task 
attempt writes to the same file on S3. Since S3 follows a last-writer-wins 
approach, and each task attempt is idempotent, there should be no data issues 
(a similar approach was taken in HIVE-1620).

Thoughts?

> FileSinkOperator should not rename files to final paths when S3 is the 
> default destination
> --
>
> Key: HIVE-14271
> URL: https://issues.apache.org/jira/browse/HIVE-14271
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergio Peña
>Assignee: Sergio Peña
>
> FileSinkOperator does a rename of {{outPaths -> finalPaths}} when it finishes 
> writing all rows to a temporary path. The problem is that S3 does not support 
> renaming.
> Two options can be considered:
> a. Use a copy operation instead. After FileSinkOperator writes all rows to 
> outPaths, the commit method will do a copy() call instead of move().
> b. Write row by row directly to the S3 path (see HIVE-1620). This may perform 
> better, but we should take care of the cleanup part in case of writing 
> errors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12765) Support Intersect (distinct/all) Except (distinct/all) Minus (distinct/all)

2016-10-20 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592991#comment-15592991
 ] 

Pengcheng Xiong commented on HIVE-12765:


[~ashutoshc], the test results look fine to me. The failures are not related to 
the patch, and they also appear in the previous run.

> Support Intersect (distinct/all) Except (distinct/all) Minus (distinct/all)
> ---
>
> Key: HIVE-12765
> URL: https://issues.apache.org/jira/browse/HIVE-12765
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12765.01.patch, HIVE-12765.02.patch, 
> HIVE-12765.03.patch, HIVE-12765.04.patch, HIVE-12765.05.patch, 
> HIVE-12765.06.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14984) Hive-WebUI access results in Request is a replay (34) attack

2016-10-20 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592987#comment-15592987
 ] 

Jimmy Xiang commented on HIVE-14984:


Good. Thanks.

> Hive-WebUI access results in Request is a replay (34) attack
> 
>
> Key: HIVE-14984
> URL: https://issues.apache.org/jira/browse/HIVE-14984
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.2.0
>Reporter: Venkat Sambath
>Assignee: Barna Zsombor Klara
> Attachments: HIVE-14984.patch
>
>
> When trying to access the kerberized WebUI of HS2, the following error is 
> received:
> GSSException: Failure unspecified at GSS-API level (Mechanism level: Request 
> is a replay (34))
> This does not happen for the RM WebUI (verified with the Kerberos WebUI 
> enabled).
> To reproduce the issue, try running
> curl --negotiate -u : -b ~/cookiejar.txt -c ~/cookiejar.txt 
> http://<hs2-host>:10002/
> from any cluster node, or try accessing the URL from a Windows VM with the 
> Firefox browser.
> The following workaround helped, but a permanent solution for the bug is 
> needed.
> Workaround:
> =
> First access index.html directly, then the actual URL of the WebUI:
> curl --negotiate -u : -b ~/cookiejar.txt -c ~/cookiejar.txt 
> http://<hs2-host>:10002/index.html
> curl --negotiate -u : -b ~/cookiejar.txt -c ~/cookiejar.txt 
> http://<hs2-host>:10002
> In a browser, first access
> http://<hs2-host>:10002/index.html
> then
> http://<hs2-host>:10002



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14837) JDBC: standalone jar is missing hadoop core dependencies

2016-10-20 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-14837:
--
Status: Patch Available  (was: Open)

> JDBC: standalone jar is missing hadoop core dependencies
> 
>
> Key: HIVE-14837
> URL: https://issues.apache.org/jira/browse/HIVE-14837
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 2.2.0
>Reporter: Gopal V
>Assignee: Tao Li
> Attachments: HIVE-14837.1.patch
>
>
> {code}
> 2016/09/24 00:31:57 ERROR - jmeter.threads.JMeterThread: Test failed! 
> java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
> at 
> org.apache.hive.jdbc.HiveConnection.createUnderlyingTransport(HiveConnection.java:418)
> at 
> org.apache.hive.jdbc.HiveConnection.createBinaryTransport(HiveConnection.java:438)
> at 
> org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:225)
> at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:182)
> at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:107)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14837) JDBC: standalone jar is missing hadoop core dependencies

2016-10-20 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-14837:
--
Status: Open  (was: Patch Available)

> JDBC: standalone jar is missing hadoop core dependencies
> 
>
> Key: HIVE-14837
> URL: https://issues.apache.org/jira/browse/HIVE-14837
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 2.2.0
>Reporter: Gopal V
>Assignee: Tao Li
> Attachments: HIVE-14837.1.patch
>
>
> {code}
> 2016/09/24 00:31:57 ERROR - jmeter.threads.JMeterThread: Test failed! 
> java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
> at 
> org.apache.hive.jdbc.HiveConnection.createUnderlyingTransport(HiveConnection.java:418)
> at 
> org.apache.hive.jdbc.HiveConnection.createBinaryTransport(HiveConnection.java:438)
> at 
> org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:225)
> at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:182)
> at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:107)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14968) Fix compilation failure on branch-1

2016-10-20 Thread Tao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592970#comment-15592970
 ] 

Tao Li commented on HIVE-14968:
---

cc [~daijy] Can you kick off the test again with the new patch name?

> Fix compilation failure on branch-1
> ---
>
> Key: HIVE-14968
> URL: https://issues.apache.org/jira/browse/HIVE-14968
> Project: Hive
>  Issue Type: Bug
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 1.3.0
>
> Attachments: HIVE-14968-branch-1.1.patch, HIVE-14968.1.patch
>
>
> branch-1 compilation failure due to:
> HIVE-14436: Hive 1.2.1/Hitting "ql.Driver: FAILED: IllegalArgumentException 
> Error: , expected at the end of 'decimal(9'" after enabling 
> hive.optimize.skewjoin and with MR engine
> HIVE-14483 : java.lang.ArrayIndexOutOfBoundsException 
> org.apache.orc.impl.TreeReaderFactory.commonReadByteArrays
> 1.2 branch is fine.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12765) Support Intersect (distinct/all) Except (distinct/all) Minus (distinct/all)

2016-10-20 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592968#comment-15592968
 ] 

Hive QA commented on HIVE-12765:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12834475/HIVE-12765.06.patch

{color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10570 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbase_bulk] 
(batchId=89)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=131)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[current_date_timestamp]
 (batchId=144)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[0]
 (batchId=164)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[0] (batchId=164)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[1] (batchId=164)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1702/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1702/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1702/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12834475 - PreCommit-HIVE-Build

> Support Intersect (distinct/all) Except (distinct/all) Minus (distinct/all)
> ---
>
> Key: HIVE-12765
> URL: https://issues.apache.org/jira/browse/HIVE-12765
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12765.01.patch, HIVE-12765.02.patch, 
> HIVE-12765.03.patch, HIVE-12765.04.patch, HIVE-12765.05.patch, 
> HIVE-12765.06.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15007) Hive 1.2.2 release planning

2016-10-20 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592911#comment-15592911
 ] 

Vaibhav Gumashta commented on HIVE-15007:
-

Thanks [~spena].

> Hive 1.2.2 release planning
> ---
>
> Key: HIVE-15007
> URL: https://issues.apache.org/jira/browse/HIVE-15007
> Project: Hive
>  Issue Type: Task
>Affects Versions: 1.2.1
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-15007-branch-1.2.patch, HIVE-15007.branch-1.2.patch
>
>
> Discussed with [~spena] about triggering unit test runs for the 1.2.2 release;
> creating a patch which will trigger precommits looks like a good way.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15007) Hive 1.2.2 release planning

2016-10-20 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-15007:

Attachment: HIVE-15007-branch-1.2.patch

> Hive 1.2.2 release planning
> ---
>
> Key: HIVE-15007
> URL: https://issues.apache.org/jira/browse/HIVE-15007
> Project: Hive
>  Issue Type: Task
>Affects Versions: 1.2.1
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-15007-branch-1.2.patch, HIVE-15007.branch-1.2.patch
>
>
> Discussed with [~spena] about triggering unit test runs for the 1.2.2 release;
> creating a patch which will trigger precommits looks like a good way.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14985) Remove UDF-s created during test runs

2016-10-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14985:

   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks for the patch!

> Remove UDF-s created during test runs
> -
>
> Key: HIVE-14985
> URL: https://issues.apache.org/jira/browse/HIVE-14985
> Project: Hive
>  Issue Type: Sub-task
>  Components: Testing Infrastructure
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-14985.2.patch, HIVE-14985.patch
>
>
> When I tried to run llap_udf.q repeatedly from my IDE, the first run passed,
> but the following runs failed.
> The query file does not remove the functions it creates, which could cause
> problems for the follow-up tests.
> The same problem can happen if a query test fails in the middle of the script:
> even though the file contains the removal SQL commands, those are not executed.
> It might be a good idea to clean up not just tables and keys, but also
> functions created during the test run.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13589) beeline support prompt for password with '-p' option

2016-10-20 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592840#comment-15592840
 ] 

Hive QA commented on HIVE-13589:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12834474/HIVE-13589.10.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10574 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=132)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[current_date_timestamp]
 (batchId=144)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[0]
 (batchId=164)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[0] (batchId=164)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[1] (batchId=164)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1701/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1701/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1701/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12834474 - PreCommit-HIVE-Build

> beeline support prompt for password with '-p' option
> 
>
> Key: HIVE-13589
> URL: https://issues.apache.org/jira/browse/HIVE-13589
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Thejas M Nair
>Assignee: Vihang Karajgaonkar
> Fix For: 2.2.0
>
> Attachments: HIVE-13589.1.patch, HIVE-13589.10.patch, 
> HIVE-13589.2.patch, HIVE-13589.3.patch, HIVE-13589.4.patch, 
> HIVE-13589.5.patch, HIVE-13589.6.patch, HIVE-13589.7.patch, 
> HIVE-13589.8.patch, HIVE-13589.9.patch
>
>
> Specifying the connection string using command-line options in beeline is 
> convenient, as it gets saved in the shell command history and is easy to 
> retrieve from there.
> However, specifying the password on the command line is not secure, as it gets 
> displayed on screen and saved in the history.
> It should be possible to specify '-p' without an argument to make beeline 
> prompt for the password.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15020) handle truncate for MM tables (not atomic yet)

2016-10-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15020:

Attachment: HIVE-15020.patch

The regular truncate already works (added tests).
The truncate columns command looks completely pointless, so it won't be 
supported for MM tables (added a negative test, too). Adding support would not 
be very hard, if needed.
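
For reference, a minimal sketch of the two statement forms discussed, with a hypothetical MM table name (the COLUMNS form assumes the syntax Hive already uses for column truncation):

{code}
-- plain truncate: works for MM tables with this patch
TRUNCATE TABLE mm_tab;

-- column truncate: rejected for MM tables (covered by the negative test)
TRUNCATE TABLE mm_tab COLUMNS (key);
{code}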

> handle truncate for MM tables (not atomic yet)
> --
>
> Key: HIVE-15020
> URL: https://issues.apache.org/jira/browse/HIVE-15020
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: hive-14535
>
> Attachments: HIVE-15020.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-14733) support in HBaseStore

2016-10-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-14733.
-
Resolution: Not A Problem

Won't be needed; this should already work after ACID integration.

> support in HBaseStore
> -
>
> Key: HIVE-14733
> URL: https://issues.apache.org/jira/browse/HIVE-14733
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>
> For expediency, HBaseStore support will be done later once everything works 
> on ObjectStore



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15020) handle truncate for MM tables (not atomic yet)

2016-10-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15020:

Fix Version/s: hive-14535
   Status: Patch Available  (was: Open)

> handle truncate for MM tables (not atomic yet)
> --
>
> Key: HIVE-15020
> URL: https://issues.apache.org/jira/browse/HIVE-15020
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: hive-14535
>
> Attachments: HIVE-15020.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15024) LLAP: ClassCastException: org.apache.hadoop.hive.common.io.DiskRangeList cannot be cast to org.apache.orc.impl.BufferChunk

2016-10-20 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592800#comment-15592800
 ] 

Sergey Shelukhin commented on HIVE-15024:
-

Hmm, the log is at WARN level... if there's no repro I'll just add more detailed 
logging for the error case; otherwise this is hard to diagnose.

> LLAP: ClassCastException: org.apache.hadoop.hive.common.io.DiskRangeList 
> cannot be cast to org.apache.orc.impl.BufferChunk
> --
>
> Key: HIVE-15024
> URL: https://issues.apache.org/jira/browse/HIVE-15024
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Gopal V
>Priority: Critical
>
> {code}
> Caused by: java.io.IOException: java.lang.ClassCastException: 
> org.apache.hadoop.hive.common.io.DiskRangeList cannot be cast to 
> org.apache.orc.impl.BufferChunk
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.rethrowErrorIfAny(LlapInputFormat.java:383)
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.nextCvb(LlapInputFormat.java:338)
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.next(LlapInputFormat.java:278)
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.next(LlapInputFormat.java:167)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:350)
> ... 23 more
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.common.io.DiskRangeList cannot be cast to 
> org.apache.orc.impl.BufferChunk
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.prepareRangesForCompressedRead(EncodedReaderImpl.java:728)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedStream(EncodedReaderImpl.java:616)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:397)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:424)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:227)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:224)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:224)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:93)
> ... 6 more
> 2016-10-20T00:48:45,354 WARN  [TezTaskRunner 
> (1475017598908_0410_15_00_20_0)] 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask: Ignoring exception when 
> closing input part(cleanup). Exception class
> =java.io.IOException, message=java.lang.ClassCastException: 
> org.apache.hadoop.hive.common.io.DiskRangeList cannot be cast to 
> org.apache.orc.impl.BufferChunk
> 2016-10-20T00:48:45,416 WARN  [TaskHeartbeatThread 
> (1475017598908_0410_15_00_20_0)] 
> org.apache.hadoop.hive.llap.daemon.impl.LlapTaskReporter: Exiting 
> TaskReporter thread with pending queue size=2
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14950) Support integer data type

2016-10-20 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592785#comment-15592785
 ] 

Alan Gates commented on HIVE-14950:
---

Definitely don't want to be on [~leftylev]'s bad side, so I'll update the wiki 
when I commit this. :)

> Support integer data type
> -
>
> Key: HIVE-14950
> URL: https://issues.apache.org/jira/browse/HIVE-14950
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Minor
> Attachments: HIVE-14950.1.patch, HIVE-14950.2.patch
>
>
> Maybe it's just me bumping into this difference again and again...
> but it's in the SQL:2011 standard...
> adding an alias for INT would be easy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14950) Support integer data type

2016-10-20 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592775#comment-15592775
 ] 

Pengcheng Xiong commented on HIVE-14950:


[~alangates] I agree with you. Just one note: please add INTEGER as a reserved 
keyword in the wiki, otherwise [~leftylev] will come after you. :)
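
A quick sketch of what this means for users (hypothetical table names):

{code}
-- with the patch, INTEGER is accepted as an alias for INT
CREATE TABLE t_int (id INTEGER);

-- and since INTEGER becomes a reserved keyword, it must be quoted
-- with backticks to be used as an identifier
CREATE TABLE t_quoted (`integer` STRING);
{code}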

> Support integer data type
> -
>
> Key: HIVE-14950
> URL: https://issues.apache.org/jira/browse/HIVE-14950
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Minor
> Attachments: HIVE-14950.1.patch, HIVE-14950.2.patch
>
>
> Maybe it's just me bumping into this difference again and again...
> but it's in the SQL:2011 standard...
> adding an alias for INT would be easy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14950) Support integer data type

2016-10-20 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592736#comment-15592736
 ] 

Alan Gates commented on HIVE-14950:
---

I'm +1 on this patch. As for the failed tests: all of the above tests except 
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbase_bulk] and 
org.apache.hive.beeline.TestBeelineArgParsing pass for me with the patch, and 
those two fail without the patch as well, so I don't think this breaks any 
additional tests.

[~ashutoshc], [~pxiong], others, any comments on this before I commit it?


> Support integer data type
> -
>
> Key: HIVE-14950
> URL: https://issues.apache.org/jira/browse/HIVE-14950
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Minor
> Attachments: HIVE-14950.1.patch, HIVE-14950.2.patch
>
>
> Maybe it's just me bumping into this difference again and again...
> but it's in the SQL:2011 standard...
> adding an alias for INT would be easy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15019) handle import for MM tables

2016-10-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15019:

Attachment: HIVE-15019.patch

Fixed the patch. Export will be handled in a separate JIRA.

> handle import for MM tables
> ---
>
> Key: HIVE-15019
> URL: https://issues.apache.org/jira/browse/HIVE-15019
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: hive-14535
>
> Attachments: HIVE-15019.WIP.patch, HIVE-15019.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-15020) handle truncate for MM tables (not atomic yet)

2016-10-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-15020:
---

Assignee: Sergey Shelukhin

> handle truncate for MM tables (not atomic yet)
> --
>
> Key: HIVE-15020
> URL: https://issues.apache.org/jira/browse/HIVE-15020
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15027) make sure export takes MM information into account

2016-10-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15027:

Description: 
Export would currently blindly copy the directories; it should take MM 
information into account.
It's relatively easy to do in the CopyWork created from ExportSemanticAnalyzer, 
but I will leave it undone for now in case there's some better way to do it 
after ACID integration

> make sure export takes MM information into account
> --
>
> Key: HIVE-15027
> URL: https://issues.apache.org/jira/browse/HIVE-15027
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>
> Export would currently blindly copy the directories; it should take MM 
> information into account.
> It's relatively easy to do in the CopyWork created from 
> ExportSemanticAnalyzer, but I will leave it undone for now in case there's 
> some better way to do it after ACID integration



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14887) Reduce the memory requirements for tests

2016-10-20 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592723#comment-15592723
 ] 

Hive QA commented on HIVE-14887:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12834335/HIVE-14887.05.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10564 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=132)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a]
 (batchId=133)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[current_date_timestamp]
 (batchId=144)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=91)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[0]
 (batchId=164)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[0] (batchId=164)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[1] (batchId=164)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1700/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1700/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1700/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12834335 - PreCommit-HIVE-Build

> Reduce the memory requirements for tests
> 
>
> Key: HIVE-14887
> URL: https://issues.apache.org/jira/browse/HIVE-14887
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14887.01.patch, HIVE-14887.02.patch, 
> HIVE-14887.03.patch, HIVE-14887.04.patch, HIVE-14887.05.patch
>
>
> The clusters that we spin up end up requiring 16GB at times, and the maven 
> arguments seem a little heavyweight.
> Reducing these will allow for additional ptest drones per box, which should 
> bring down the runtime.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15024) LLAP: ClassCastException: org.apache.hadoop.hive.common.io.DiskRangeList cannot be cast to org.apache.orc.impl.BufferChunk

2016-10-20 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592692#comment-15592692
 ] 

Gopal V commented on HIVE-15024:


The logs are at /tmp/application_1475017598908_0409.log on the perf-cluster.

I can download them in a couple of hours & upload them here, if you want.

> LLAP: ClassCastException: org.apache.hadoop.hive.common.io.DiskRangeList 
> cannot be cast to org.apache.orc.impl.BufferChunk
> --
>
> Key: HIVE-15024
> URL: https://issues.apache.org/jira/browse/HIVE-15024
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Gopal V
>Priority: Critical
>
> {code}
> Caused by: java.io.IOException: java.lang.ClassCastException: 
> org.apache.hadoop.hive.common.io.DiskRangeList cannot be cast to 
> org.apache.orc.impl.BufferChunk
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.rethrowErrorIfAny(LlapInputFormat.java:383)
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.nextCvb(LlapInputFormat.java:338)
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.next(LlapInputFormat.java:278)
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.next(LlapInputFormat.java:167)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:350)
> ... 23 more
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.common.io.DiskRangeList cannot be cast to 
> org.apache.orc.impl.BufferChunk
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.prepareRangesForCompressedRead(EncodedReaderImpl.java:728)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedStream(EncodedReaderImpl.java:616)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:397)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:424)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:227)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:224)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:224)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:93)
> ... 6 more
> 2016-10-20T00:48:45,354 WARN  [TezTaskRunner 
> (1475017598908_0410_15_00_20_0)] 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask: Ignoring exception when 
> closing input part(cleanup). Exception class
> =java.io.IOException, message=java.lang.ClassCastException: 
> org.apache.hadoop.hive.common.io.DiskRangeList cannot be cast to 
> org.apache.orc.impl.BufferChunk
> 2016-10-20T00:48:45,416 WARN  [TaskHeartbeatThread 
> (1475017598908_0410_15_00_20_0)] 
> org.apache.hadoop.hive.llap.daemon.impl.LlapTaskReporter: Exiting 
> TaskReporter thread with pending queue size=2
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15026) Option to not merge the views

2016-10-20 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592677#comment-15592677
 ] 

Xuefu Zhang commented on HIVE-15026:


In general, Hive "expands" views/CTEs in the logical layer, as they are not 
materialized. However, the problem doesn't apply to Hive on Spark, though I 
know it's there for MapReduce: Hive on Spark caches a temp result (such as the 
output of your common table expression) if it determines that the result is 
reusable.

If you're using other engines, you might consider creating a temp table (so as 
to materialize the view/CTE) and using it in your subsequent processing.
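
A minimal sketch of that workaround against the example below (GROUP BY added to make the simplified queries valid; names are the reporter's):

{code}
-- materialize the 2GB subset once instead of scanning TABLE_A twice
CREATE TEMPORARY TABLE TMP_A AS
SELECT COLUMNA, COLUMNB, COLUMNC, COLUMND, COLUMNE
FROM TABLE_A
WHERE COLUMNA=1 OR COLUMNA=10;

SELECT COLUMNA, COLUMNB, MAX(COLUMNC)
FROM TMP_A
WHERE COLUMNA=1 AND COLUMND='Case 1'
GROUP BY COLUMNA, COLUMNB
UNION ALL
SELECT COLUMNA, COLUMNB, MAX(COLUMNC)
FROM TMP_A
WHERE COLUMNA=10 AND COLUMNE='Case 2'
GROUP BY COLUMNA, COLUMNB;
{code}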

> Option to not merge the views
> -
>
> Key: HIVE-15026
> URL: https://issues.apache.org/jira/browse/HIVE-15026
> Project: Hive
>  Issue Type: Improvement
>  Components: Logical Optimizer, Physical Optimizer
>Reporter: Carlos Martinez Moller
>
> Note: I am simplifying a real scenario we are facing, including the queries 
> in this example. I hope they make sense and that the proposal can be 
> understood; the real query is a lot more complex and longer.
> When performing a query of this type:
> --
> SELECT COLUMNA, COLUMNB, MAX(COLUMNC)
> FROM TABLE_A
> WHERE COLUMNA=1 AND COLUMND='Case 1'
> GROUP BY COLUMNA, COLUMNB
> UNION ALL
> SELECT COLUMNA, COLUMNB, MAX(COLUMNC)
> FROM TABLE_A
> WHERE COLUMNA=10 AND COLUMNE='Case 2'
> GROUP BY COLUMNA, COLUMNB
> --
> Hive creates three stages: the first stage is a full scan of TABLE_A plus a 
> filter (COLUMNA=1/COLUMND='Case 1'), the second stage is a full scan of 
> TABLE_A again plus a filter (COLUMNA=10/COLUMNE='Case 2'), and the third 
> stage is the UNION ALL.
> TABLE_A holds 2TB of data, but COLUMNA=1 and COLUMNA=10 together select only 
> 2GB of it.
> So I thought to use:
> --
> WITH TEMP_VIEW AS
> (SELECT COLUMNA, COLUMNB, COLUMNC, COLUMND, COLUMNE
> FROM TABLE_A
> WHERE COLUMNA=1 OR COLUMNA=10)
> SELECT COLUMNA, COLUMNB, MAX(COLUMNC)
> FROM TEMP_VIEW
> WHERE COLUMNA=1 AND COLUMND='Case 1'
> GROUP BY COLUMNA, COLUMNB
> UNION ALL
> SELECT COLUMNA, COLUMNB, MAX(COLUMNC)
> FROM TEMP_VIEW
> WHERE COLUMNA=10 AND COLUMNE='Case 2'
> GROUP BY COLUMNA, COLUMNB
> ---
> I thought that with this it would create 4 stages:
> - Stage 1: full scan of TABLE_A, generating intermediate data
> - Stage 2: on the data of Stage 1, filter (COLUMNA=1/COLUMND='Case 1')
> - Stage 3: on the data of Stage 1, filter (COLUMNA=10/COLUMNE='Case 2')
> - Stage 4: UNION ALL
> With this, instead of 4TB being read from disk, only 2TB+4GB (going through 
> the view twice) would be read. (In our case the complexity is even bigger and 
> we would save 20TB of reads.)
> But it does the same as the original query: it internally pushes the 
> predicates of the "WITH" query into the two branches of the UNION.
> It would be good to have control over this, or for the optimizer to choose 
> the best approach using histogram/statistics information.
> For those who know Oracle RDBMS, this is equivalent to the MERGE/NO_MERGE and 
> NEST behaviour; see http://www.dba-oracle.com/t_hint_no_merge.htm for an 
> explanation.
> Other approaches could apply to my example, such as partitioning or bucketing 
> by COLUMNA, but they do not fit our case, as COLUMNA is not commonly used 
> when accessing this table.
> The point of this JIRA is to add functionality similar to Oracle's (not 
> merging the query, but generating an in-memory/on-disk temporary view) both 
> for "WITH" clauses and VIEWS.
> This is very commonly used in data warehouses managing large amounts of data 
> and provides big performance benefits.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15024) LLAP: ClassCastException: org.apache.hadoop.hive.common.io.DiskRangeList cannot be cast to org.apache.orc.impl.BufferChunk

2016-10-20 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592650#comment-15592650
 ] 

Sergey Shelukhin commented on HIVE-15024:
-

[~gopalv] are there more logs, or a repro?

> LLAP: ClassCastException: org.apache.hadoop.hive.common.io.DiskRangeList 
> cannot be cast to org.apache.orc.impl.BufferChunk
> --
>
> Key: HIVE-15024
> URL: https://issues.apache.org/jira/browse/HIVE-15024
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Gopal V
>Priority: Critical
>
> {code}
> Caused by: java.io.IOException: java.lang.ClassCastException: 
> org.apache.hadoop.hive.common.io.DiskRangeList cannot be cast to 
> org.apache.orc.impl.BufferChunk
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.rethrowErrorIfAny(LlapInputFormat.java:383)
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.nextCvb(LlapInputFormat.java:338)
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.next(LlapInputFormat.java:278)
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.next(LlapInputFormat.java:167)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:350)
> ... 23 more
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.common.io.DiskRangeList cannot be cast to 
> org.apache.orc.impl.BufferChunk
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.prepareRangesForCompressedRead(EncodedReaderImpl.java:728)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedStream(EncodedReaderImpl.java:616)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:397)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:424)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:227)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:224)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:224)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:93)
> ... 6 more
> 2016-10-20T00:48:45,354 WARN  [TezTaskRunner 
> (1475017598908_0410_15_00_20_0)] 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask: Ignoring exception when 
> closing input part(cleanup). Exception class
> =java.io.IOException, message=java.lang.ClassCastException: 
> org.apache.hadoop.hive.common.io.DiskRangeList cannot be cast to 
> org.apache.orc.impl.BufferChunk
> 2016-10-20T00:48:45,416 WARN  [TaskHeartbeatThread 
> (1475017598908_0410_15_00_20_0)] 
> org.apache.hadoop.hive.llap.daemon.impl.LlapTaskReporter: Exiting 
> TaskReporter thread with pending queue size=2
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14950) Support integer data type

2016-10-20 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-14950:
--
Hadoop Flags: Incompatible change

> Support integer data type
> -
>
> Key: HIVE-14950
> URL: https://issues.apache.org/jira/browse/HIVE-14950
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Minor
> Attachments: HIVE-14950.1.patch, HIVE-14950.2.patch
>
>
> Maybe it's just me bumping into this difference again and again...
> but it's in the SQL:2011 standard...
> adding an alias for INT would be easy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15017) Random job failures with MapReduce and Tez

2016-10-20 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592620#comment-15592620
 ] 

Sergey Shelukhin commented on HIVE-15017:
-

Looks like some setup issue... can you enable debug logging to see this line: 
{noformat}
LOG.debug("initApplication: " + Arrays.toString(commandArray));
{noformat}
Is HADOOP_YARN_HOME set for the container?
Is yarn.nodemanager.linux-container-executor.path set to something non-default?
Otherwise, is there a file under Yarn home called bin/container-executor?

> Random job failures with MapReduce and Tez
> --
>
> Key: HIVE-15017
> URL: https://issues.apache.org/jira/browse/HIVE-15017
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.1.0
> Environment: Hadoop 2.7.2, Hive 2.1.0
>Reporter: Alexandre Linte
>Priority: Critical
> Attachments: hive_cli_mr.txt, hive_cli_tez.txt, 
> nodemanager_logs_mr_job.txt, yarn_container_tez_job_datanode05.txt, 
> yarn_container_tez_job_datanode06.txt, yarn_syslog_mr_job.txt, 
> yarn_syslog_tez_job.txt
>
>
> Since Hive 2.1.0, we have been facing a blocking issue on our cluster. All 
> the jobs are failing randomly, on MapReduce and Tez alike.
> In both cases, we don't have any ERROR or WARN message in the logs. You can 
> find attached:
> - Hive CLI output errors
> - YARN logs for a Tez and a MapReduce job
> - NodeManager logs (MR only, we have the same logs with Tez)
> Note: this issue doesn't exist with Pig jobs (MR + Tez) or Spark jobs (MR), 
> so this cannot be a Hadoop/YARN issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14993) make WriteEntity distinguish writeType

2016-10-20 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-14993:
--
Attachment: HIVE-14993.6.patch

> make WriteEntity distinguish writeType
> --
>
> Key: HIVE-14993
> URL: https://issues.apache.org/jira/browse/HIVE-14993
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-14993.2.patch, HIVE-14993.3.patch, 
> HIVE-14993.4.patch, HIVE-14993.5.patch, HIVE-14993.6.patch, HIVE-14993.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14887) Reduce the memory requirements for tests

2016-10-20 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592607#comment-15592607
 ] 

Hive QA commented on HIVE-14887:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12834335/HIVE-14887.05.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10564 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=132)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a]
 (batchId=133)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[current_date_timestamp]
 (batchId=144)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_1] 
(batchId=90)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=91)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] 
(batchId=90)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[0]
 (batchId=164)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[0] (batchId=164)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[1] (batchId=164)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1699/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1699/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1699/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12834335 - PreCommit-HIVE-Build

> Reduce the memory requirements for tests
> 
>
> Key: HIVE-14887
> URL: https://issues.apache.org/jira/browse/HIVE-14887
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14887.01.patch, HIVE-14887.02.patch, 
> HIVE-14887.03.patch, HIVE-14887.04.patch, HIVE-14887.05.patch
>
>
> The clusters that we spin up end up requiring 16GB at times, and the maven 
> arguments seem a little heavyweight.
> Reducing these will allow for additional ptest drones per box, which should 
> bring down the runtime.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15023) SimpleFetchOptimizer needs to optimize limit=0

2016-10-20 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592600#comment-15592600
 ] 

Jesus Camacho Rodriguez commented on HIVE-15023:


Your solution works; as I said, maybe you could just add a comment to the _if_ 
clause. Expect some additional plan improvements.
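
For context, a hedged sketch of the expectation here (my reading of the goal, not the patch's literal output): a limit-0 query should compile to a bare fetch task with no table scan at all.

{code}
-- currently compiles to TableScan -> Select -> Limit 0 (see the plan quoted below);
-- after this fix it should reduce to a Fetch Operator with limit 0 and no scan
EXPLAIN SELECT key FROM src LIMIT 0;
{code}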

> SimpleFetchOptimizer needs to optimize limit=0
> --
>
> Key: HIVE-15023
> URL: https://issues.apache.org/jira/browse/HIVE-15023
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15023.01.patch
>
>
> on current master
> {code}
> hive> explain select key from src limit 0;
> OK
> STAGE DEPENDENCIES:
>   Stage-0 is a root stage
> STAGE PLANS:
>   Stage: Stage-0
> Fetch Operator
>   limit: 0
>   Processor Tree:
> TableScan
>   alias: src
>   Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE 
> Column stats: NONE
>   Select Operator
> expressions: key (type: string)
> outputColumnNames: _col0
> Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE 
> Column stats: NONE
> Limit
>   Number of rows: 0
>   Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column 
> stats: NONE
>   ListSink
> Time taken: 7.534 seconds, Fetched: 20 row(s)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15023) SimpleFetchOptimizer needs to optimize limit=0

2016-10-20 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592592#comment-15592592
 ] 

Pengcheng Xiong commented on HIVE-15023:


[~jcamachorodriguez], I previously proposed the same plan of moving L200 ahead 
of L119, and I got lots of test case failures. Thus, I think it is not as 
simple to move that line as you said. :) Does my patch solve all your problems 
or not? Which one still fails?

> SimpleFetchOptimizer needs to optimize limit=0
> --
>
> Key: HIVE-15023
> URL: https://issues.apache.org/jira/browse/HIVE-15023
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15023.01.patch
>
>
> on current master
> {code}
> hive> explain select key from src limit 0;
> OK
> STAGE DEPENDENCIES:
>   Stage-0 is a root stage
> STAGE PLANS:
>   Stage: Stage-0
> Fetch Operator
>   limit: 0
>   Processor Tree:
> TableScan
>   alias: src
>   Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE 
> Column stats: NONE
>   Select Operator
> expressions: key (type: string)
> outputColumnNames: _col0
> Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE 
> Column stats: NONE
> Limit
>   Number of rows: 0
>   Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column 
> stats: NONE
>   ListSink
> Time taken: 7.534 seconds, Fetched: 20 row(s)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15023) SimpleFetchOptimizer needs to optimize limit=0

2016-10-20 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15023:
---
Status: Patch Available  (was: Open)

> SimpleFetchOptimizer needs to optimize limit=0
> --
>
> Key: HIVE-15023
> URL: https://issues.apache.org/jira/browse/HIVE-15023
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15023.01.patch
>
>
> on current master
> {code}
> hive> explain select key from src limit 0;
> OK
> STAGE DEPENDENCIES:
>   Stage-0 is a root stage
> STAGE PLANS:
>   Stage: Stage-0
> Fetch Operator
>   limit: 0
>   Processor Tree:
> TableScan
>   alias: src
>   Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE 
> Column stats: NONE
>   Select Operator
> expressions: key (type: string)
> outputColumnNames: _col0
> Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE 
> Column stats: NONE
> Limit
>   Number of rows: 0
>   Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column 
> stats: NONE
>   ListSink
> Time taken: 7.534 seconds, Fetched: 20 row(s)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15023) SimpleFetchOptimizer needs to optimize limit=0

2016-10-20 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15023:
---
Status: Open  (was: Patch Available)

> SimpleFetchOptimizer needs to optimize limit=0
> --
>
> Key: HIVE-15023
> URL: https://issues.apache.org/jira/browse/HIVE-15023
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15023.01.patch
>
>
> on current master
> {code}
> hive> explain select key from src limit 0;
> OK
> STAGE DEPENDENCIES:
>   Stage-0 is a root stage
> STAGE PLANS:
>   Stage: Stage-0
> Fetch Operator
>   limit: 0
>   Processor Tree:
> TableScan
>   alias: src
>   Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE 
> Column stats: NONE
>   Select Operator
> expressions: key (type: string)
> outputColumnNames: _col0
> Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE 
> Column stats: NONE
> Limit
>   Number of rows: 0
>   Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column 
> stats: NONE
>   ListSink
> Time taken: 7.534 seconds, Fetched: 20 row(s)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12765) Support Intersect (distinct/all) Except (distinct/all) Minus (distinct/all)

2016-10-20 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12765:
---
Status: Patch Available  (was: Open)

> Support Intersect (distinct/all) Except (distinct/all) Minus (distinct/all)
> ---
>
> Key: HIVE-12765
> URL: https://issues.apache.org/jira/browse/HIVE-12765
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12765.01.patch, HIVE-12765.02.patch, 
> HIVE-12765.03.patch, HIVE-12765.04.patch, HIVE-12765.05.patch, 
> HIVE-12765.06.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12765) Support Intersect (distinct/all) Except (distinct/all) Minus (distinct/all)

2016-10-20 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12765:
---
Status: Open  (was: Patch Available)

> Support Intersect (distinct/all) Except (distinct/all) Minus (distinct/all)
> ---
>
> Key: HIVE-12765
> URL: https://issues.apache.org/jira/browse/HIVE-12765
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12765.01.patch, HIVE-12765.02.patch, 
> HIVE-12765.03.patch, HIVE-12765.04.patch, HIVE-12765.05.patch, 
> HIVE-12765.06.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12765) Support Intersect (distinct/all) Except (distinct/all) Minus (distinct/all)

2016-10-20 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12765:
---
Attachment: HIVE-12765.06.patch

> Support Intersect (distinct/all) Except (distinct/all) Minus (distinct/all)
> ---
>
> Key: HIVE-12765
> URL: https://issues.apache.org/jira/browse/HIVE-12765
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12765.01.patch, HIVE-12765.02.patch, 
> HIVE-12765.03.patch, HIVE-12765.04.patch, HIVE-12765.05.patch, 
> HIVE-12765.06.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13589) beeline support prompt for password with '-p' option

2016-10-20 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-13589:
---
Attachment: HIVE-13589.10.patch

Attaching the patch which addresses the review comments. [~Ferd], can you 
please take a look? Sorry, I was not able to work on this patch for a long 
time, but it is ready for review again. I have updated it on the review board 
too. Thanks a lot!

> beeline support prompt for password with '-p' option
> 
>
> Key: HIVE-13589
> URL: https://issues.apache.org/jira/browse/HIVE-13589
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Thejas M Nair
>Assignee: Vihang Karajgaonkar
> Fix For: 2.2.0
>
> Attachments: HIVE-13589.1.patch, HIVE-13589.10.patch, 
> HIVE-13589.2.patch, HIVE-13589.3.patch, HIVE-13589.4.patch, 
> HIVE-13589.5.patch, HIVE-13589.6.patch, HIVE-13589.7.patch, 
> HIVE-13589.8.patch, HIVE-13589.9.patch
>
>
> Specifying the connection string using command-line options in beeline is 
> convenient, as it gets saved in the shell command history and is easy to 
> retrieve from there.
> However, specifying the password on the command line is not secure, as it gets 
> displayed on screen and saved in the history.
> It should be possible to specify '-p' without an argument to make beeline 
> prompt for the password.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14753) Track the number of open/closed/abandoned sessions in HS2

2016-10-20 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592480#comment-15592480
 ] 

Szehon Ho commented on HIVE-14753:
--

Sorry for the delay, +1 from me

> Track the number of open/closed/abandoned sessions in HS2
> -
>
> Key: HIVE-14753
> URL: https://issues.apache.org/jira/browse/HIVE-14753
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, HiveServer2
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
> Fix For: 2.2.0
>
> Attachments: HIVE-14753.1.patch, HIVE-14753.2.patch, 
> HIVE-14753.3.patch, HIVE-14753.patch
>
>
> We should be able to track the number of sessions since the startup of the 
> HS2 instance, as well as the average lifetime of a session.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14984) Hive-WebUI access results in Request is a replay (34) attack

2016-10-20 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592474#comment-15592474
 ] 

Szehon Ho commented on HIVE-14984:
--

Thanks a lot Barna.  FYI [~jxiang]

> Hive-WebUI access results in Request is a replay (34) attack
> 
>
> Key: HIVE-14984
> URL: https://issues.apache.org/jira/browse/HIVE-14984
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.2.0
>Reporter: Venkat Sambath
>Assignee: Barna Zsombor Klara
> Attachments: HIVE-14984.patch
>
>
> When trying to access the kerberized web UI of HS2, the following error is 
> received:
> GSSException: Failure unspecified at GSS-API level (Mechanism level: Request 
> is a replay (34))
> This does not happen for the RM web UI (checked with its Kerberos web UI 
> enabled).
> To reproduce the issue, try running
> curl --negotiate -u : -b ~/cookiejar.txt -c ~/cookiejar.txt 
> http://<hs2-host>:10002/
> from any cluster node, or try accessing the URL from a Windows VM with the 
> Firefox browser.
> The following workaround helped, but a permanent solution for the bug is 
> needed.
> Workaround:
> =
> First access index.html directly, and then the actual URL of the web UI:
> curl --negotiate -u : -b ~/cookiejar.txt -c ~/cookiejar.txt 
> http://<hs2-host>:10002/index.html
> curl --negotiate -u : -b ~/cookiejar.txt -c ~/cookiejar.txt 
> http://<hs2-host>:10002
> In a browser, first access
> http://<hs2-host>:10002/index.html
> and then
> http://<hs2-host>:10002



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

