[jira] [Created] (HIVE-10907) Hive on Tez: ClassCastException in some cases with SMB joins
Vikram Dixit K created HIVE-10907:
-------------------------------------

             Summary: Hive on Tez: ClassCastException in some cases with SMB joins
                 Key: HIVE-10907
                 URL: https://issues.apache.org/jira/browse/HIVE-10907
             Project: Hive
          Issue Type: Bug
            Reporter: Vikram Dixit K
            Assignee: Vikram Dixit K

In cases where there is a mix of map-side work and reduce-side work, we get a ClassCastException because we assume homogeneity in the code. We need to fix this correctly. For now this is a workaround.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (HIVE-10912) LLAP: Exception in InputInitializer when creating HiveSplitGenerator
Siddharth Seth created HIVE-10912:
-------------------------------------

             Summary: LLAP: Exception in InputInitializer when creating HiveSplitGenerator
                 Key: HIVE-10912
                 URL: https://issues.apache.org/jira/browse/HIVE-10912
             Project: Hive
          Issue Type: Sub-task
            Reporter: Siddharth Seth

{code}
2015-06-03 13:46:32,212 ERROR [Dispatcher thread: Central] exec.Utilities: Failed to load plan: hdfs://localhost:8020/tmp/hive/sseth/9c4ce145-f7f4-49c4-a615-28ce154f7f1d/hive_2015-06-03_13-46-29_283_23518
java.lang.NullPointerException
	at org.apache.hadoop.hive.ql.exec.GlobalWorkMapFactory.get(GlobalWorkMapFactory.java:85)
	at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:389)
	at org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:299)
	at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.<init>(HiveSplitGenerator.java:94)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
	at org.apache.tez.common.ReflectionUtils.getNewInstance(ReflectionUtils.java:69)
	at org.apache.tez.common.ReflectionUtils.createClazzInstance(ReflectionUtils.java:98)
	at org.apache.tez.dag.app.dag.RootInputInitializerManager.createInitializer(RootInputInitializerManager.java:137)
	at org.apache.tez.dag.app.dag.RootInputInitializerManager.runInputInitializers(RootInputInitializerManager.java:114)
	at org.apache.tez.dag.app.dag.impl.VertexImpl.setupInputInitializerManager(VertexImpl.java:4422)
	at org.apache.tez.dag.app.dag.impl.VertexImpl.access$4300(VertexImpl.java:200)
	at org.apache.tez.dag.app.dag.impl.VertexImpl$InitTransition.handleInitEvent(VertexImpl.java:3271)
	at org.apache.tez.dag.app.dag.impl.VertexImpl$InitTransition.transition(VertexImpl.java:3221)
	at org.apache.tez.dag.app.dag.impl.VertexImpl$InitTransition.transition(VertexImpl.java:3202)
	at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
	at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
	at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
	at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
	at org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:57)
	at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:1850)
	at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:199)
	at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:2001)
	at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:1987)
	at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
	at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:114)
	at java.lang.Thread.run(Thread.java:745)
{code}
[jira] [Created] (HIVE-10908) Hive on Tez: SMB join needs to work with different types of work items (map side with reduce side)
Vikram Dixit K created HIVE-10908:
-------------------------------------

             Summary: Hive on Tez: SMB join needs to work with different types of work items (map side with reduce side)
                 Key: HIVE-10908
                 URL: https://issues.apache.org/jira/browse/HIVE-10908
             Project: Hive
          Issue Type: Improvement
          Components: Tez
    Affects Versions: 1.3.0
            Reporter: Vikram Dixit K
            Assignee: Vikram Dixit K

This is related to HIVE-10907. This is going to be the actual enhancement/fix.
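For context, a sort-merge-bucket (SMB) join of the kind these two tickets concern is typically set up along the following lines. This is an illustrative sketch only, not taken from the tickets: the table and column names are made up, though the `set` properties are real Hive configuration keys.

```sql
-- Hypothetical tables; both must be bucketed and sorted on the join key
-- (same bucket count or a multiple) for an SMB join to apply.
create table big   (id int, v string)
  clustered by (id) sorted by (id asc) into 32 buckets stored as orc;
create table small (id int, w string)
  clustered by (id) sorted by (id asc) into 32 buckets stored as orc;

-- Enforce bucketing/sorting on insert and enable SMB-join conversion.
set hive.enforce.bucketing=true;
set hive.enforce.sorting=true;
set hive.optimize.bucketmapjoin=true;
set hive.optimize.bucketmapjoin.sortedmerge=true;
set hive.auto.convert.sortmerge.join=true;

-- A join on the bucketing/sort key is eligible for SMB conversion.
select b.id, b.v, s.w
from big b join small s on b.id = s.id;
```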
[jira] [Created] (HIVE-10910) Alter table drop partition queries in encrypted zone failing to remove data from HDFS
Aswathy Chellammal Sreekumar created HIVE-10910:
-------------------------------------

             Summary: Alter table drop partition queries in encrypted zone failing to remove data from HDFS
                 Key: HIVE-10910
                 URL: https://issues.apache.org/jira/browse/HIVE-10910
             Project: Hive
          Issue Type: Bug
          Components: Hive
    Affects Versions: 1.2.0
            Reporter: Aswathy Chellammal Sreekumar
            Assignee: Eugene Koifman

An alter table query trying to drop a partition removes the metadata of the partition but fails to remove the data from HDFS:

hive> create table table_1(name string, age int, gpa double) partitioned by (b string) stored as textfile;
OK
Time taken: 0.732 seconds
hive> alter table table_1 add partition (b='2010-10-10');
OK
Time taken: 0.496 seconds
hive> show partitions table_1;
OK
b=2010-10-10
Time taken: 0.781 seconds, Fetched: 1 row(s)
hive> alter table table_1 drop partition (b='2010-10-10');
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Got exception: java.io.IOException Failed to move to trash: hdfs://ip-address:8020/warehouse-dir/table_1/b=2010-10-10
hive> show partitions table_1;
OK
Time taken: 0.622 seconds
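The failure above comes from moving the partition data to the HDFS trash, which lives outside the encryption zone. A workaround that is sometimes possible is to bypass the trash with the PURGE clause; this is a sketch reusing the table from the report, and it assumes a Hive version that supports PURGE on DROP PARTITION.

```sql
-- PURGE deletes the partition data immediately instead of moving it to
-- the HDFS trash (so the data cannot be recovered afterwards).
alter table table_1 drop partition (b='2010-10-10') purge;
```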
Re: Creating branch-1
Hadoop uses a Target Version field. Not sure if this was done for all projects.

+Vinod

On Jun 3, 2015, at 9:16 AM, Alan Gates <alanfga...@gmail.com> wrote:

I don't think using Affects Version will work because it is used to list which versions of Hive the bug affects, unless you're proposing being able to parse the affected version into a branch (i.e. 1.3.0 = branch-1). I like the idea of customizing JIRA, though I don't know how hard it is. We could also use the labels field. It would run against master by default and you could also add a label to run against an additional branch. It would have to find a patch matching that branch in order to run.

Alan.

Thejas Nair <thejas.n...@gmail.com>  June 3, 2015 at 7:51

Thanks for the insights Sergio! Using 'Affects Version' sounds like a good idea. However, for the case where it needs to be executed against both branch-1 and master, I think it would be more intuitive to use "Affects Version/s: branch-master branch-1", as the version number in the master branch will keep increasing. We might be able to request a custom field in jira (say "Test branches") for this as well. But we could probably start with the 'Affects Version' approach.

Sergio Pena <sergio.p...@cloudera.com>  June 2, 2015 at 15:03

Hi Alan,

Currently, the test system executes tests on a specific branch only if there is a Jenkins job assigned to it, like trunk or spark. Any other branch will not work. We will need to create a job for branch-1, modify jenkins-submit-build.sh to add the new profile, and add a new properties file to the Jenkins instance that contains branch information. This is a little tedious for every branch we create. Also, I don't think the test system will grab two patches (branch-1 and master) to execute the tests on different branches. It will get the latest one you uploaded.

What about using the 'Affects Version/s' field of the ticket to specify which branches the patch needs to be executed against? Or, as you said, use hints in the comments. For instance:

- Affects Version/s: branch-1         # Tests on branch-1 only
- Affects Version/s: 2.0.0 branch-1   # Tests on branch-1 and master
- Affects Version/s: branch-spark     # Tests on branch-spark only

If we use 'branch-xxx' as a naming convention for our branches, then we can detect the branch from the ticket details. And if an x.x.x version is specified, then just execute the tests from master.

Also, branch-1 would need to be executed with MR1, right? Then the patch file would need to be named 'HIVE--mr1.patch' so that it uses the MR1 environment.

Right now the code that parses this info is in the process_jira function in 'jenkins-common.sh', and it is called by 'jenkins-submit-build.sh'. We can parse different branches there, and let jenkins-submit-build.sh call the correct job with specific branch details.

Any other ideas?
- Sergio

Alan Gates <alanfga...@gmail.com>  June 1, 2015 at 16:19

Based on our discussion and vote last week I'm working on creating branch-1. I plan to make the branch tomorrow. If anyone has a large commit they don't want to have to commit twice and they are close to committing it, let me know so I can make sure it gets in before I branch.

I'll also be updating https://cwiki.apache.org/confluence/display/Hive/HowToContribute to clarify how to handle feature and bug-fix patches on master and branch-1.

Also, we will need to make sure patches can be tested against master and branch-1. If I understand correctly, the test system today will run a patch against a branch instead of master if the patch is named with the branch name. There are a couple of issues with this. One, people will often want to submit two versions of patches and have them both tested (one against master and one against branch-1) rather than one or the other. The second is we will want a way for one patch to be tested against both when appropriate. The first case could be handled by the system picking up both branch-1 and master patches and running them automatically. The second could be handled by hints in the comments so the system knows to run both. I'm open to other suggestions as well. Can someone familiar with the testing code point to where I'd look to see what it would take to make this work?

Alan.
[jira] [Created] (HIVE-10911) Add support for date datatype in the value based windowing function
Aihua Xu created HIVE-10911:
-------------------------------------

             Summary: Add support for date datatype in the value based windowing function
                 Key: HIVE-10911
                 URL: https://issues.apache.org/jira/browse/HIVE-10911
             Project: Hive
          Issue Type: Sub-task
          Components: PTF-Windowing
            Reporter: Aihua Xu

Currently the date datatype is not supported in value-based windowing functions. For the following query, with hiredate of date type, an exception will be thrown:

{{select deptno, ename, hiredate, sal, sum(sal) over (partition by deptno order by hiredate range 90 preceding) from emp;}}

It would be valuable to support this type, using the number of days as the value difference.
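Until date is supported directly, one workaround in the spirit of the ticket (day counts as the value difference) is to order by an integer day offset instead of the date column itself. This is an illustrative sketch using the report's emp/hiredate schema; the epoch anchor date is an arbitrary choice.

```sql
-- datediff() turns the date into an integer day count, which value-based
-- windowing already accepts; "range 90 preceding" then means 90 days.
select deptno, ename, hiredate, sal,
       sum(sal) over (partition by deptno
                      order by datediff(hiredate, '1970-01-01')
                      range 90 preceding)
from emp;
```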
[jira] [Created] (HIVE-10909) Make TestFilterHooks robust
Ashutosh Chauhan created HIVE-10909:
-------------------------------------

             Summary: Make TestFilterHooks robust
                 Key: HIVE-10909
                 URL: https://issues.apache.org/jira/browse/HIVE-10909
             Project: Hive
          Issue Type: Test
          Components: Metastore, Tests
    Affects Versions: 1.2.0
            Reporter: Hari Sankar Sivarama Subramaniyan
            Assignee: Ashutosh Chauhan

Currently it sometimes fails when run in sequential order because of left-over state from previous tests.
[jira] [Created] (HIVE-10913) LLAP: cache QF counters have wrong values
Sergey Shelukhin created HIVE-10913:
-------------------------------------

             Summary: LLAP: cache QF counters have wrong values
                 Key: HIVE-10913
                 URL: https://issues.apache.org/jira/browse/HIVE-10913
             Project: Hive
          Issue Type: Sub-task
            Reporter: Sergey Shelukhin

The counters also do not capture enough data.
[jira] [Created] (HIVE-10917) ORC fails to read table with a 38Gb ORC file
Gopal V created HIVE-10917:
-------------------------------------

             Summary: ORC fails to read table with a 38Gb ORC file
                 Key: HIVE-10917
                 URL: https://issues.apache.org/jira/browse/HIVE-10917
             Project: Hive
          Issue Type: Bug
          Components: File Formats
    Affects Versions: 1.3.0
            Reporter: Gopal V

{code}
hive> set mapreduce.input.fileinputformat.split.maxsize=1;
hive> set mapreduce.input.fileinputformat.split.maxsize=1;
hive> alter table lineitem concatenate;
..
hive> dfs -ls /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem;
Found 12 items
-rwxr-xr-x   3 gopal supergroup  41368976599 2015-06-03 15:49 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/00_0
-rwxr-xr-x   3 gopal supergroup  36226719673 2015-06-03 15:48 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/01_0
-rwxr-xr-x   3 gopal supergroup  27544042018 2015-06-03 15:50 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/02_0
-rwxr-xr-x   3 gopal supergroup  23147063608 2015-06-03 15:44 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/03_0
-rwxr-xr-x   3 gopal supergroup  21079035936 2015-06-03 15:44 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/04_0
-rwxr-xr-x   3 gopal supergroup  13813961419 2015-06-03 15:43 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/05_0
-rwxr-xr-x   3 gopal supergroup   8155299977 2015-06-03 15:40 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/06_0
-rwxr-xr-x   3 gopal supergroup   6264478613 2015-06-03 15:40 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/07_0
-rwxr-xr-x   3 gopal supergroup   4653393054 2015-06-03 15:40 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/08_0
-rwxr-xr-x   3 gopal supergroup   3621672928 2015-06-03 15:39 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/09_0
-rwxr-xr-x   3 gopal supergroup   1460919310 2015-06-03 15:38 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/10_0
-rwxr-xr-x   3 gopal supergroup    485129789 2015-06-03 15:38 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/11_0
{code}

Errors without PPD. Suspicions about ORC stripe padding and stream offsets in the stream information, when concatenating.

{code}
Caused by: java.io.EOFException: Read past end of RLE integer from compressed stream Stream for column 1 kind DATA position: 1608840 length: 1608840 range: 0 offset: 1608840 limit: 1608840 range 0 = 0 to 1608840 uncompressed: 36845 to 36845
	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readValues(RunLengthIntegerReaderV2.java:56)
	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.next(RunLengthIntegerReaderV2.java:302)
	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.nextVector(RunLengthIntegerReaderV2.java:346)
	at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$LongTreeReader.nextVector(TreeReaderFactory.java:582)
	at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StructTreeReader.nextVector(TreeReaderFactory.java:2026)
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1070)
	... 25 more
{code}
[jira] [Created] (HIVE-10921) Change trunk pom version to reflect the branch-1 split
Sergey Shelukhin created HIVE-10921:
-------------------------------------

             Summary: Change trunk pom version to reflect the branch-1 split
                 Key: HIVE-10921
                 URL: https://issues.apache.org/jira/browse/HIVE-10921
             Project: Hive
          Issue Type: Bug
    Affects Versions: 2.0.0
            Reporter: Sergey Shelukhin
            Assignee: Sergey Shelukhin
         Attachments: HIVE-10921.patch
[jira] [Created] (HIVE-10923) encryption_join_with_different_encryption_keys.q fails on CentOS 6
Pengcheng Xiong created HIVE-10923:
-------------------------------------

             Summary: encryption_join_with_different_encryption_keys.q fails on CentOS 6
                 Key: HIVE-10923
                 URL: https://issues.apache.org/jira/browse/HIVE-10923
             Project: Hive
          Issue Type: Bug
            Reporter: Pengcheng Xiong

Here is the stack trace:

{code}
Task with the most failures(4):
-----
Task ID:
  task_1433377676690_0015_m_00

URL:
  http://ip-10-0-0-249.ec2.internal:44717/taskdetails.jsp?jobid=job_1433377676690_0015&tipid=task_1433377676690_0015_m_00
-----
Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {key:238,value:val_238}
	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:172)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {key:238,value:val_238}
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:518)
	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:163)
	... 8 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.ipc.RemoteException(java.io.IOException): java.security.InvalidKeyException: Illegal key size
	at org.apache.hadoop.crypto.JceAesCtrCryptoCodec$JceAesCtrCipher.init(JceAesCtrCryptoCodec.java:116)
	at org.apache.hadoop.crypto.key.KeyProviderCryptoExtension$DefaultCryptoExtension.generateEncryptedKey(KeyProviderCryptoExtension.java:264)
	at org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.generateEncryptedKey(KeyProviderCryptoExtension.java:371)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.generateEncryptedDataEncryptionKey(FSNamesystem.java:2489)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2620)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2519)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:566)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:394)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
Caused by: java.security.InvalidKeyException: Illegal key size
	at javax.crypto.Cipher.checkCryptoPerm(Cipher.java:1024)
	at javax.crypto.Cipher.implInit(Cipher.java:790)
	at javax.crypto.Cipher.chooseProvider(Cipher.java:849)
	at javax.crypto.Cipher.init(Cipher.java:1348)
	at javax.crypto.Cipher.init(Cipher.java:1282)
	at org.apache.hadoop.crypto.JceAesCtrCryptoCodec$JceAesCtrCipher.init(JceAesCtrCryptoCodec.java:113)
	... 16 more
	at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:577)
	at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:675)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
	at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
	at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97)
	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:162)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:508)
	... 9 more
Caused by:
[jira] [Created] (HIVE-10924) add support for MERGE statement
Eugene Koifman created HIVE-10924:
-------------------------------------

             Summary: add support for MERGE statement
                 Key: HIVE-10924
                 URL: https://issues.apache.org/jira/browse/HIVE-10924
             Project: Hive
          Issue Type: Bug
          Components: Query Planning, Query Processor
    Affects Versions: 1.2.0
            Reporter: Eugene Koifman
            Assignee: Eugene Koifman

Add support for

MERGE INTO tbl USING src ON … WHEN MATCHED THEN ... WHEN NOT MATCHED THEN ...
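As a sketch of the syntax being requested, here is what such a statement looks like in standard SQL (SQL:2003) MERGE. The table and column names are hypothetical, and Hive did not support this at the time of the ticket.

```sql
-- Upsert rows from src into tbl keyed on id: update the value when the
-- key already exists, insert a new row when it does not.
merge into tbl
using src
on tbl.id = src.id
when matched then
  update set val = src.val
when not matched then
  insert values (src.id, src.val);
```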
[jira] [Created] (HIVE-10916) ORC fails to read table with a 38Gb ORC file
Gopal V created HIVE-10916:
-------------------------------------

             Summary: ORC fails to read table with a 38Gb ORC file
                 Key: HIVE-10916
                 URL: https://issues.apache.org/jira/browse/HIVE-10916
             Project: Hive
          Issue Type: Bug
          Components: File Formats
    Affects Versions: 1.3.0
            Reporter: Gopal V

{code}
hive> set mapreduce.input.fileinputformat.split.maxsize=1;
hive> set mapreduce.input.fileinputformat.split.maxsize=1;
hive> alter table lineitem concatenate;
..
hive> dfs -ls /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem;
Found 12 items
-rwxr-xr-x   3 gopal supergroup  41368976599 2015-06-03 15:49 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/00_0
-rwxr-xr-x   3 gopal supergroup  36226719673 2015-06-03 15:48 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/01_0
-rwxr-xr-x   3 gopal supergroup  27544042018 2015-06-03 15:50 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/02_0
-rwxr-xr-x   3 gopal supergroup  23147063608 2015-06-03 15:44 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/03_0
-rwxr-xr-x   3 gopal supergroup  21079035936 2015-06-03 15:44 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/04_0
-rwxr-xr-x   3 gopal supergroup  13813961419 2015-06-03 15:43 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/05_0
-rwxr-xr-x   3 gopal supergroup   8155299977 2015-06-03 15:40 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/06_0
-rwxr-xr-x   3 gopal supergroup   6264478613 2015-06-03 15:40 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/07_0
-rwxr-xr-x   3 gopal supergroup   4653393054 2015-06-03 15:40 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/08_0
-rwxr-xr-x   3 gopal supergroup   3621672928 2015-06-03 15:39 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/09_0
-rwxr-xr-x   3 gopal supergroup   1460919310 2015-06-03 15:38 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/10_0
-rwxr-xr-x   3 gopal supergroup    485129789 2015-06-03 15:38 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/11_0
{code}

Errors without PPD. Suspicions about ORC stripe padding and stream offsets in the stream information, when concatenating.

{code}
Caused by: java.io.EOFException: Read past end of RLE integer from compressed stream Stream for column 1 kind DATA position: 1608840 length: 1608840 range: 0 offset: 1608840 limit: 1608840 range 0 = 0 to 1608840 uncompressed: 36845 to 36845
	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readValues(RunLengthIntegerReaderV2.java:56)
	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.next(RunLengthIntegerReaderV2.java:302)
	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.nextVector(RunLengthIntegerReaderV2.java:346)
	at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$LongTreeReader.nextVector(TreeReaderFactory.java:582)
	at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StructTreeReader.nextVector(TreeReaderFactory.java:2026)
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1070)
	... 25 more
{code}
[jira] [Created] (HIVE-10915) ORC fails to read table with a 38Gb ORC file
Gopal V created HIVE-10915:
-------------------------------------

             Summary: ORC fails to read table with a 38Gb ORC file
                 Key: HIVE-10915
                 URL: https://issues.apache.org/jira/browse/HIVE-10915
             Project: Hive
          Issue Type: Bug
          Components: File Formats
    Affects Versions: 1.3.0
            Reporter: Gopal V

{code}
hive> set mapreduce.input.fileinputformat.split.maxsize=1;
hive> set mapreduce.input.fileinputformat.split.maxsize=1;
hive> alter table lineitem concatenate;
..
hive> dfs -ls /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem;
Found 12 items
-rwxr-xr-x   3 gopal supergroup  41368976599 2015-06-03 15:49 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/00_0
-rwxr-xr-x   3 gopal supergroup  36226719673 2015-06-03 15:48 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/01_0
-rwxr-xr-x   3 gopal supergroup  27544042018 2015-06-03 15:50 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/02_0
-rwxr-xr-x   3 gopal supergroup  23147063608 2015-06-03 15:44 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/03_0
-rwxr-xr-x   3 gopal supergroup  21079035936 2015-06-03 15:44 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/04_0
-rwxr-xr-x   3 gopal supergroup  13813961419 2015-06-03 15:43 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/05_0
-rwxr-xr-x   3 gopal supergroup   8155299977 2015-06-03 15:40 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/06_0
-rwxr-xr-x   3 gopal supergroup   6264478613 2015-06-03 15:40 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/07_0
-rwxr-xr-x   3 gopal supergroup   4653393054 2015-06-03 15:40 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/08_0
-rwxr-xr-x   3 gopal supergroup   3621672928 2015-06-03 15:39 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/09_0
-rwxr-xr-x   3 gopal supergroup   1460919310 2015-06-03 15:38 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/10_0
-rwxr-xr-x   3 gopal supergroup    485129789 2015-06-03 15:38 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/11_0
{code}

Errors without PPD. Suspicious offsets in the stream information:

{code}
Caused by: java.io.EOFException: Read past end of RLE integer from compressed stream Stream for column 1 kind DATA position: 1608840 length: 1608840 range: 0 offset: 1608840 limit: 1608840 range 0 = 0 to 1608840 uncompressed: 36845 to 36845
	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readValues(RunLengthIntegerReaderV2.java:56)
	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.next(RunLengthIntegerReaderV2.java:302)
	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.nextVector(RunLengthIntegerReaderV2.java:346)
	at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$LongTreeReader.nextVector(TreeReaderFactory.java:582)
	at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StructTreeReader.nextVector(TreeReaderFactory.java:2026)
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1070)
	... 25 more
{code}
Re: Creating branch-1
Meanwhile, should we switch the trunk HiveQA builds to use the hadoop-2 profile? I see they are still running hadoop-1 :)

From: Vinod Kumar Vavilapalli <vino...@hortonworks.com>
Reply-To: dev@hive.apache.org
Date: Wednesday, June 3, 2015 at 12:17
To: dev@hive.apache.org
Subject: Re: Creating branch-1

Hadoop uses a Target Version field. Not sure if this was done for all projects.

+Vinod

On Jun 3, 2015, at 9:16 AM, Alan Gates <alanfga...@gmail.com> wrote:

I don't think using Affects Version will work because it is used to list which versions of Hive the bug affects, unless you're proposing being able to parse the affected version into a branch (i.e. 1.3.0 = branch-1). I like the idea of customizing JIRA, though I don't know how hard it is. We could also use the labels field. It would run against master by default and you could also add a label to run against an additional branch. It would have to find a patch matching that branch in order to run.

Alan.

Thejas Nair <thejas.n...@gmail.com>  June 3, 2015 at 7:51

Thanks for the insights Sergio! Using 'Affects Version' sounds like a good idea. However, for the case where it needs to be executed against both branch-1 and master, I think it would be more intuitive to use "Affects Version/s: branch-master branch-1", as the version number in the master branch will keep increasing. We might be able to request a custom field in jira (say "Test branches") for this as well. But we could probably start with the 'Affects Version' approach.

Sergio Pena <sergio.p...@cloudera.com>  June 2, 2015 at 15:03

Hi Alan,

Currently, the test system executes tests on a specific branch only if there is a Jenkins job assigned to it, like trunk or spark. Any other branch will not work. We will need to create a job for branch-1, modify jenkins-submit-build.sh to add the new profile, and add a new properties file to the Jenkins instance that contains branch information. This is a little tedious for every branch we create. Also, I don't think the test system will grab two patches (branch-1 and master) to execute the tests on different branches. It will get the latest one you uploaded.

What about using the 'Affects Version/s' field of the ticket to specify which branches the patch needs to be executed against? Or, as you said, use hints in the comments. For instance:

- Affects Version/s: branch-1         # Tests on branch-1 only
- Affects Version/s: 2.0.0 branch-1   # Tests on branch-1 and master
- Affects Version/s: branch-spark     # Tests on branch-spark only

If we use 'branch-xxx' as a naming convention for our branches, then we can detect the branch from the ticket details. And if an x.x.x version is specified, then just execute the tests from master.

Also, branch-1 would need to be executed with MR1, right? Then the patch file would need to be named 'HIVE--mr1.patch' so that it uses the MR1 environment.

Right now the code that parses this info is in the process_jira function in 'jenkins-common.sh', and it is called by 'jenkins-submit-build.sh'. We can parse different branches there, and let jenkins-submit-build.sh call the correct job with specific branch details.

Any other ideas?
- Sergio

Alan Gates <alanfga...@gmail.com>  June 1, 2015 at 16:19

Based on our discussion and vote last week I'm working on creating branch-1. I plan to make the branch tomorrow. If anyone has a large commit they don't want to have to commit twice and they are close to committing it, let me know so I can make sure it gets in before I branch.

I'll also be updating https://cwiki.apache.org/confluence/display/Hive/HowToContribute to clarify how to handle feature and bug-fix patches on master and branch-1.

Also, we will need to make sure patches can be tested against master and branch-1. If I understand correctly, the test system today will run a patch against a branch instead of master if the patch is named with the branch name. There are a couple of issues with this. One, people will often want to submit two versions of patches and have them both tested (one against master and one against branch-1) rather than one or the other. The second is we will want a way for one patch to be tested against both when appropriate. The first case could be handled by the system picking up both branch-1 and master patches and running them automatically. The second could be handled by hints in the comments so the system knows to run both. I'm open to other suggestions as well. Can someone familiar with the testing code point to where I'd look to see what it would take to make this work?

Alan.
Re: Creating branch-1
Do the hadoop jenkins scripts use some regex match on 'target version' to identify the branch to be used?

On Wed, Jun 3, 2015 at 12:17 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote:
Hadoop uses a Target Version field. Not sure if this was done for all projects. +Vinod

On Jun 3, 2015, at 9:16 AM, Alan Gates alanfga...@gmail.com wrote:
I don't think using Affects Version will work because it is used to list which versions of Hive the bug affects, unless you're proposing being able to parse the affected version into a branch (i.e. 1.3.0 = branch-1). I like the idea of customizing JIRA, though I don't know how hard that is. We could also use the labels field. It would run against master by default, and you could add a label to run against an additional branch. It would have to find a patch matching that branch in order to run. Alan.

Thejas Nair thejas.n...@gmail.com, June 3, 2015 at 7:51:
Thanks for the insights Sergio! Using 'Affects Version' sounds like a good idea. However, for the case where it needs to be executed against both branch-1 and master, I think it would be more intuitive to use "Affects Version/s: branch-master branch-1", as the version number on the master branch will keep increasing. We might be able to request a custom field in JIRA (say, "Test branches") for this as well. But we could probably start with the 'Affects Version' approach.

Sergio Pena sergio.p...@cloudera.com, June 2, 2015 at 15:03:
Hi Alan, Currently, the test system executes tests on a specific branch only if there is a Jenkins job assigned to it, like trunk or spark. Any other branch will not work. We would need to create a job for branch-1, modify jenkins-submit-build.sh to add the new profile, and add a new properties file to the Jenkins instance that contains the branch information. This is a little tedious for every branch we create. Also, I don't think the test system will grab two patches (branch-1 and master) to execute the tests on different branches. It will get the latest one you uploaded. What if we use the 'Affects Version/s' field of the ticket to specify which branches the patch needs to be executed against? Or, as you said, use hints in the comments. For instance:
- Affects Version/s: branch-1 # Tests on branch-1 only
- Affects Version/s: 2.0.0 branch-1 # Tests on branch-1 and master
- Affects Version/s: branch-spark # Tests on branch-spark only
If we use 'branch-xxx' as a naming convention for our branches, then we can detect the branch from the ticket details. And if an x.x.x version is specified, then just execute the tests from master. Also, branch-1 would need to be executed with MR1, right? Then the patch file would need to be named 'HIVE--mr1.patch' so that it uses the MR1 environment. Right now the code that parses this info is in the process_jira function in 'jenkins-common.sh', and it is called by 'jenkins-submit-build.sh'. We can parse different branches there and let jenkins-submit-build.sh call the correct job with the specific branch details. Any other ideas? - Sergio

Alan Gates alanfga...@gmail.com, June 1, 2015 at 16:19:
Based on our discussion and vote last week I'm working on creating branch-1. I plan to make the branch tomorrow. If anyone has a large commit they don't want to have to commit twice and is close to committing it, let me know so I can make sure it gets in before I branch. I'll also be updating https://cwiki.apache.org/confluence/display/Hive/HowToContribute to clarify how to handle feature and bug fix patches on master and branch-1. Also, we will need to make sure patches can be tested against master and branch-1. If I understand correctly, the test system today will run a patch against a branch instead of master if the patch is named with the branch name. There are a couple of issues with this. One, people will often want to submit two versions of patches and have them both tested (one against master and one against branch-1) rather than one or the other. The second is that we will want a way for one patch to be tested against both when appropriate. The first case could be handled by the system picking up both the branch-1 and master patches and running them automatically. The second could be handled by hints in the comments telling the system it needs to run both. I'm open to other suggestions as well. Can someone familiar with the testing code point me to where I'd look to see what it would take to make this work? Alan.
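The branch-selection rule Sergio proposes (a 'branch-xxx' entry in Affects Version/s selects that branch, a plain x.x.x version maps to master) could be sketched as follows. This is an illustrative sketch only; the class and method names are hypothetical and this is not the actual jenkins-common.sh/process_jira logic.

```java
import java.util.ArrayList;
import java.util.List;

public class BranchFromAffectsVersion {
    // Map 'Affects Version/s' entries to the list of branches to test:
    // "branch-xxx" selects that branch; a numeric version selects master.
    static List<String> branchesToTest(List<String> affectsVersions) {
        List<String> branches = new ArrayList<>();
        for (String v : affectsVersions) {
            if (v.startsWith("branch-")) {
                // e.g. "branch-1", "branch-spark"
                if (!branches.contains(v)) branches.add(v);
            } else if (v.matches("\\d+(\\.\\d+)*")) {
                // numeric version such as "2.0.0" => run against master
                if (!branches.contains("master")) branches.add("master");
            }
        }
        // default when the field is empty or unrecognized: master only
        if (branches.isEmpty()) branches.add("master");
        return branches;
    }

    public static void main(String[] args) {
        // "2.0.0 branch-1" => tests on both master and branch-1
        System.out.println(branchesToTest(List.of("2.0.0", "branch-1")));
        // prints [master, branch-1]
    }
}
```

A labels-based scheme, as Alan suggests, would only need the same matching applied to the labels field instead of Affects Version/s.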
[jira] [Created] (HIVE-10918) ORC fails to read table with a 38Gb ORC file
Gopal V created HIVE-10918: -- Summary: ORC fails to read table with a 38Gb ORC file Key: HIVE-10918 URL: https://issues.apache.org/jira/browse/HIVE-10918 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 1.3.0 Reporter: Gopal V
{code}
hive> set mapreduce.input.fileinputformat.split.maxsize=1;
hive> set mapreduce.input.fileinputformat.split.maxsize=1;
hive> alter table lineitem concatenate;
..
hive> dfs -ls /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem;
Found 12 items
-rwxr-xr-x 3 gopal supergroup 41368976599 2015-06-03 15:49 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/00_0
-rwxr-xr-x 3 gopal supergroup 36226719673 2015-06-03 15:48 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/01_0
-rwxr-xr-x 3 gopal supergroup 27544042018 2015-06-03 15:50 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/02_0
-rwxr-xr-x 3 gopal supergroup 23147063608 2015-06-03 15:44 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/03_0
-rwxr-xr-x 3 gopal supergroup 21079035936 2015-06-03 15:44 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/04_0
-rwxr-xr-x 3 gopal supergroup 13813961419 2015-06-03 15:43 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/05_0
-rwxr-xr-x 3 gopal supergroup 8155299977 2015-06-03 15:40 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/06_0
-rwxr-xr-x 3 gopal supergroup 6264478613 2015-06-03 15:40 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/07_0
-rwxr-xr-x 3 gopal supergroup 4653393054 2015-06-03 15:40 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/08_0
-rwxr-xr-x 3 gopal supergroup 3621672928 2015-06-03 15:39 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/09_0
-rwxr-xr-x 3 gopal supergroup 1460919310 2015-06-03 15:38 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/10_0
-rwxr-xr-x 3 gopal supergroup 485129789 2015-06-03 15:38 /apps/hive/warehouse/tpch_orc_flat_1000.db/lineitem/11_0
{code}
Errors without PPD. Suspicions about ORC stripe padding and stream offsets in the stream information when concatenating.
{code}
Caused by: java.io.EOFException: Read past end of RLE integer from compressed stream Stream for column 1 kind DATA position: 1608840 length: 1608840 range: 0 offset: 1608840 limit: 1608840 range 0 = 0 to 1608840 uncompressed: 36845 to 36845
at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readValues(RunLengthIntegerReaderV2.java:56)
at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.next(RunLengthIntegerReaderV2.java:302)
at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.nextVector(RunLengthIntegerReaderV2.java:346)
at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$LongTreeReader.nextVector(TreeReaderFactory.java:582)
at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StructTreeReader.nextVector(TreeReaderFactory.java:2026)
at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1070)
... 25 more
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-10922) In HS2 doAs=false mode, file system related errors in one query causes other failures
Thejas M Nair created HIVE-10922: Summary: In HS2 doAs=false mode, file system related errors in one query causes other failures Key: HIVE-10922 URL: https://issues.apache.org/jira/browse/HIVE-10922 Project: Hive Issue Type: Bug Affects Versions: 1.2.0, 1.0.0, 1.1.0 Reporter: Thejas M Nair Assignee: Thejas M Nair The Warehouse class has a few methods that close the file system object on errors. With doAs=false, since all queries use the same HS2 ugi, the filesystem object is shared across queries/threads. When close() gets called on one filesystem object, the filesystem object used in other threads also gets closed, and any files registered for deletion on exit get deleted. There is also no close being done in the happy code path. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
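A minimal analogue of this failure mode, with no Hadoop dependency: Hadoop's FileSystem.get() returns a cached instance shared per (scheme, authority, ugi), so with doAs=false every query thread holds the same object and one thread's close() invalidates it for the rest. The class and method names below are illustrative, not Hive or Hadoop code.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedHandleDemo {
    // Stand-in for a FileSystem: usable until someone calls close().
    static class Handle {
        private boolean closed = false;
        void close() { closed = true; }
        String read() {
            if (closed) throw new IllegalStateException("Filesystem closed");
            return "data";
        }
    }

    // Cache keyed by user, mirroring how FileSystem.CACHE is keyed by ugi.
    private static final Map<String, Handle> CACHE = new HashMap<>();

    static synchronized Handle get(String user) {
        return CACHE.computeIfAbsent(user, u -> new Handle());
    }

    public static void main(String[] args) {
        Handle query1 = get("hive"); // doAs=false: every query runs as "hive"
        Handle query2 = get("hive"); // same cached instance
        query1.close();              // error handling in one query...
        try {
            query2.read();           // ...breaks an unrelated query
        } catch (IllegalStateException e) {
            System.out.println("query2 failed: " + e.getMessage());
        }
    }
}
```

This is why the fix is to stop calling close() on the shared object (or disable the cache), rather than to synchronize the callers.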
[jira] [Created] (HIVE-10914) LLAP: fix hadoop-1 build for good by removing llap-server from hadoop-1 build
Sergey Shelukhin created HIVE-10914: --- Summary: LLAP: fix hadoop-1 build for good by removing llap-server from hadoop-1 build Key: HIVE-10914 URL: https://issues.apache.org/jira/browse/HIVE-10914 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin LLAP won't ever work with hadoop 1, so there is no point in building it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-10920) LLAP: elevator reads some useless data even if all RGs are eliminated by SARG
Sergey Shelukhin created HIVE-10920: --- Summary: LLAP: elevator reads some useless data even if all RGs are eliminated by SARG Key: HIVE-10920 URL: https://issues.apache.org/jira/browse/HIVE-10920 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: llap -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-10919) Windows: create table with JsonSerDe failed via beeline unless you add hcatalog core jar to classpath
Hari Sankar Sivarama Subramaniyan created HIVE-10919: Summary: Windows: create table with JsonSerDe failed via beeline unless you add hcatalog core jar to classpath Key: HIVE-10919 URL: https://issues.apache.org/jira/browse/HIVE-10919 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Before we run HiveServer2 tests, we create tables via beeline, and 'create table' with JsonSerDe failed on Windows. It works on Linux:
{noformat}
0: jdbc:hive2://localhost:10001> create external table all100kjson(
0: jdbc:hive2://localhost:10001> s string,
0: jdbc:hive2://localhost:10001> i int,
0: jdbc:hive2://localhost:10001> d double,
0: jdbc:hive2://localhost:10001> m map<string, string>,
0: jdbc:hive2://localhost:10001> bb array<struct<a: int, b: string>>,
0: jdbc:hive2://localhost:10001> t timestamp)
0: jdbc:hive2://localhost:10001> row format serde 'org.apache.hive.hcatalog.data.JsonSerDe'
0: jdbc:hive2://localhost:10001> WITH SERDEPROPERTIES ('timestamp.formats'='-MM-dd\'T\'HH:mm:ss')
0: jdbc:hive2://localhost:10001> STORED AS TEXTFILE
0: jdbc:hive2://localhost:10001> location '/user/hcat/tests/data/all100kjson';
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Cannot validate serde: org.apache.hive.hcatalog.data.JsonSerDe (state=08S01,code=1)
{noformat}
hive.log shows:
{noformat}
2015-05-21 21:59:17,004 ERROR operation.Operation (SQLOperation.java:run(209)) - Error running hive query: org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.
Cannot validate serde: org.apache.hive.hcatalog.data.JsonSerDe at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:315) at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:156) at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:71) at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:206) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:218) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Cannot validate serde: org.apache.hive.hcatalog.data.JsonSerDe at org.apache.hadoop.hive.ql.exec.DDLTask.validateSerDe(DDLTask.java:3871) at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4011) at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:306) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1650) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1409) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1192) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1054) at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:154) ... 
11 more Caused by: java.lang.ClassNotFoundException: Class org.apache.hive.hcatalog.data.JsonSerDe not found at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101) at org.apache.hadoop.hive.ql.exec.DDLTask.validateSerDe(DDLTask.java:3865) ... 21 more {noformat} If you do add the hcatalog jar to classpath, it works:
{noformat}
0: jdbc:hive2://localhost:10001> add jar hdfs:///tmp/testjars/hive-hcatalog-core-1.2.0.2.3.0.0-2079.jar;
INFO : converting to local hdfs:///tmp/testjars/hive-hcatalog-core-1.2.0.2.3.0.0-2079.jar
INFO : Added [/C:/Users/hadoop/AppData/Local/Temp/bc941dac-3bca-4287-a490-8a65c2dac220_resources/hive-hcatalog-core-1.2.0.2.3.0.0-2079.jar] to class path
INFO : Added resources: [hdfs:///tmp/testjars/hive-hcatalog-core-1.2.0.2.3.0.0-2079.jar]
No rows affected (0.304 seconds)
0: jdbc:hive2://localhost:10001> create external table all100kjson(
0: jdbc:hive2://localhost:10001> s string,
0:
Re: Review Request 34752: Beeline-CLI: Implement CLI source command using Beeline functionality
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/34752/#review86368 --- Ship it! Ship It! - Xuefu Zhang On June 1, 2015, 3:17 a.m., cheng xu wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/34752/ --- (Updated June 1, 2015, 3:17 a.m.) Review request for hive, chinna and Xuefu Zhang. Bugs: HIVE-10821 https://issues.apache.org/jira/browse/HIVE-10821 Repository: hive-git Description --- Add source command support for CLI using beeline Diffs - beeline/src/java/org/apache/hive/beeline/BeeLine.java 4a82635 beeline/src/java/org/apache/hive/beeline/Commands.java 4c60525 beeline/src/test/org/apache/hive/beeline/cli/TestHiveCli.java cc0b598 Diff: https://reviews.apache.org/r/34752/diff/ Testing --- Newly created UT passed Thanks, cheng xu
[jira] [Created] (HIVE-10903) Add hive.in.test for HoS tests [Spark Branch]
Rui Li created HIVE-10903: - Summary: Add hive.in.test for HoS tests [Spark Branch] Key: HIVE-10903 URL: https://issues.apache.org/jira/browse/HIVE-10903 Project: Hive Issue Type: Test Reporter: Rui Li -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-10905) QuitExit fails ending with ';' [beeline-cli Branch]
Chinna Rao Lalam created HIVE-10905: --- Summary: QuitExit fails ending with ';' [beeline-cli Branch] Key: HIVE-10905 URL: https://issues.apache.org/jira/browse/HIVE-10905 Project: Hive Issue Type: Bug Affects Versions: beeline-cli-branch Reporter: Chinna Rao Lalam Assignee: Chinna Rao Lalam In the old CLI, quit and exit expect a trailing ';'. In the updated CLI, quit and exit work without a trailing ';', but quit and exit ending with ';' throw an exception. Support quit and exit ending with ';' for compatibility. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
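The compatibility behavior this ticket asks for (accepting quit/exit with or without a trailing ';') could be handled by normalizing the input line before command matching. A minimal illustrative sketch, assuming a hypothetical normalize step; this is not actual Beeline code.

```java
public class CommandNormalizer {
    // Strip trailing semicolons so "quit;" matches the same command as "quit",
    // matching the old CLI's tolerance of a terminating ';'.
    static String normalize(String line) {
        String trimmed = line.trim();
        while (trimmed.endsWith(";")) {
            trimmed = trimmed.substring(0, trimmed.length() - 1).trim();
        }
        return trimmed;
    }

    public static void main(String[] args) {
        System.out.println(normalize("quit;")); // prints quit
        System.out.println(normalize("exit"));  // prints exit
    }
}
```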
[jira] [Created] (HIVE-10904) Use beeline-log4j.properties for migrated CLI [beeline-cli Branch]
Chinna Rao Lalam created HIVE-10904: --- Summary: Use beeline-log4j.properties for migrated CLI [beeline-cli Branch] Key: HIVE-10904 URL: https://issues.apache.org/jira/browse/HIVE-10904 Project: Hive Issue Type: Bug Affects Versions: beeline-cli-branch Reporter: Chinna Rao Lalam Assignee: Chinna Rao Lalam The updated CLI prints logs on the console. Use beeline-log4j.properties to redirect them to a file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Hive-0.14 - Build # 973 - Fixed
Changes for Build #972 Changes for Build #973 No tests ran. The Apache Jenkins build system has built Hive-0.14 (build #973) Status: Fixed Check console output at https://builds.apache.org/job/Hive-0.14/973/ to view the results.
[jira] [Created] (HIVE-10925) Non-static threadlocals in metastore code can potentially cause memory leak
Vaibhav Gumashta created HIVE-10925: --- Summary: Non-static threadlocals in metastore code can potentially cause memory leak Key: HIVE-10925 URL: https://issues.apache.org/jira/browse/HIVE-10925 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 1.2.0, 1.0.0, 0.14.0, 0.12.0, 0.11.0, 1.1.0, 0.13 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta There are many places where non-static threadlocals are used. I can't seem to find a good reason for using them. However, they can potentially result in leaked objects if, for example, they are created in a long-running thread every time the thread handles a new session. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
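The leak pattern can be sketched in plain Java: a ThreadLocal declared as an instance field creates one slot per object in each thread's thread-local map, so a long-lived worker thread handling many sessions accumulates an entry per session object. A static ThreadLocal keeps it to one slot per thread. The session classes below are illustrative, not actual metastore code.

```java
public class ThreadLocalLeakSketch {
    // Leaky: each new LeakySession adds another entry to the current thread's
    // ThreadLocalMap; the entries can outlive the session unless remove() is
    // called explicitly.
    static class LeakySession {
        private final ThreadLocal<byte[]> scratch =
            ThreadLocal.withInitial(() -> new byte[1024]);
        int use() { return scratch.get().length; }
    }

    // Safe: one shared static ThreadLocal, one slot per thread no matter how
    // many sessions the thread processes.
    static class SafeSession {
        private static final ThreadLocal<byte[]> SCRATCH =
            ThreadLocal.withInitial(() -> new byte[1024]);
        int use() { return SCRATCH.get().length; }
    }

    public static void main(String[] args) {
        // A long-running thread handling many sessions: with LeakySession,
        // each iteration leaves a stale 1KB entry in this thread's map.
        for (int i = 0; i < 3; i++) {
            new LeakySession().use();
        }
        System.out.println(new SafeSession().use()); // prints 1024
    }
}
```

Making the field static, or calling ThreadLocal.remove() when a session ends, avoids the accumulation.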
[jira] [Created] (HIVE-10906) Value based UDAF function throws NPE
Aihua Xu created HIVE-10906: --- Summary: Value based UDAF function throws NPE Key: HIVE-10906 URL: https://issues.apache.org/jira/browse/HIVE-10906 Project: Hive Issue Type: Sub-task Components: PTF-Windowing Reporter: Aihua Xu The following query throws NPE. {noformat} select key, value, min(value) over (partition by key range between unbounded preceding and current row) from small; FAILED: NullPointerException null 2015-06-03 13:48:09,268 ERROR [main]: ql.Driver (SessionState.java:printError(957)) - FAILED: NullPointerException null java.lang.NullPointerException at org.apache.hadoop.hive.ql.parse.WindowingSpec.validateValueBoundary(WindowingSpec.java:293) at org.apache.hadoop.hive.ql.parse.WindowingSpec.validateWindowFrame(WindowingSpec.java:281) at org.apache.hadoop.hive.ql.parse.WindowingSpec.validateAndMakeEffective(WindowingSpec.java:155) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genWindowingPlan(SemanticAnalyzer.java:11965) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8910) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8868) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9713) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9606) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10079) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:327) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10090) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:208) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1124) at 
org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1172) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1061) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1051) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)