Re: Error: "Can not create a Path from an empty string"
Sorry for the delay in responding. I am not trying to run a separate job/query on the intermediate output of another job/query. The error happens in a multi-job MapReduce query: when one step finishes, the next step tries to read the output of the previous job and fails with "Can not create a Path from an empty string".

Thanks,
Viral

On Wed, May 13, 2015 at 10:28 PM, unmesha sreeveni wrote:
>
> On Thu, May 7, 2015 at 9:57 PM, Viral Bajaria wrote:
>
>> Since the output is from an intermediate step, it was already cleaned up
>> and I wasn't able to check.
>
> So that means the path will be empty, right? So the stack trace is correct:
>
> java.lang.IllegalArgumentException: Can not create a Path from an empty string
>
> You are fetching an empty Path.
>
> --
> Thanks & Regards
>
> Unmesha Sreeveni U.B
> Hadoop, Bigdata Developer
> Centre for Cyber Security | Amrita Vishwa Vidyapeetham
> http://www.unmeshasreeveni.blogspot.in/
Hive schema on read
Hi All,

Since Hive is schema-on-read, writing data of a different data type into a column does not throw any error. When we try to read it, it shows NULL if the value is of a different data type. Are there any options to throw an error when the data is of a different data type, either on insert or on read?

Thanks
Giri
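A minimal sketch of the behaviour Giri describes (table and data are hypothetical). Because Hive applies the schema at read time, a value that does not parse as the column type surfaces as NULL rather than raising an error; one common workaround is to load into a STRING staging column and validate with CAST, since CAST also yields NULL for unparseable values:

```sql
-- Hypothetical example: schema on read turns bad values into NULLs.
CREATE TABLE staging (id STRING);
-- Suppose the loaded file contains the rows: 1, abc, 3

-- Find rows that would not survive a conversion to INT:
SELECT id
FROM staging
WHERE id IS NOT NULL
  AND CAST(id AS INT) IS NULL;
-- 'abc' is returned, because casting an unparseable value yields NULL
```

This does not make Hive raise an error, but it lets you detect bad rows explicitly before (or instead of) silently reading NULLs.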
RE: Index Rebuild - DAG fails due to vertex failure
Hi Marc,

Regardless of whether you rebuild an index or not, I have looked into whether indexes are actually used in Hive. As far as I know, indexes are not fully implemented in Hive, and Hive does not use them. See the attached emails.

HTH

Mich Talebzadeh

http://talebzadehmich.wordpress.com

Author of the books "A Practitioner's Guide to Upgrading to Sybase ASE 15", ISBN 978-0-9563693-0-7, and co-author of "Sybase Transact SQL Guidelines Best Practices", ISBN 978-0-9759693-0-4.

From: Marc Seeger [mailto:m...@web-computing.de]
Sent: 15 May 2015 12:53
To: user@hive.apache.org
Subject: Re: Index Rebuild - DAG fails due to vertex failure

Indexing is described in the Hive wiki:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Indexing and
https://cwiki.apache.org/confluence/display/Hive/IndexDev

When I first used it, the index was built: the index table was filled with data and performance increased. But without any configuration change it no longer works, on any table. Below is the output of my EXPLAIN query.
0: jdbc:hive2://localhost:1> EXPLAIN ALTER INDEX ix_key ON DbTest.Tbl_test REBUILD;
+---+--+
| Explain |
+---+--+
| STAGE DEPENDENCIES: |
| Stage-1 is a root stage |
| Stage-2 depends on stages: Stage-1 |
| Stage-0 depends on stages: Stage-2 |
| Stage-3 depends on stages: Stage-0 |
| Stage-4 depends on stages: Stage-1 |
| Stage-5 depends on stages: Stage-1 |
| |
| STAGE PLANS: |
| Stage: Stage-1 |
| Tez |
| Edges: |
| Reducer 2 <- Map 1 (SIMPLE_EDGE) |
| DagName: hive_20150515134040_03126f31-d054-4bb0-9f58-99f72cd8d1ab:94 |
| Vertices: |
| Map 1 |
| Map Operator Tree: |
| TableScan |
| alias: Tbl_test
Re: Repeated Hive start-up issues
Hi Anand,

That depends on the issue. You have to understand the namenode logs.

Sent from really tiny device :)

On Friday, May 15, 2015, Anand Murali wrote:
> Hi:
>
> Many thanks for replying. Can you please tell me how to fix the namenode
> safe mode issue? I am new to Hadoop.
>
> Thanks
>
> Regards
>
> Anand
>
> Sent from my iPhone
>
> On 15-May-2015, at 7:14 pm, Xuefu Zhang wrote:
>
> Your namenode is in safe mode, as the exception shows. You need to
> verify/fix that before trying Hive.
>
> Secondly, "!=" may not work as expected. Try "<>" or another simpler
> query first.
>
> --Xuefu
>
> [quoted original message and stack trace snipped]
Re: Repeated Hive start-up issues
Hi:

Many thanks for replying. Can you please tell me how to fix the namenode safe mode issue? I am new to Hadoop.

Thanks

Regards

Anand

Sent from my iPhone

> On 15-May-2015, at 7:14 pm, Xuefu Zhang wrote:
>
> Your namenode is in safe mode, as the exception shows. You need to
> verify/fix that before trying Hive.
>
> Secondly, "!=" may not work as expected. Try "<>" or another simpler
> query first.
>
> --Xuefu
>
>> On Fri, May 15, 2015 at 6:17 AM, Anand Murali wrote:
>> [quoted original message and stack trace snipped]
Re: Repeated Hive start-up issues
Your namenode is in safe mode, as the exception shows. You need to verify/fix that before trying Hive.

Secondly, "!=" may not work as expected. Try "<>" or another simpler query first.

--Xuefu

On Fri, May 15, 2015 at 6:17 AM, Anand Murali wrote:
> Hi All:
>
> I have installed Hadoop 2.6 and Hive 1.1. The first time I start Hive
> after starting the cluster, I get the following:
>
> [startup log and stack trace snipped; see the original message below]
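Once the namenode has left safe mode, Xuefu's suggestion of trying a simpler query first might look like the following (table and column names here are hypothetical):

```sql
-- '<>' is the standard inequality operator in HiveQL; '!=' has been
-- reported to behave unexpectedly in some older Hive versions.
SELECT *
FROM test_table
WHERE some_col <> 'some_value'
LIMIT 10;
```

If this runs cleanly, the HDFS side is healthy and any remaining problem is in the more complex query itself.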
Repeated Hive start-up issues
Hi All:

I have installed Hadoop 2.6 and Hive 1.1. The first time I start Hive after starting the cluster, I get the following:

$hive

Logging initialized using configuration in jar:file:/home/anand_vihar/hive-1.1.0/lib/hive-common-1.1.0.jar!/hive-log4j.properties
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/anand_vihar/hive-1.1.0/lib/hive-jdbc-1.1.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/anand_vihar/hadoop-2.6.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Exception in thread "main" java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException): Cannot create directory /tmp/hive/anand_vihar/a9d68b70-01b4-4d4d-9d06-1f86efc3b2bc. Name node is in safe mode.
The reported blocks 2 has reached the threshold 0.9990 of total blocks 2. The number of live datanodes 1 has reached the minimum number 0. In safe mode extension. Safe mode will be turned off automatically in 13 seconds.
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1364)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4216)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4191)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:813)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:600)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)

    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:472)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException): Cannot create directory /tmp/hive/anand_vihar/a9d68b70-01b4-4d4d-9d06-1f86efc3b2bc. Name node is in safe mode.
The reported blocks 2 has reached the threshold 0.9990 of total blocks 2. The number of live datanodes 1 has reached the minimum number 0. In safe mode extension. Safe mode will be turned off automatically in 13 seconds.
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1364)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4216)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4191)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:813)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:600)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
    at org.apache.hadoop.ipc.Client.call(Client.java:1468)
    at org.apache.hadoop.ipc.Client.c
Re: Index Rebuild - DAG fails due to vertex failure
Indexing is described in the Hive wiki:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Indexing and
https://cwiki.apache.org/confluence/display/Hive/IndexDev

When I first used it, the index was built: the index table was filled with data and performance increased. But without any configuration change it no longer works, on any table. Below is the output of my EXPLAIN query.

0: jdbc:hive2://localhost:1> EXPLAIN ALTER INDEX ix_key ON DbTest.Tbl_test REBUILD;
+---+--+
| Explain |
+---+--+
| STAGE DEPENDENCIES: |
| Stage-1 is a root stage |
| Stage-2 depends on stages: Stage-1 |
| Stage-0 depends on stages: Stage-2 |
| Stage-3 depends on stages: Stage-0 |
| Stage-4 depends on stages: Stage-1 |
| Stage-5 depends on stages: Stage-1 |
| |
| STAGE PLANS: |
| Stage: Stage-1 |
| Tez |
| Edges: |
| Reducer 2 <- Map 1 (SIMPLE_EDGE) |
| DagName: hive_20150515134040_03126f31-d054-4bb0-9f58-99f72cd8d1ab:94 |
| Vertices: |
| Map 1 |
| Map Operator Tree: |
| TableScan |
| alias: Tbl_test |
| Statistics: Num rows: 7014810 Data size: 4773111850 Basic stats: COMPLETE Column stats: NONE |
| Select Operator |
| expressions: TEST_KEY (type: bigint), INPUT__FILE__NAME (type: string), BLOCK__OFFSET__INSIDE__FILE (type: bigint) |
| outputColumnNames: TEST_KEY, INPUT__FILE__NAME, BLOCK__OFFSET__INSIDE__FILE |
| Statistics: Num rows: 7014810 Data size: 4773111850 Basic stats: COMPLETE Column stats: NONE |
| Group By Operator |
| aggregations: collect_set(BLOCK__OFFSET__INSIDE__FILE) |
| keys: TEST_KEY (type: bigint), INPUT__FILE__NAME (type: string) |
| mode: hash |
| outputColumnNames: _col0, _col1, _col2 |
| Statistics: Num rows: 7014810 Data size: 4773111850 Basic stats: COMPLETE Column stats: NONE |
| Reduce Output Operator |
| key expressions: _col0 (type: bigint), _col1 (type: string) |
| sort order: ++ |
| Map-reduce partition columns: _col0 (type: bigint) |
| Statistics: Num rows: 7014810 Data size: 4773111850 Basic stats: COMPLETE Column stats: NONE |
| value expressions: _col2 (type: array) |
| Reducer 2 |
|
Re: Index Rebuild - DAG fails due to vertex failure
Hi Marc,

As far as I know, indexes do not work in Hive. Have you checked it with EXPLAIN?

Thanks,
Mich

On 15/5/2015, "Marc Seeger" wrote:
> [quoted original message and stack trace snipped]
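Mich's suggestion to check index usage with EXPLAIN might look like the following sketch, reusing the names from Marc's example. If the plan shows a plain TableScan of the base table with no reference to the index table, the index is not being used:

```sql
-- Sketch only: whether the compact index is consulted also depends on
-- configuration, e.g. hive.optimize.index.filter being enabled.
-- The literal 12345 is a hypothetical key value.
EXPLAIN
SELECT * FROM DbTest.Tbl_test
WHERE TEST_KEY = 12345;
```

Comparing this plan before and after `ALTER INDEX ... REBUILD` is a quick way to confirm whether the optimizer ever rewrites the query against the index table.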
Index Rebuild - DAG fails due to vertex failure
Hi,

I'm using Hive 14 on an HDP 2.2 cluster and have a problem with indexing in Hive. I can create an index:

create INDEX ix_key ON TABLE DbTest.Tbl_test(TEST_KEY)
as 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler' WITH DEFERRED REBUILD;

After that I loaded data into the table and built the index:

ALTER INDEX ix_key ON DbTest.Tbl_test REBUILD;

Hive built the index and it worked fine; performance increased. Now I want to rebuild the index, but I always get an error:

INFO : Session is already open
INFO : Tez session was closed. Reopening...
INFO : Session re-established.
INFO :
ERROR : Status: Failed
ERROR : Vertex failed, vertexName=Map 1, vertexId=vertex_1426585957958_2810_1_00, diagnostics=[Vertex vertex_1426585957958_2810_1_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: Tbl_test initializer failed, vertex=vertex_1426585957958_2810_1_00 [Map 1], java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.tez.DynamicPartitionPruner.initialize(DynamicPartitionPruner.java:135)
    at org.apache.hadoop.hive.ql.exec.tez.DynamicPartitionPruner.prune(DynamicPartitionPruner.java:100)
    at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:109)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:245)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:239)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:239)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:226)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
]
ERROR : Vertex killed, vertexName=Reducer 2, vertexId=vertex_1426585957958_2810_1_01, diagnostics=[Vertex received Kill in INITED state., Vertex vertex_1426585957958_2810_1_01 [Reducer 2] killed/failed due to:null]
ERROR : DAG failed due to vertex failure. failedVertices:1 killedVertices:1
Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask (state=08S01,code=2)

The base table exists; I can run queries against it. The index table exists too. If I create a new index on another table and run the rebuild command, I get the same error. I tried the command with both beeline and the CLI, with no effect on the result.

Thanks for your help,
Marc
Re: User matching query does not exist
Thank you Nitin. When the user runs the query via the Hive command line, it succeeds (e.g. select * from railway;). As per the link you provided, when I run ./manage.py clearsessions I get the error.

On Fri, May 15, 2015 at 12:32 PM, Nitin Pawar wrote:
> This is related to Django.
> See this on how to clear sessions from Django:
>
> http://www.opencsw.org/community/questions/289/how-to-clear-the-django-session-cache
>
> On Fri, May 15, 2015 at 12:24 PM, amit kumar wrote:
>
>> Yes, it is happening for Hue only. Can you please suggest how to clean up
>> Hue sessions from the server?
>>
>> The query succeeds in the Hive command line.
>>
>> On Fri, May 15, 2015 at 11:52 AM, Nitin Pawar wrote:
>>
>>> Is this happening for Hue?
>>>
>>> If yes, maybe you can try cleaning up Hue sessions from the server (this
>>> may clear all users' active sessions from Hue, so be careful while doing it).
>>>
>>> On Fri, May 15, 2015 at 11:31 AM, amit kumar wrote:
>>>
>>>> I am using CDH 5.2.1. Any pointers will be of immense help. Thanks.
>>>>
>>>> On Fri, May 15, 2015 at 9:43 AM, amit kumar wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> After re-creating my account in Hue, I receive "User matching query
>>>>> does not exist" when attempting to perform a Hive query.
>>>>>
>>>>> The query succeeds in the Hive command line.
>>>>>
>>>>> Please suggest on this.
>>>>>
>>>>> Thank you,
>>>>> Amit
>>>
>>> --
>>> Nitin Pawar
RE: Partition Columns
Hi Appan,

I think the answer is that the parser is not able to detect that partitions are useful in Query 2, because the where condition is on a derived field. That is, Hive can tell that if you say where some_partition_field = "some partition value" then it only needs to scan that partition, but if you bury the partition columns in a derived expression as in Query 2, it is unable to spot that and so does a full table scan. I think (but don't know for sure) that this will be fairly typical of all SQL engines. Your best bet is to use direct conditions as in Query 1. In this case it may have been better to persist a field containing the whole date and partition on that instead, to make it simpler to pick up a date range along the lines of Query 2.

Thanks,
Martin.

From: Appan Thirumaligai [mailto:appanhiv...@gmail.com]
Sent: 15 May 2015 01:18
To: user@hive.apache.org
Subject: Re: Partition Columns

Mungeol, I did check the number of mappers, and that did not change between the two queries, but when I ran a count(*) query the total execution time dropped significantly for Query 1 versus Query 2. Also, the amount of data the query reads does change when the where clause changes. I still can't explain why one is faster than the other.

Thanks,
Appan

On Thu, May 14, 2015 at 4:46 PM, Mungeol Heo wrote:
> Hi, Appan. You can simply check the amount of data your query reads from the table, or the number of mappers used to run it. From that you can tell whether it is filtering partitions or scanning the whole table. It is a lazy approach, but you can give it a try.
>
> I think Query 1 should work fine, because I use a lot of queries of that kind and they work fine for me.
>
> Thanks,
> mungeol
>
> On Fri, May 15, 2015 at 8:31 AM, Appan Thirumaligai wrote:
>> I agree with you, Viral. I see the same behavior as well. We are on Hive 0.13 for the cluster where I'm testing this.
>> On Thu, May 14, 2015 at 2:16 PM, Viral Bajaria wrote:
>>> Hi Appan,
>>>
>>> In my experience, Query 2 does not use partition pruning because it is not a straight filter and involves functions (i.e. UDFs).
>>>
>>> What version of Hive are you using?
>>>
>>> Thanks,
>>> Viral
>>>
>>> On Thu, May 14, 2015 at 1:48 PM, Appan Thirumaligai wrote:
>>>> Hi,
>>>>
>>>> I have a question on the Hive optimizer. I have a table with partition columns, e.g. Sales partitioned by year, month, day. Assume that I have two years' worth of data in this table. I'm running two queries on it.
>>>>
>>>> Query 1: Select * from Sales where year=2015 and month = 5 and day between 1 and 7
>>>>
>>>> Query 2: Select * from Sales where concat_ws('-', cast(year as string), lpad(cast(month as string),2,'0'), lpad(cast(day as string),2,'0')) between '2015-01-01' and '2015-01-07'
>>>>
>>>> When I ran the Explain command on the above two queries, I get a Filter operation for the second query and no Filter operation for the first.
>>>>
>>>> My question is: do both queries use the partitions, or are they used only in Query 1, with Query 2 scanning all the data?
>>>>
>>>> Thanks for your help.
>>>>
>>>> Thanks,
>>>> Appan
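Martin's suggestion of persisting a single date field can be sketched like this. The table and column names here are hypothetical, chosen only to illustrate the pattern; the point is that the range predicate becomes a direct comparison on the partition column, which Hive's partition pruner can act on.

```sql
-- Partition on a single date string instead of separate year/month/day columns.
CREATE TABLE sales_by_dt (
  item_id BIGINT,
  amount  DOUBLE
)
PARTITIONED BY (dt STRING);

-- The range condition is now a direct predicate on the partition column,
-- so Hive can prune partitions instead of scanning the whole table.
SELECT * FROM sales_by_dt
WHERE dt BETWEEN '2015-01-01' AND '2015-01-07';

-- EXPLAIN EXTENDED shows which partitions a query will actually read,
-- which is a direct way to confirm whether pruning happened.
EXPLAIN EXTENDED
SELECT * FROM sales_by_dt
WHERE dt BETWEEN '2015-01-01' AND '2015-01-07';
```

Comparing the EXPLAIN EXTENDED output of this query with that of a query wrapping `dt` in a function call should make the pruning difference visible directly, rather than inferring it from mapper counts or runtimes.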
Re: user matching query does not exist
This is related to Django. See this on how to clear sessions from Django:

http://www.opencsw.org/community/questions/289/how-to-clear-the-django-session-cache

On Fri, May 15, 2015 at 12:24 PM, amit kumar wrote:
> Yes, it is happening for Hue only. Can you please suggest how to clean up Hue sessions from the server?
>
> The query succeeds in the Hive command line.
>
> On Fri, May 15, 2015 at 11:52 AM, Nitin Pawar wrote:
>> Is this happening for Hue?
>>
>> If yes, you can try cleaning up Hue sessions from the server. (This may clear all users' active sessions from Hue, so be careful while doing it.)
>>
>> On Fri, May 15, 2015 at 11:31 AM, amit kumar wrote:
>>> I am using CDH 5.2.1.
>>>
>>> Any pointers will be of immense help.
>>>
>>> Thanks
>>>
>>> On Fri, May 15, 2015 at 9:43 AM, amit kumar wrote:
>>>> Hi,
>>>>
>>>> After re-creating my account in Hue, I receive "User matching query does not exist" when attempting to perform a Hive query.
>>>>
>>>> The query succeeds in the Hive command line.
>>>>
>>>> Please advise on this.
>>>>
>>>> Thank you,
>>>> Amit
>>
>> --
>> Nitin Pawar

--
Nitin Pawar
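On a CDH install, the clearsessions step Nitin links to would look roughly like the following. The paths below are an assumption based on a typical CDH parcel layout, not taken from this thread; verify them against your own deployment before running, and remember this logs out every active Hue user.

```shell
# Hedged sketch: the Hue home and wrapper-script locations are assumptions
# for a CDH parcel install; adjust them to match your deployment.
cd /opt/cloudera/parcels/CDH/lib/hue

# Run Django's clearsessions management command through Hue's wrapper,
# which removes expired sessions from Hue's backing database.
./build/env/bin/hue clearsessions
```

After clearing sessions, users will need to log in to Hue again, which is often enough to resolve stale "User matching query does not exist" errors caused by a re-created account.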