Re: fails to alter table concatenate
Can you try doing the same after changing the query engine from Tez to MR? I am not sure whether it is a Hive bug or a Tez bug.

On Tue, Jun 30, 2015 at 1:46 PM, patcharee <patcharee.thong...@uni.no> wrote:
> Hi, I am using Hive 0.14. ALTER TABLE ... CONCATENATE fails occasionally (see the exception below). Strangely, it fails intermittently and unpredictably. Is there any suggestion/clue?
>
>     hive> alter table 4dim partition(zone=2,z=15,year=2005,month=4) CONCATENATE;
>
> Status: Failed
> Vertex failed, vertexName=File Merge, vertexId=vertex_1435307579867_0041_1_00, diagnostics=[... ROOT_INPUT_INIT_FAILURE ...] java.lang.NullPointerException at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:265) [...]
> DAG failed due to vertex failure. failedVertices:1 killedVertices:0
> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.DDLTask
>
> BR, Patcharee

--
Nitin Pawar
Re: fails to alter table concatenate
Actually it works on MR, so the problem is on the Tez side. Thanks!

BR, Patcharee

On 30 June 2015 10:23, Nitin Pawar wrote:
> Can you try doing the same after changing the query engine from Tez to MR? I am not sure whether it is a Hive bug or a Tez bug.
> [...]
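For anyone hitting the same NullPointerException, a minimal sketch of the workaround confirmed in this thread: switch the session's execution engine to MapReduce before running the merge, then switch back. hive.execution.engine is the standard Hive setting; the table and partition values are taken from the original post.

    -- Workaround sketch: run the file merge on the MapReduce engine,
    -- where it is reported to succeed, then switch back to Tez.
    SET hive.execution.engine=mr;
    ALTER TABLE 4dim PARTITION (zone=2, z=15, year=2005, month=4) CONCATENATE;
    SET hive.execution.engine=tez;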
fails to alter table concatenate
Hi,

I am using Hive 0.14. ALTER TABLE ... CONCATENATE fails occasionally (see the exception below). Strangely, it fails intermittently and unpredictably. Is there any suggestion/clue?

    hive> alter table 4dim partition(zone=2,z=15,year=2005,month=4) CONCATENATE;

    VERTICES     STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
    File Merge   FAILED     -1          0        0       -1       0       0
    VERTICES: 00/01  [--]  0%  ELAPSED TIME: 1435651968.00 s

Status: Failed
Vertex failed, vertexName=File Merge, vertexId=vertex_1435307579867_0041_1_00, diagnostics=[Vertex vertex_1435307579867_0041_1_00 [File Merge] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: [hdfs://service-10-0.local:8020/apps/hive/warehouse/wrf_tables/4dim/zone=2/z=15/year=2005/month=4] initializer failed, vertex=vertex_1435307579867_0041_1_00 [File Merge], java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:265)
    at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:452)
    at org.apache.tez.mapreduce.hadoop.MRInputHelpers.generateOldSplits(MRInputHelpers.java:441)
    at org.apache.tez.mapreduce.hadoop.MRInputHelpers.generateInputSplitsToMem(MRInputHelpers.java:295)
    at org.apache.tez.mapreduce.common.MRInputAMSplitGenerator.initialize(MRInputAMSplitGenerator.java:124)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:245)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:239)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:239)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:226)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
]
DAG failed due to vertex failure. failedVertices:1 killedVertices:0
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.DDLTask

BR,
Patcharee
Re: ApacheCON EU HBase Track Submissions
Get your submissions in, the deadline is imminent!

On Thu, Jun 25, 2015 at 11:30 AM, Nick Dimiduk <ndimi...@apache.org> wrote:
> Hello developers, users, speakers,
>
> As part of ApacheCon's inaugural Apache: Big Data, I'm hoping to see an "HBase: NoSQL + SQL" track come together. The idea is to showcase the growing ecosystem of applications and tools built on top of and around Apache HBase. To have a track, we need content, and that's where YOU come in. The CFP for ApacheCon closes in one week, on July 1. Get your HBase + Hive talks submitted so we can pull together a full day of great HBase ecosystem talks! Already planning to submit a talk on Hive? Work in HBase and we'll get it promoted as part of the track!
>
> Thanks, Nick
>
> ApacheCon EU, Sept 28 - Oct 2, Corinthia Hotel, Budapest, Hungary (a beautiful venue in an awesome city!)
> http://apachecon.eu/
> CFP link: http://events.linuxfoundation.org/cfp/dashboard
Re: schedule data ingestion to hive table using ftp
Hi,

> So, I want to schedule data ingestion to hive from ftp. I have to schedule a job to check for files that are getting generated and when they get generated, move them to hdfs.

There is no "best" way, unfortunately. The options start with Apache Oozie, the bog-standard solution. Then there's Falcon, which uses Oozie to run things internally but handles it closer to Hive's use-cases. And there's the combination of Azkaban + Gobblin from LinkedIn. For those who prefer Python to Java, there's Luigi from Spotify.

If you're feeling really lazy, you can go through the NFS mount option in HDFS, so that you can use regular cron to curl (sftp) files straight into that NFSv3 mount. That last option is totally tied to Unix cron, so it is not the best at terabyte scale, but it's the one that is easiest to fix when it breaks.

Cheers,
Gopal
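Whichever scheduler lands the files, the final hop into Hive is typically a LOAD DATA statement once a file is on HDFS. A minimal sketch; the table name, partition column, and path below are hypothetical:

    -- Hypothetical staging table and landing path; the scheduler
    -- (Oozie, Falcon, cron, ...) is responsible for getting the file
    -- onto HDFS first. LOAD DATA then moves the file into the
    -- table's warehouse directory.
    LOAD DATA INPATH '/landing/ftp/2015-06-30/events.csv'
    INTO TABLE staging_events PARTITION (dt='2015-06-30');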
Re: Show table in Spark
Please check on the Spark user list; I don't think this is related to Hive.

On Tue, Jun 30, 2015 at 4:42 PM, Vinod Kuamr <vinod.rajan1...@yahoo.com> wrote:
> Hi Folks, can anyone please let me know how to show the content of a DataFrame in Spark? When I use df.show() (where df is the DataFrame) I get the following result: [inline image]. I am using Spark 1.3.1 (Scala) on Windows 8. Thanks in advance, Vinod

--
Nitin Pawar
Re: Can't access file in Distributed Cache in Hive 1.1.0
Hi,

Try "set hive.fetch.task.conversion=minimal;" in the Hive CLI to get an MR job rather than a local fetch task.

hth
Gabriel Balan

On 6/30/2015 5:22 AM, Zsolt Tóth wrote:
> Thank you for your answer. The plans are identical for Hive 1.0.0 and Hive 1.1.0. You're right: Hive 1.1.0 does not start a MapReduce job for the query, while Hive 1.0.0 does. Should I file a JIRA for this issue?

2015-05-07 21:17 GMT+02:00 Jason Dere <jd...@hortonworks.com>:
> Is this on the Hive CLI, or using HiveServer2? Can you run explain select in_file('a', './testfile') from a; on both Hive 1.0.0 and Hive 1.1.0 and see if they look different? One possible thing that might be happening here is that in Hive 1.1.0 this query is being executed without the need for a map/reduce job, in which case the working directory for the query is probably the local working directory from when Hive was invoked. I don't think the distributed cache will be working correctly in this case, because the UDF is not running in a map/reduce task. If a map/reduce job is kicked off for the query and the UDF is running in that m/r task environment, then the distributed cache will likely be working fine. If there is a way to ensure the query with your UDF runs as part of a map/reduce job, this may do the trick. Adding an order-by will do it, but other people on this list may have a better way of making this happen.

On May 7, 2015, at 3:28 AM, Zsolt Tóth <toth.zsolt@gmail.com> wrote:
> Does this error occur for anyone else? It might be a serious issue.

2015-05-05 13:59 GMT+02:00 Zsolt Tóth <toth.zsolt@gmail.com>:
> Hi, I've just upgraded to Hive 1.1.0 and it looks like there is a problem with the distributed cache. I use ADD FILE, then a UDF that wants to read the file. The following syntax works in Hive 1.0.0, but Hive can't find the file in 1.1.0 (testfile exists on HDFS; the built-in UDF in_file is just an example):
>
>     add file hdfs:///tmp/testfile;
>     select in_file('a', './testfile') from a;
>
> However, it works with the local path:
>
>     select in_file('a', '/tmp/462e6854-10f3-4a68-a290-615e6e9d60ff_resources/testfile') from a;
>
> When I try to list the files in the directory ./ in Hive 1.1.0, it lists the cluster's root directory. It looks like the working directory changed in Hive 1.1.0. Is this intended? If so, how can I access the files in the distributed cache added with ADD FILE?
>
> Regards, Zsolt

--
The statements and opinions expressed here are my own and do not necessarily represent those of Oracle Corporation.
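Putting Gabriel's suggestion together with the original repro, a minimal sketch of the session (the setting and statements are exactly the ones mentioned in this thread):

    -- Force the query into a real MR task so the UDF runs where the
    -- distributed cache is populated, instead of in a local fetch task.
    SET hive.fetch.task.conversion=minimal;
    ADD FILE hdfs:///tmp/testfile;
    SELECT in_file('a', './testfile') FROM a;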
Re: fails to alter table concatenate
Moving to user@hive. BCC'ed user@tez.

— Hitesh

On Jun 30, 2015, at 1:44 AM, patcharee <patcharee.thong...@uni.no> wrote:
> Hi, I am using Hive 0.14 + Tez 0.5. ALTER TABLE ... CONCATENATE fails occasionally (see the exception below). Strangely, it fails intermittently and unpredictably. However, it works on MR. Is there any suggestion/clue?
>
>     hive> alter table 4dim partition(zone=2,z=15,year=2005,month=4) CONCATENATE;
>
> Status: Failed
> Vertex failed, vertexName=File Merge, vertexId=vertex_1435307579867_0041_1_00, diagnostics=[Vertex vertex_1435307579867_0041_1_00 [File Merge] killed/failed due to:ROOT_INPUT_INIT_FAILURE ...] java.lang.NullPointerException at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:265) [...]
> DAG failed due to vertex failure. failedVertices:1 killedVertices:0
> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.DDLTask
>
> BR, Patcharee
Re: Hive indexing optimization
The index doesn't seem to be kicking in in this case. Please file a bug for this.

Thanks,
John

From: Bennie Leo <tben...@hotmail.com>
Date: Monday, June 29, 2015 at 5:25 PM
Subject: RE: Hive indexing optimization

> I've attached the output. Thanks. B

From: <jpullokka...@hortonworks.com>
Date: Mon, 29 Jun 2015 19:17:44 +0000
Subject: Re: Hive indexing optimization

> Could you post explain extended output?

From: Bennie Leo <tben...@hotmail.com>
Date: Monday, June 29, 2015 at 10:35 AM
Subject: RE: Hive indexing optimization

> Here is the explain output:
>
>     STAGE PLANS:
>       Stage: Stage-1
>         Tez
>           Edges:
>             Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 3 (SIMPLE_EDGE)
>           Vertices:
>             Map 1
>               Map Operator Tree:
>                   TableScan
>                     alias: logontable
>                     filterExpr: isipv4(ip) (type: boolean)
>                     Statistics: Num rows: 0 Data size: 550 Basic stats: PARTIAL Column stats: NONE
>                     Filter Operator
>                       predicate: isipv4(ip) (type: boolean)
>                       Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
>                       Select Operator
>                         expressions: ip (type: bigint)
>                         outputColumnNames: _col0
>                         Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
>                         Reduce Output Operator
>                           sort order:
>                           Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
>                           value expressions: _col0 (type: bigint)
>             Map 3
>               Map Operator Tree:
>                   TableScan
>                     alias: ipv4geotable
>                     Statistics: Num rows: 41641243 Data size: 5144651200 Basic stats: COMPLETE Column stats: NONE
>                     Select Operator
>                       expressions: startip (type: bigint), endip (type: bigint), country (type: string)
>                       outputColumnNames: _col0, _col1, _col2
>                       Statistics: Num rows: 41641243 Data size: 5144651200 Basic stats: COMPLETE Column stats: NONE
>                       Reduce Output Operator
>                         sort order:
>                         Statistics: Num rows: 41641243 Data size: 5144651200 Basic stats: COMPLETE Column stats: NONE
>                         value expressions: _col0 (type: bigint), _col1 (type: bigint), _col2 (type: string)
>             Reducer 2
>               Reduce Operator Tree:
>                 Join Operator
>                   condition map:
>                        Left Outer Join 0 to 1
>                   condition expressions:
>                     0 {VALUE._col0}
>                     1 {VALUE._col0} {VALUE._col1} {VALUE._col2}
>                   filter predicates:
>                     0 {isipv4(VALUE._col0)}
>                     1
>                   outputColumnNames: _col0, _col1, _col2, _col3
>                   Statistics: Num rows: 43281312 Data size: 5020632576 Basic stats: COMPLETE Column stats: NONE
>                   Filter Operator
>                     predicate: ((_col1 <= _col0) and (_col0 <= _col2)) (type: boolean)
>                     Statistics: Num rows: 5209034 Data size: 497855986 Basic stats: COMPLETE Column stats: NONE
>                     Select Operator
>                       expressions: _col0 (type: bigint), _col3 (type: string)
>                       outputColumnNames: _col0, _col1
>                       Statistics: Num rows: 5209034 Data size: 497855986 Basic stats: COMPLETE Column stats: NONE
>                       File Output Operator
>                         compressed: false
>                         Statistics: Num rows: 5209034 Data size: 497855986 Basic stats: COMPLETE Column stats: NONE
>                         table:
>                             input format: org.apache.hadoop.mapred.TextInputFormat
>                             output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                             serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>       Stage: Stage-0
>         Fetch Operator
>           limit: -1
>
> Thank you, B

From: <jpullokka...@hortonworks.com>
Date: Sat, 27 Jun 2015 16:02:08 +
Subject: Re: Hive indexing optimization

> "SELECT StartIp, EndIp, Country FROM ipv4geotable" should have been rewritten as a scan
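For context, a Hive index over the range-join column would be declared roughly as below. This is a hedged sketch, not the poster's actual DDL: the index name is hypothetical, and only the table and column names are taken from the plan above.

    -- Hypothetical index DDL for the geo-lookup table; 'COMPACT' is
    -- one of Hive's built-in index handlers. The index must be
    -- rebuilt before the optimizer can use it.
    CREATE INDEX ipv4geotable_startip_idx
    ON TABLE ipv4geotable (startip)
    AS 'COMPACT' WITH DEFERRED REBUILD;

    ALTER INDEX ipv4geotable_startip_idx ON ipv4geotable REBUILD;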
alter table on multiple partitions
Hi, I have a table partitioned by the columns a, b, c, and d, and I want to run ALTER TABLE ... CONCATENATE on it. Is it possible to use a wildcard in the ALTER command to concatenate several partitions at a time? For example:

    alter table TestHive partition (a=1, b=*, c=2, d=*) CONCATENATE;

BR, Patcharee
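If wildcards are not accepted, one fallback is to enumerate the matching partitions and issue a fully specified CONCATENATE per partition. A minimal sketch; the partition values below are illustrative only:

    -- Fallback sketch: list the partitions, then address each one
    -- with a full spec (values here are illustrative).
    SHOW PARTITIONS TestHive;
    ALTER TABLE TestHive PARTITION (a=1, b=1, c=2, d=1) CONCATENATE;
    ALTER TABLE TestHive PARTITION (a=1, b=2, c=2, d=1) CONCATENATE;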
schedule data ingestion to hive table using ftp
Maybe this is not exactly a question for the Hive user group, but I do not know of a better place. I want to schedule data ingestion into Hive from FTP: a job that checks for files as they are generated and, when they appear, moves them to HDFS. Can anyone suggest the best way to do this?

--
Thanking You,
Ayazur Rehman
Show table in Spark
Hi Folks,

Can anyone please let me know how to show the content of a DataFrame in Spark? When I use df.show() (where df is the DataFrame) I get the following result. I am using Spark 1.3.1 (Scala) on Windows 8.

Thanks in advance,
Vinod