Re: What to choose, hive 0.11 (marked as stable release) or hive 0.12
Any thoughts regarding which is more stable, Hive 0.11 or Hive 0.12?

On Thu, Feb 27, 2014 at 12:42 PM, twinkle sachdeva twinkle.sachd...@gmail.com wrote:

Hi, I am planning to use Hive for my use case, but I am confused between Hive 0.12 and Hive 0.11. Hive 0.12 has been out for a reasonable amount of time, while Hive 0.11 has been marked as the stable release. Are there any known critical issues in Hive 0.12 that have kept it from being marked as stable, or is it some release policy that determines when a release gets marked stable? Please provide some inputs.

Thanks and Regards,
Twinkle
Re: What to choose, hive 0.11 (marked as stable release) or hive 0.12
Here is a mail from Edward on a different thread: "All 'stable' really is is a symlink. Hive is heavily unit and integration tested, and a release is also not made until after some manual testing. Releases have historically been very stable."

0.12 has been out for some time. You can use 0.12; it has more features compared to 0.11, and I have not seen anyone complaining much about that release yet.

Thanks,
Nitin

On Thu, Feb 27, 2014 at 4:24 PM, twinkle sachdeva twinkle.sachd...@gmail.com wrote:

Any thoughts regarding which is more stable, Hive 0.11 or Hive 0.12?

--
Nitin Pawar
Hive query parser bug resulting in FAILED: NullPointerException null
Hi all, we've experienced a bug which seems to be caused by a query constraint involving partitioned columns. The following query results in "FAILED: NullPointerException null" being returned nearly instantly:

EXPLAIN SELECT col1 FROM tbl1 WHERE (part_col1 = 2014 AND part_col2 = 2) OR part_col1 > 2014;

The exception doesn't happen if any of the conditions are removed. The table is defined like the following:

CREATE TABLE tbl1 (
  col1 STRING,
  ...
  col12 STRING
)
PARTITIONED BY (part_col1 INT, part_col2 TINYINT, part_col3 TINYINT)
STORED AS SEQUENCEFILE;

Unfortunately I cannot construct a test case to replicate this. Seeing as it appears to be a query parser bug, I thought the following would replicate it:

CREATE TABLE tbl2 LIKE tbl1;
EXPLAIN SELECT col1 FROM tbl2 WHERE (part_col1 = 2014 AND part_col2 = 2) OR part_col1 > 2014;

But it does not. Could it somehow be data specific? Does the query parser use partition information? Are there any logs I could look at to investigate this further? Or is this a known bug? We're using hive 0.10.0-cdh4.4.0.

Cheers,
Krishna
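Since the empty CREATE TABLE ... LIKE copy did not reproduce it, one way to test the "data specific" theory is to give the copy real partition metadata before running EXPLAIN: the planner's partition pruner does consult the metastore's partition list, so an empty table may take a different code path. A sketch, with made-up partition values (they are illustrative, not from the original report):

```sql
-- Hypothetical repro attempt: register partitions on the copy (metadata
-- only, no data needed) and re-run the failing EXPLAIN.
CREATE TABLE tbl2 LIKE tbl1;
ALTER TABLE tbl2 ADD PARTITION (part_col1=2014, part_col2=2, part_col3=1);
ALTER TABLE tbl2 ADD PARTITION (part_col1=2013, part_col2=12, part_col3=1);

EXPLAIN SELECT col1 FROM tbl2
WHERE (part_col1 = 2014 AND part_col2 = 2) OR part_col1 > 2014;
```

If the NPE only appears once partitions exist, that would point at partition pruning rather than the parser proper.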
Re: Metastore performance on HDFS-backed table with 15000+ partitions
Thanks everyone for the feedback. Just to follow up in case someone else runs into this: I can confirm that the local client works around the OOMEs, but it's still very slow. It does seem like we were hitting some combination of HIVE-4051 and HIVE-5158. We'll try reducing partition count first, and then switch to 0.12.0 if that doesn't improve things significantly. FWIW, http://www.slideshare.net/oom65/optimize-hivequeriespptx also has some good rules of thumb.

Norbert

On Sat, Feb 22, 2014 at 1:27 PM, Stephen Sprague sprag...@gmail.com wrote:

Yeah, that traceback pretty much spells it out: it's metastore related, and that's where the partitions are stored. I'm with the others on this. HiveServer2 is still a little janky on memory management. I bounce mine once a day at midnight just to play it safe (and because I can). Again, for me, I use the Hive local client for production jobs and the remote client for ad hoc stuff. You may wish to confirm the local Hive client has no problem with your query. Other than that, you can either increase the heap size on the HS2 process and hope for the best, and/or file a bug report. Bottom line: HiveServer2 isn't production bullet-proof just yet, IMHO. Others may disagree.

Regards,
Stephen.

On Sat, Feb 22, 2014 at 9:50 AM, Norbert Burger norbert.bur...@gmail.com wrote:

Thanks all for the quick feedback. I'm a bit surprised to learn 15k is considered too much, but we can work around it. I guess I'm also curious why the query planner needs to know about all partitions even in the case of simple select/limit queries, where the query might target only a single partition. Here's the client-side OOME with HADOOP_HEAPSIZE=2048: https://gist.githubusercontent.com/nburger/3286d2052060e2efe161/raw/dc30231086803c1d33b9137b5844d2d0e20e350d/gistfile1.txt This was from a CDH4.3.0 client hitting HiveServer2. Any idea what's consuming the heap?
Norbert

On Sat, Feb 22, 2014 at 10:32 AM, Edward Capriolo edlinuxg...@gmail.com wrote:

Don't make tables with that many partitions. It is an anti-pattern. I have tables with 2000 partitions a day and that is really too many. Hive needs to load that information into memory to plan the query.

On Saturday, February 22, 2014, Terje Marthinussen tmarthinus...@gmail.com wrote:

The query optimizer in Hive is awful on memory consumption. 15k partitions sounds a bit early for it to fail though. What is your heap size?

Regards,
Terje

On 22 Feb 2014, at 12:05, Norbert Burger norbert.bur...@gmail.com wrote:

Hi folks, we are running CDH 4.3.0 Hive (0.10.0+121) with a MySQL metastore. In Hive, we have an external table backed by HDFS which has a 3-level partitioning scheme that currently has 15000+ partitions. Within the last day or so, queries against this table have started failing. A simple query which shouldn't take very long at all (select * from ... limit 10) fails after several minutes with a client OOME. I get the same outcome on count(*) queries (which I thought wouldn't send any data back to the client). Increasing heap on both client and server JVMs (via HADOOP_HEAPSIZE) doesn't have any impact. We were only able to work around the client OOMEs by reducing the number of partitions in the table. Looking at the MySQL query log, my thought is that the Hive client is quite busy making requests for partitions that don't contribute to the query. Has anyone else had a similar experience with tables this size?

Thanks,
Norbert

--
Sorry this was sent from mobile. Will do less grammar and spell check than usual.
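One practical mitigation while the partition count stays high, sketched below with illustrative table and column names (they are assumptions, not from the thread): constrain every partition key explicitly so the pruner can narrow the metastore fetch, rather than issuing bare select/limit queries, which in these versions cause the planner to pull metadata for all 15k partitions.

```sql
-- Hedged sketch: always filter on all partition keys of the
-- 3-level scheme (names here are made up for illustration).
SELECT *
FROM events
WHERE dt  = '2014-02-22'   -- level-1 partition key
  AND hr  = '10'           -- level-2 partition key
  AND src = 'web'          -- level-3 partition key
LIMIT 10;

-- Periodically check how many partitions are registered:
SHOW PARTITIONS events;
```

This doesn't fix the underlying HIVE-4051/HIVE-5158 behavior, but it keeps individual queries from triggering a full partition-metadata scan.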
Log Progress of Queries
Hi all, I was using hive-0.11 and I used to get the query status from log files. But I upgraded from 0.11.0 to 0.12.0 and, even though it's configured, Hive is no longer generating the logs with the progress of the queries. Has query status logging been disabled, or have I misconfigured Hive? These are my configs:

<property>
  <name>hive.querylog.location</name>
  <value>/tmp/${user.name}</value>
  <description>Location of Hive run time structured log file</description>
</property>
<property>
  <name>hive.querylog.enable.plan.progress</name>
  <value>true</value>
  <description>
    Whether to log the plan's progress every time a job's progress is checked.
    These logs are written to the location specified by hive.querylog.location
  </description>
</property>

This is the logging I used to get:

Counters plan={queryId:xxx_20131213115858_3699e7ff-8ff5-4dd7-91df-983b0588682b,queryType:null,queryAttributes:{queryString: insert overwrite table q7_volume_shipping_tmp select * from ( select n1.n_name as supp_nation, n2.n_name as cust_nation, n1.n_nationkey as s_nationkey, n2.n_nationkey as c_nationkey from nation n1 join nation n2 on n1.n_name = 'FRANCE' and n2.n_name = 'GERMANY' UNION ALL select n1.n_name as supp_nation, n2.n_name as cust_nation, n1.n_nationkey as s_nationkey, n2.n_nationkey as c_nationkey from nation n1 join nation n2 on n2.n_name = 'FRANCE' and n1.n_name = 'GERMANY' ) a},queryCounters:null,stageGraph:{nodeType:STAGE,roots:null,adjacencyList:[{node:Stage-1,children:[Stage-2],adjacencyType:CONJUNCTIVE},{node:Stage-10,children:[Stage-2],adjacencyType:CONJUNCTIVE},{node:Stage-2,children:[Stage-8],adjacencyType:CONJUNCTIVE},{node:Stage-2,children:[Stage-8],adjacencyType:CONJUNCTIVE},{node:Stage-8,children:[Stage-5,Stage-4,Stage-6],adjacencyType:DISJUNCTIVE},{node:Stage-8,children:[Stage-5,Stage-4,Stage-6],adjacencyType:DISJUNCTIVE},{node:Stage-5,children:[Stage-0],adjacencyType:CONJUNCTIVE},{node:Stage-4,children:[Stage-

Thanks in advance,
Edson Ramiro
RE: Metastore performance on HDFS-backed table with 15000+ partitions
That is good to know. We are using Hive 0.9. Right now the biggest table contains 2 years of data, and we partitioned by hour, as the data volume is big. So right now it has 2*365*24, around 17000+, partitions. So far we haven't seen too many problems yet, but I do have some concerns about it. We are using IBM BigInsights, which uses Derby as the Hive metastore, not MySQL, which is what most of my experience is with.

Yong
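For a table like Yong's (hourly partitions accumulating roughly 8,760 per year), one hedged way to cap metastore growth is to partition by day and keep the hour as an ordinary column, so queries can still filter on it while the metastore tracks 24x fewer partitions. A sketch with illustrative schema and names (assumptions, not from the thread):

```sql
-- Illustrative schema: daily partitions instead of hourly.
CREATE TABLE events_daily (
  event_time STRING,
  hr         TINYINT,    -- hour kept as a plain column
  payload    STRING
)
PARTITIONED BY (dt STRING)   -- e.g. '2014-02-27'
STORED AS SEQUENCEFILE;

-- Queries prune on dt at plan time and filter hr at run time:
SELECT payload FROM events_daily
WHERE dt = '2014-02-27' AND hr BETWEEN 9 AND 17;
```

Two years of data then costs roughly 730 partitions instead of 17,000+, at the price of scanning a full day's files when only one hour is wanted.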
RE: Hive query parser bug resulting in FAILED: NullPointerException null
Can you reproduce it with an empty table? I can't reproduce it. Also, can you paste the stack trace?

Yong
Re: java.lang.RuntimeException: cannot find field key from [0:_col0, 1:_col2, 2:_col3]
Hi, I wrote a similar simple UDTF and a new table. This simple UDTF does work on Hive 0.10, but my original one doesn't. I still don't understand why. Does the fact that the original query works with the setting 'set hive.optimize.ppd=true' give any clue? Please let me know.

On Tuesday, February 25, 2014 3:28 PM, java8964 java8...@hotmail.com wrote:

Works for me on 0.10.

Yong

Date: Tue, 25 Feb 2014 11:37:32 -0800
From: kumarbuyonl...@yahoo.com
Subject: Re: java.lang.RuntimeException: cannot find field key from [0:_col0, 1:_col2, 2:_col3]
To: user@hive.apache.org

Hi, thanks for looking into it. I am also trying this on Hive 0.11 to see if it works there. If you get a chance to reproduce this problem on Hive 0.10, please let me know. Thanks.

On Monday, February 24, 2014 10:59 PM, java8964 java8...@hotmail.com wrote:

My guess is that your UDTF returns an array of structs. I don't have Hive 0.10 handy right now, but I wrote a simple UDTF returning an array of structs to test on the Hive 0.12 release.

hive> desc test;
OK
id      int     None
name    string  None
Time taken: 0.074 seconds, Fetched: 2 row(s)
hive> select * from test;
OK
1       Apples,Bananas,Carrots
Time taken: 0.08 seconds, Fetched: 1 row(s)

The pair UDTF expands Apples,Bananas,Carrots into the pairs (Apples, Bananas), (Apples, Carrots), (Bananas, Carrots), i.e. an array of 2-element structs.
hive> select id, name, m1, m2 from test lateral view pair(name) p as m1, m2 where m1 is not null;
OK
1       Apples,Bananas,Carrots  Apples   Bananas
1       Apples,Bananas,Carrots  Apples   Carrots
1       Apples,Bananas,Carrots  Bananas  Carrots
Time taken: 7.683 seconds, Fetched: 3 row(s)
hive> select id, name, m1, m2 from test lateral view pair(name) p as m1, m2 where m1 = 'Apples';
OK
1       Apples,Bananas,Carrots  Apples   Bananas
1       Apples,Bananas,Carrots  Apples   Carrots
Time taken: 7.726 seconds, Fetched: 2 row(s)
hive> set hive.optimize.ppd=true;
hive> select id, name, m1, m2 from test lateral view pair(name) p as m1, m2 where m1 is not null;
Total MapReduce jobs = 1
OK
1       Apples,Bananas,Carrots  Apples   Bananas
1       Apples,Bananas,Carrots  Apples   Carrots
1       Apples,Bananas,Carrots  Bananas  Carrots
Time taken: 7.716 seconds, Fetched: 3 row(s)

I cannot reproduce your error in Hive 0.12, as you can see. I can test on Hive 0.10 tomorrow when I have time, but can you test your case in Hive 0.12, or review your UDTF again?

Yong

Date: Mon, 24 Feb 2014 07:09:44 -0800
From: kumarbuyonl...@yahoo.com
Subject: Re: java.lang.RuntimeException: cannot find field key from [0:_col0, 1:_col2, 2:_col3]
To: user@hive.apache.org; kumarbuyonl...@yahoo.com

As suggested, I changed the query like this:

select x.f1, x.f2, x.f3, x.f4 from (select e.f1 as f1, e.f2 as f2, e.f3 as f3, e.f4 as f4 from mytable LATERAL VIEW myfunc(p1,p2,p3,p4) e as f1,f2,f3,f4 where lang=123) x where x.f3 is not null;

And it still doesn't work. I am getting the same error. If anyone has any ideas, please let me know. Thanks.

On Friday, February 21, 2014 11:27 AM, Kumar V kumarbuyonl...@yahoo.com wrote:

Line 316 in my UDTF where it shows the error is the line where I call forward().
The whole trace is:

Caused by: java.lang.RuntimeException: cannot find field key from [0:_col0, 1:_col2, 2:_col6, 3:_col7, 4:_col8, 5:_col9]
	at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:346)
	at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:143)
	at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:57)
	at org.apache.hadoop.hive.ql.exec.ExprNodeFieldEvaluator.initialize(ExprNodeFieldEvaluator.java:55)
	at org.apache.hadoop.hive.ql.exec.ExprNodeFieldEvaluator.initialize(ExprNodeFieldEvaluator.java:55)
	at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:128)
	at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:128)
	at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:128)
	at org.apache.hadoop.hive.ql.exec.FilterOperator.processOp(FilterOperator.java:85)
	at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
	at org.apache.hadoop.hive.ql.exec.LateralViewJoinOperator.processOp(LateralViewJoinOperator.java:133)
	at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
	at org.apache.hadoop.hive.ql.exec.UDTFOperator.forwardUDTFOutput(UDTFOperator.java:112)
	at org.apache.hadoop.hive.ql.udf.generic.UDTFCollector.collect(UDTFCollector.java:44)
	at
Re: ORC 'BETWEEN' Error
Hi Martin, this is a known issue and it's fixed in Hive trunk. It should be available in the 0.13 release. https://issues.apache.org/jira/browse/HIVE-5601

Thanks
Prasanth Jayachandran

On Feb 26, 2014, at 8:55 AM, Martin, Nick nimar...@pssd.com wrote:

Hi all, (running Hive 0.12.0) I have two tables and both are stored as ORC. I attempted to insert from one into the other via a select, using BETWEEN in my where clause to narrow down some dates. Something like so:

Insert into tbl1 select col1, col2 from tbl2 where col1 between 2 and 4

I kept hitting the error pasted below. So I switched to a different approach to see if it would work:

Insert into tbl1 select col1, col2 from tbl2 where col1 >= 2 and col1 <= 4

Hit the same error. When I just use "where col1 = 2" in the where clause the insert runs fine. Is this expected?

2014-02-26 11:22:53,755 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2014-02-26 11:22:53,782 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2014-02-26 11:22:53,902 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2014-02-26 11:22:53,930 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSinkAdapter: Sink ganglia started
2014-02-26 11:22:53,975 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2014-02-26 11:22:53,975 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system started
2014-02-26 11:22:53,985 INFO [main] org.apache.hadoop.mapred.YarnChild: Executing with tokens:
2014-02-26 11:22:53,985 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: mapreduce.job, Service: job_1392147432508_1108, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@249c2715)
2014-02-26 11:22:54,057 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now.
2014-02-26 11:22:54,352 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2014-02-26 11:22:54,363 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2014-02-26 11:22:54,409 INFO [main] org.apache.hadoop.mapred.YarnChild: mapreduce.cluster.local.dir for child: /hdfs/01/hadoop/yarn/local/usercache/myusername/appcache/application_1392147432508_1108,/hdfs/02/hadoop/yarn/local/usercache/myusername/appcache/application_1392147432508_1108,/hdfs/03/hadoop/yarn/local/usercache/myusername/appcache/application_1392147432508_1108,/hdfs/04/hadoop/yarn/local/usercache/myusername/appcache/application_1392147432508_1108,/hdfs/05/hadoop/yarn/local/usercache/myusername/appcache/application_1392147432508_1108,/hdfs/06/hadoop/yarn/local/usercache/myusername/appcache/application_1392147432508_1108,/hdfs/07/hadoop/yarn/local/usercache/myusername/appcache/application_1392147432508_1108,/hdfs/08/hadoop/yarn/local/usercache/myusername/appcache/application_1392147432508_1108,/hdfs/09/hadoop/yarn/local/usercache/myusername/appcache/application_1392147432508_1108,/hdfs/10/hadoop/yarn/local/usercache/myusername/appcache/application_1392147432508_1108,/hdfs/11/hadoop/yarn/local/usercache/myusername/appcache/application_1392147432508_1108,/hdfs/12/hadoop/yarn/local/usercache/myusername/appcache/application_1392147432508_1108
2014-02-26 11:22:54,481 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2014-02-26 11:22:54,486 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2014-02-26 11:22:54,542 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
2014-02-26 11:22:54,542 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.task.is.map is deprecated. Instead, use mapreduce.task.ismap
2014-02-26 11:22:54,543 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.local.dir is deprecated. Instead, use mapreduce.cluster.local.dir
2014-02-26 11:22:54,543 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.cache.localFiles is deprecated. Instead, use mapreduce.job.cache.local.files
2014-02-26 11:22:54,543 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.job.id is deprecated. Instead, use mapreduce.job.id
2014-02-26 11:22:54,544 INFO [main] org.apache.hadoop.conf.Configuration.deprecation:
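If upgrading to 0.13 isn't an option yet, one possible interim workaround, assuming (on my part, not stated in the thread) that the failure is in ORC's predicate pushdown path rather than the query itself, is to stop pushing the filter into the ORC reader for the affected statement, provided hive.optimize.index.filter is enabled in your configuration:

```sql
-- Hedged workaround sketch: evaluate the BETWEEN predicate in the
-- FilterOperator instead of pushing it into the ORC record reader.
SET hive.optimize.index.filter=false;

INSERT INTO TABLE tbl1
SELECT col1, col2 FROM tbl2
WHERE col1 BETWEEN 2 AND 4;
```

This trades the I/O savings of ORC row-group elimination for a plan that avoids the code path fixed by HIVE-5601.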
move hive tables from one cluster to another cluster
Dear experts, I want to move my Hive tables from one cluster to another cluster. How can I do it?

Thanks
Soniya
Re: move hive tables from one cluster to another cluster
1. You can use distcp to copy the files to the new cluster.
2. Rebuild the metadata on the new cluster.

On Thu, Feb 27, 2014 at 8:07 PM, soniya B soniya.bigd...@gmail.com wrote:

Dear experts, I want to move my Hive tables from one cluster to another cluster. How can I do it?

Thanks
Soniya
Re: move hive tables from one cluster to another cluster
Hi, I have moved the warehouse files to the other cluster, but I still don't see the tables on that cluster. How do I rebuild the metadata?

Thanks
Soniya

On Fri, Feb 28, 2014 at 9:26 AM, Krishnan K kkrishna...@gmail.com wrote:

1. You can use distcp to copy the files to the new cluster.
2. Rebuild the metadata on the new cluster.

On Thu, Feb 27, 2014 at 8:07 PM, soniya B soniya.bigd...@gmail.com wrote:

Dear experts, I want to move my Hive tables from one cluster to another cluster. How can I do it?

Thanks
Soniya
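Copying the warehouse files with distcp moves only data; the table definitions must be recreated in the destination cluster's metastore before the tables become visible. A hedged sketch of the "rebuild metadata" step, with an illustrative database, table, and schema (assumptions, not from the thread):

```sql
-- On the source cluster, capture each table's DDL
-- (SHOW CREATE TABLE is available from Hive 0.10 onward):
SHOW CREATE TABLE mydb.mytable;

-- On the destination cluster, replay that DDL over the copied files:
CREATE TABLE mydb.mytable (
  id   INT,
  name STRING
)
PARTITIONED BY (dt STRING)
STORED AS SEQUENCEFILE;

-- For partitioned tables, register the copied partition directories:
MSCK REPAIR TABLE mydb.mytable;  -- scans HDFS and adds missing partitions
```

On versions older than 0.10, DESCRIBE FORMATTED can be used to reconstruct the DDL by hand; an alternative to MSCK REPAIR is issuing an ALTER TABLE ... ADD PARTITION per directory.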