Re: Alter table is giving error

2012-11-05 Thread Dean Wampler
The RECOVER PARTITIONS statement is an enhancement added by Amazon to their
version of Hive.

http://docs.amazonwebservices.com/ElasticMapReduce/latest/DeveloperGuide/emr-hive-additional-features.html

<shameless-plug>
  Chapter 21 of Programming Hive discusses this feature and other aspects
of using Hive in EMR.
</shameless-plug>

dean

On Mon, Nov 5, 2012 at 5:34 AM, Chunky Gupta chunky.gu...@vizury.com wrote:

 Hi,

 I have a cluster set up on EC2 with Hadoop version 0.20.2 and Hive
 version 0.8.1 (I configured everything). I have created a table using:

 CREATE EXTERNAL TABLE XXX ( YYY )PARTITIONED BY ( ZZZ )ROW FORMAT
 DELIMITED FIELDS TERMINATED BY 'WWW' LOCATION 's3://my-location/data/';

 Now I am trying to recover partitions using:

 ALTER TABLE XXX RECOVER PARTITIONS;

 but I am getting this error: FAILED: Parse Error: line 1:12 cannot
 recognize input near 'XXX' 'RECOVER' 'PARTITIONS' in alter table statement

 Doing the same steps on a cluster set up on EMR with Hadoop version 1.0.3
 and Hive version 0.8.1 (configured by EMR) works fine.

 So is this a version issue, or am I missing some configuration changes in
 the EC2 setup?
 I am not able to find an exact solution for this problem on the internet.
 Please help me.

 Thanks,
 Chunky.






-- 
*Dean Wampler, Ph.D.*
thinkbiganalytics.com
+1-312-339-1330


Re: Alter table is giving error

2012-11-05 Thread Chunky Gupta
Hi Dean,

Actually, I had a Hadoop and Hive cluster on EMR, with S3 storage containing
logs that update daily, partitioned by date (dt), and I was using this
recover-partitions feature.
Now I want to shift to EC2 and run my own Hadoop and Hive cluster. So,
what is the alternative to RECOVER PARTITIONS in this case, if you have
any idea?
I found one way: adding partitions for all dates individually, so I would
have to write a script that does so for every date. Is there any easier way
than this?

Thanks,
Chunky







Re: Alter table is giving error

2012-11-05 Thread Dean Wampler
Writing a script to add the external partitions individually is the only way I 
know of. 

Sent from my rotary phone. 
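
A minimal sketch of what such a script would emit, reusing the table and S3
layout from earlier in the thread (one statement per date; the dt=YYYY-MM-DD
directory naming is an assumption, not something stated in the thread):

-- Generated by a small wrapper script, one statement per day:
ALTER TABLE XXX ADD PARTITION (dt='2012-11-01')
LOCATION 's3://my-location/data/dt=2012-11-01';
ALTER TABLE XXX ADD PARTITION (dt='2012-11-02')
LOCATION 's3://my-location/data/dt=2012-11-02';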




ClassNotFoundException when use hive java client of hive + hbase integration

2012-11-05 Thread Cheng Su
Hi, all. I have a Hive + HBase integration cluster.

When I try to execute a query through the Java client of Hive, sometimes
a ClassNotFoundException happens.

My Java code:

final Connection conn = DriverManager.getConnection(URL);
final Statement stmt = conn.createStatement();
final ResultSet rs = stmt.executeQuery("SELECT count(*) FROM " +
    "test_table WHERE (source = '0' AND ur_createtime BETWEEN " +
    "'2012103100' AND '20121031235959')");

I can execute the SQL (SELECT count(*) FROM test_table WHERE (source =
'0' AND ur_createtime BETWEEN '2012103100' AND '20121031235959'))
in Hive CLI mode and get the query result, so there is no error in my
SQL.

The client side exception:

Caused by: java.sql.SQLException: Query returned non-zero code: 9,
cause: FAILED: Execution Error, return code 2 from
org.apache.hadoop.hive.ql.exec.MapRedTask
at org.apache.hadoop.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:189)
... 23 more

The server side exception(hadoop-jobtracker):

2012-11-05 18:55:39,443 INFO org.apache.hadoop.mapred.TaskInProgress:
Error from attempt_201210301133_0112_m_00_3: java.io.IOException:
Cannot create an instance of InputSplit class =
org.apache.hadoop.hive.hbase.HBaseSplit:org.apache.hadoop.hive.hbase.HBaseSplit
at org.apache.hadoop.hive.ql.io.HiveInputFormat$HiveInputSplit.readFields(HiveInputFormat.java:146)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:396)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:412)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Unknown Source)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.ClassNotFoundException:
org.apache.hadoop.hive.hbase.HBaseSplit
at java.net.URLClassLoader$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Unknown Source)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:819)
at org.apache.hadoop.hive.ql.io.HiveInputFormat$HiveInputSplit.readFields(HiveInputFormat.java:143)
... 10 more


My hive-env.sh

export 
HIVE_AUX_JARS_PATH=/data/install/hive-0.9.0/lib/hive-hbase-handler-0.9.0.jar,/data/install/hive-0.9.0/lib/hbase-0.92.0.jar,/data/install/hive-0.9.0/lib/zookeeper-3.4.2.jar


My hive-site.xml

<property>
  <name>hive.zookeeper.quorum</name>
  <value>hadoop01,hadoop02,hadoop03</value>
  <description>The list of zookeeper servers to talk to. This is
  only needed for read/write locks.</description>
</property>


And I start the Thrift service as below:

hive --service hiveserver -p 1 


The server side error log says that HBaseSplit is not found. But why?
How can I fix this?

-- 

Regards,
Cheng Su


Re: Alter table is giving error

2012-11-05 Thread Edward Capriolo
Recover partitions should work the same way for different file systems.

Edward






Hive compression with external table

2012-11-05 Thread Krishna Rao
Hi all,

I'm looking into finding a suitable format to store data in HDFS, so that
it's available for processing by Hive. Ideally I would like to satisfy the
following:

1. store the data in a format that is readable by multiple Hadoop projects
(eg. Pig, Mahout, etc.), not just Hive
2. work with a Hive external table
3. store data in a compressed format that is splittable

(1) is a requirement because Hive isn't appropriate for all the problems
that we want to throw at Hadoop.

(2) is really more of a consequence of (1). Ideally we want the data stored
in some open format that is compressed in HDFS.
This way we can just point Hive, Pig, Mahout, etc at it depending on the
problem.

(3) is obviously so it plays well with Hadoop.

Gzip is no good because it is not splittable. Snappy looked promising, but
it is splittable only if used with a non-external Hive table.
LZO also looked promising, but I wonder whether it is future-proof
given the licensing issues surrounding it.

So far, the only solution I could find that satisfies all the above seems
to be bzip2 compression, but concerns about its performance make me wary
about choosing it.

Is bzip2 the only option I have? Or have I missed some other compression
option?

Cheers,

Krishna


Re: Hive compression with external table

2012-11-05 Thread Edward Capriolo
Compression is a confusing issue. Sequence files in block format are always
splittable, regardless of which compression codec is chosen for the blocks.
The Programming Hive book has an entire section dedicated to the
permutations of compression options.

Edward
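
For reference, a minimal sketch of that combination (the table names logs_seq
and logs_text are illustrative, and codec availability depends on the cluster):

-- Write query output as block-compressed SequenceFiles; these stay
-- splittable and remain readable by Pig, Mahout, etc.
SET hive.exec.compress.output=true;
SET mapred.output.compression.type=BLOCK;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;

CREATE EXTERNAL TABLE logs_seq (line STRING)
STORED AS SEQUENCEFILE
LOCATION '/data/logs_seq';

INSERT OVERWRITE TABLE logs_seq SELECT line FROM logs_text;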


Re: Alter table is giving error

2012-11-05 Thread Mark Grover
Chunky,
I have used the RECOVER PARTITIONS command on EMR, and that worked fine.

However, take a look at https://issues.apache.org/jira/browse/HIVE-874. It
seems the MSCK command in Apache Hive does the same thing. Try it out and
let us know how it goes.

Mark
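
A sketch of the MSCK usage being suggested, against the table from earlier in
the thread (whether the REPAIR form is available depends on the Hive version):

-- Report partitions that exist on the filesystem but not in the metastore:
MSCK TABLE XXX;
-- Add the missing partitions (the rough equivalent of EMR's RECOVER PARTITIONS):
MSCK REPAIR TABLE XXX;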



Hive 0.7.1 with MySQL 5.5 as metastore

2012-11-05 Thread Venkatesh Kavuluri
I am working on copying existing Hive metadata (Hive 0.7.1 with MySQL 5.1) to a
new cluster environment (Hive 0.7.1 with MySQL 5.5). I copied over the
metastore tables and modified the data under the SDS (storage descriptors)
table to reflect the new data path. However, I am getting a MySQL integrity
constraint violation against the SDS.SD_ID column while trying to create new
Hive tables. Is this a problem with the MySQL version I am using? Does Hive
0.7.1 support MySQL 5.5 as the metastore?

Thanks,
Venkatesh

Re: Hive 0.7.1 with MySQL 5.5 as metastore

2012-11-05 Thread Edward Capriolo
Moving underlying data files around is not the correct way to perform
an upgrade.

https://dev.mysql.com/doc/refman/5.5/en/upgrading-from-previous-series.html

I would do a mysqldump and then re-insert the data for maximum compatibility.



RE: Hive 0.7.1 with MySQL 5.5 as metastore

2012-11-05 Thread Venkatesh Kavuluri
Sorry for the confusion; the problem is not with the MySQL version upgrade. I
have indeed performed the upgrade by doing a mysqldump and restoring the data.
The problem is with how Hive 0.7.1 is interacting with the same metastore data
on a different version of the MySQL server.


Re: ClassNotFoundException when use hive java client of hive + hbase integration

2012-11-05 Thread Mark Grover
Cheng,
You will have to add the appropriate HBase-related jars to your classpath.

You can do that by running ADD JAR command(s) or by putting them in aux_lib.
See this thread for reference:
http://mail-archives.apache.org/mod_mbox/hive-user/201103.mbox/%3caanlktingqlgknqmizgoi+szfnexgcat8caqtovf8j...@mail.gmail.com%3E

Mark
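
A minimal sketch, reusing the jar paths from the hive-env.sh quoted earlier in
the thread (run these in the same session that issues the query, e.g. over the
JDBC connection, since the Thrift server may not ship HIVE_AUX_JARS_PATH jars
to the MapReduce tasks the way the CLI does):

ADD JAR /data/install/hive-0.9.0/lib/hive-hbase-handler-0.9.0.jar;
ADD JAR /data/install/hive-0.9.0/lib/hbase-0.92.0.jar;
ADD JAR /data/install/hive-0.9.0/lib/zookeeper-3.4.2.jar;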





Re: Hive 0.7.1 with MySQL 5.5 as metastore

2012-11-05 Thread Mark Grover
Venkatesh,
What's the exact integrity constraint error you are seeing?

I'd be curious to see whether you still get the error if you restore the data
from the mysqldump onto a separate schema/db on a MySQL 5.1 server.

Mark




RE: Hive 0.7.1 with MySQL 5.5 as metastore

2012-11-05 Thread Venkatesh Kavuluri
Hi Mark,
I just started restoring the data to a separate MySQL 5.1 schema; I will try
to create a table and post back here.
I copied the error stack trace below.
Nov  5 22:24:02 127.0.0.1/127.0.0.1 local3:[ETLManager] ERROR [pool-2-thread-1]
exec.MoveTask - Failed with exception Insert of object
org.apache.hadoop.hive.metastore.model.MStorageDescriptor@1db0454f using
statement INSERT INTO `SDS`
(`SD_ID`,`LOCATION`,`OUTPUT_FORMAT`,`IS_COMPRESSED`,`NUM_BUCKETS`,`INPUT_FORMAT`,`SERDE_ID`)
VALUES (?,?,?,?,?,?,?) failed : Duplicate entry '5152711' for key 'PRIMARY'
javax.jdo.JDODataStoreException: Insert of object
org.apache.hadoop.hive.metastore.model.MStorageDescriptor@1db0454f using
statement INSERT INTO `SDS`
(`SD_ID`,`LOCATION`,`OUTPUT_FORMAT`,`IS_COMPRESSED`,`NUM_BUCKETS`,`INPUT_FORMAT`,`SERDE_ID`)
VALUES (?,?,?,?,?,?,?) failed : Duplicate entry '5152711' for key 'PRIMARY'
at org.datanucleus.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:313)
at org.datanucleus.jdo.JDOTransaction.commit(JDOTransaction.java:132)
at org.apache.hadoop.hive.metastore.ObjectStore.commitTransaction(ObjectStore.java:315)
at org.apache.hadoop.hive.metastore.HiveAlterHandler.alterTable(HiveAlterHandler.java:172)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler$29.run(HiveMetaStore.java:1687)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler$29.run(HiveMetaStore.java:1684)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.executeWithRetry(HiveMetaStore.java:307)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table(HiveMetaStore.java:1684)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table(HiveMetaStoreClient.java:166)
at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:354)
at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1194)
at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:197)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:131)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1063)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:900)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:748)
Thanks,
Venkatesh
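
One common cause of this symptom after copying metastore rows between
databases (a hedged aside, not a confirmed diagnosis): DataNucleus hands out
new SD_ID values from its SEQUENCE_TABLE, and if that table was not carried
over, or lags behind the copied rows, new inserts collide with existing
primary keys. A sketch of the check, assuming a stock DataNucleus-backed
metastore schema:

-- Run against the metastore database:
SELECT MAX(SD_ID) FROM SDS;
SELECT NEXT_VAL FROM SEQUENCE_TABLE
WHERE SEQUENCE_NAME = 'org.apache.hadoop.hive.metastore.model.MStorageDescriptor';

-- If NEXT_VAL is at or below MAX(SD_ID), move it past the copied rows
-- (restart the metastore afterwards, since sequence values may be cached):
UPDATE SEQUENCE_TABLE
SET NEXT_VAL = (SELECT MAX(SD_ID) + 1 FROM SDS)
WHERE SEQUENCE_NAME = 'org.apache.hadoop.hive.metastore.model.MStorageDescriptor';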

hive integrate with hbase, map to existed hbase table report column family not exist

2012-11-05 Thread Chris Gong
Hi all,
I'm mapping to an existing HBase table, and I got the following error:

FAILED: Error in metadata: MetaException(message:Column Family
data is not defined in hbase table df_money_files)
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.DDLTask
 
My Hive QL is:

create external table hbase_money_files (rowkey string, 
user_no string,
mon int,
mon_sn int,
group_no int,
sn int,
write_sect_no string,
write_sn int,
business_place_code string,
power_no int,
trans_group  int,
price_code string,
ts_flag string,
elec_type_code string,
trade_type_code string,
ms_mode string,
user_ms_type string,
write_power double,
chg_power double,
add_power double,
kb_power double,
share_power  double,
total_power  double,
total_money  double,
num_money double,
add_money1   double,
add_money2   double,
add_money3   double,
add_money4   double,
add_money5   double,
add_money6   double,
add_money7   double,
add_money8   double,
add_money9   double,
add_money10  double,
rp_power double,
rp_money double,
should_money double,
create_date  string,
creator  string,
warrant_no   int,
line_codestring,
trans_no string,
add_taxflag  string,
write_date   string,
compute_date string,
calculator_id string,
status string,
user_type1 string,
rela_user_no  string,
part_sn   int,
have_ext string,
id_fragment  string,
check_date   string,
check_manstring,
start_date  string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'  
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,data:user_no,
data:mon,
data:mon_sn,
data:group_no,
data:sn,
data:write_sect_no,
data:write_sn,
data:business_place_code,
data:power_no,
data:trans_group,
data:price_code,
data:ts_flag,
data:elec_type_code,
data:trade_type_code,
data:ms_mode,
data:user_ms_type,
data:write_power,
data:chg_power,
data:add_power,
data:kb_power,
data:share_power,
data:total_power,
data:total_money,
data:num_money,
data:add_money1,
data:add_money2,
data:add_money3,
data:add_money4,
data:add_money5,
data:add_money6,
data:add_money7,
data:add_money8,
data:add_money9,
data:add_money10,
data:rp_power,
data:rp_money,
data:should_money,
data:create_date,
data:creator,
data:warrant_no,
data:line_code,
data:trans_no,
data:add_taxflag,
data:write_date,
data:compute_date,
data:calculator_id,
data:status,
data:user_type1,
data:rela_user_no,
data:part_sn,
data:have_ext,
data:id_fragment,
data:check_date,
data:check_man,
data:start_date")
TBLPROPERTIES ("hbase.table.name" = "df_money_files");

However, the data column family does exist! When I describe the table in the
HBase shell, it reports:

hbase(main):001:0> describe 'df_money_files'
DESCRIPTION                                          ENABLED
 {NAME => 'df_money_files', FAMILIES => [{NAME =>    true
 'data', BLOOMFILTER => 'NONE', REPLICATION_SCOPE =>
 '0', VERSIONS => '3', COMPRESSION => 'NONE',
 MIN_VERSIONS => '0', TTL => '2147483647', BLOCKSIZE
 => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}
1 row(s) in 0.8470 seconds

I am confused now. Can anyone give me some information?



Chris Gong
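
One hedged observation for readers, not a confirmed diagnosis: some Hive
versions do not trim whitespace inside hbase.columns.mapping, so the newlines
and indentation in the mapping string above can make Hive look up a family
name other than data. A minimal sketch with the mapping collapsed into one
unbroken string (only the first two columns shown; the remaining entries
follow the same pattern):

CREATE EXTERNAL TABLE hbase_money_files (rowkey string, user_no string, mon int)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,data:user_no,data:mon")
TBLPROPERTIES ("hbase.table.name" = "df_money_files");
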

Re: ClassNotFoundException when use hive java client of hive + hbase integration

2012-11-05 Thread Cheng Su
Mark, thank you so much for your suggestion.

I had already added the necessary jars to my Hive aux path, which is why I
can execute my SQL in Hive CLI mode without getting any error.
But when I use a Java client to access the tables through the Thrift
service, I need to add these jars manually.
I executed the ADD JAR .jar statements and the problem is solved!

Thank you again!






-- 

Regards,
Cheng Su


Which postgres version is compatible with hive-trunk?

2012-11-05 Thread rohithsharma
Hi Guys,

I am planning to use Postgres as the metastore with Hive. Can anyone tell me
which Postgres version is compatible with Hive?

 

Regards

Rohith Sharma K S



Re: Which postgres version is compatible with hive-trunk?

2012-11-05 Thread Alexander Lorenz
Hey Rohith,

The last time I used Postgres was with postgresql-8.4-701.jdbc4.jar, and it
was working great. I guess all 8.x versions should work; Postgres 9.x I
personally wouldn't choose.

best,
 Alex

 

--
Alexander Alten-Lorenz
http://mapredit.blogspot.com
German Hadoop LinkedIn Group: http://goo.gl/N8pCF