how to add permission controller in hive

2010-08-23 Thread shangan
Could anyone give me some suggestions on how to control user privileges in Hive? As you know, in a production cluster we normally put all data into the cluster, but we don't want every user to be able to use all the data. From Hive's perspective, we want to limit the users operating only o

java.sql.SQLException: org.apache.thrift.transport.TTransportException: Cannot read. Remote side has closed. Tried to read 1 bytes, but only got 0 bytes.

2010-08-23 Thread lei liu
Hello everyone, I use JDBC to connect to the hive server, and sometimes I receive the exception below: java.sql.SQLException: org.apache.thrift.transport.TTransportException: Cannot read. Remote side has closed. Tried to read 1 bytes, but only got 0 bytes. Please tell me the reason. Thanks LiuLei

Re: java.sql.SQLException: org.apache.thrift.transport.TTransportException: Cannot read. Remote side has closed. Tried to read 1 bytes, but only got 0 bytes.

2010-08-23 Thread Adarsh Sharma
To run Hive in server mode, you first have to start the hiveserver service: $ bin/hive --service hiveserver and then run the code. lei liu wrote: Hello everyone, I use JDBC to connection the hive server, sometime I receive below exception: java.sql.SQLException: org.apache.thrift.t

Re: java.sql.SQLException: org.apache.thrift.transport.TTransportException: Cannot read. Remote side has closed. Tried to read 1 bytes, but only got 0 bytes.

2010-08-23 Thread lei liu
Yes, you are right. I do that, but after the hive server has been running for several days, the client receives the exception when it connects to the hive server. 2010/8/23 Adarsh Sharma > For Running Hive in Server Mode .. > First U have to start service of hiveserver :: > > > *$bin/hive --service hiveserver

RE: java.sql.SQLException: org.apache.thrift.transport.TTransportException: Cannot read. Remote side has closed. Tried to read 1 bytes, but only got 0 bytes.

2010-08-23 Thread Bennie Schut
The real reason is probably in the log of the hiveserver. It should be in the console you started the hiveserver in. I sometimes start it like this: nohup ./hive --service hiveserver >hive.log & Which would show you some errors in the hive.log file. We used to have some problems with PermGen issue

Re: java.sql.SQLException: org.apache.thrift.transport.TTransportException: Cannot read. Remote side has closed. Tried to read 1 bytes, but only got 0 bytes.

2010-08-23 Thread lei liu
Hi Bennie, Thank you for your reply. I see the exception below in hive.log: 10/08/23 14:06:23 INFO Datastore.Retrieve: Object with id "3211[OID]org.apache.hadoop.hive.metastore.model.MStorageDescriptor" not found ! 10/08/23 14:06:23 ERROR server.TThreadPoolServer: Error occurred during proces
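Following the suggestion in this thread to look in hive.log for the real cause, here is a minimal sketch for pulling out only the error-level lines. It assumes the log lines have the shape shown in the excerpt above (date, time, level, logger name, then a colon and the message). The function name error_lines and the choice of levels are hypothetical, not part of Hive.

```python
import re

# Assumed line shape, taken from the excerpt quoted in the thread:
#   10/08/23 14:06:23 ERROR server.TThreadPoolServer: message text
LOG_LINE = re.compile(
    r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (\w+) ([\w.]+): (.*)$"
)


def error_lines(log_text, levels=("ERROR", "FATAL")):
    """Return (timestamp, logger, message) for each line at the given levels.

    Lines that do not match the assumed shape (stack traces, blank
    lines) are skipped rather than treated as errors.
    """
    found = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match and match.group(2) in levels:
            found.append((match.group(1), match.group(3), match.group(4)))
    return found
```

To use it, read the file the server output was redirected to (hive.log in the command quoted earlier in the thread) and pass its contents to error_lines. Lines logged shortly before the first ERROR entry are often worth reading as well, since this sketch drops them.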

Applying Patch to Hive

2010-08-23 Thread Adarsh Sharma
Hello everyone, I see many patches for Hadoop Hive. Can anybody please tell me how to apply patches to Hive? Thanks in advance.

RE: java.sql.SQLException: org.apache.thrift.transport.TTransportException: Cannot read. Remote side has closed. Tried to read 1 bytes, but only got 0 bytes.

2010-08-23 Thread Bennie Schut
Others probably have more insight on this, but at least you narrowed it down to the metastore. I've had some concurrency issues with the metastore myself (HIVE-1539), so perhaps you are running into something similar? Perhaps there are some different errors before you see this error/warning like

Re: Applying Patch to Hive

2010-08-23 Thread Edward Capriolo
On Mon, Aug 23, 2010 at 4:53 AM, Adarsh Sharma wrote: > Hello everyone , > I see many patches of Hadoop-Hive > Can anybody Please tell me how to apply patches to Hive. > > Thanks in Advance. > http://wiki.apache.org/hadoop/Hive/HowToContribute#Applying_a_patch Cheers, Edward

Re: how to add permission controller in hive

2010-08-23 Thread Edward Capriolo
On Mon, Aug 23, 2010 at 3:10 AM, shangan wrote: > >    Could anyone give me some suggestion on how to control the privilege of > users of hive ? As you know, in production cluster, normally we put all data > into the cluster,but actually we don't want every user can use all the data, > from hive p

Re: I modify the HiveInputFormat.java class, but the content modified don't take effect.

2010-08-23 Thread Thiruvel Thirumoolan
Did you change hive-log4j.properties? By default the logging threshold is WARN. You have to change it to INFO. On Aug 22, 2010, at 8:03 AM, lei liu wrote: > I add one line code in HiveInputFormat.java class, example: > LOG.info("1"), then I package the codes into hive_exec.jar and put

some wiki docs for UDAF development

2010-08-23 Thread John Sichi
Mayank Lahiri, a summer intern here at Facebook, wrote up some usage docs on a few of the new functions he added. I'm linking them here in case others find them useful. http://wiki.apache.org/hadoop/Hive/StatisticsAndDataMining He also wrote up some notes on the overall process of writing your

(Self) Joins on NULLable columns takes forever

2010-08-23 Thread Rajappa Iyer
Consider the following table (I've omitted things like additional columns and the serde specification since I think they are mostly irrelevant): CREATE TABLE event_log (visit_time bigint, visitor_id string, user_id string ...) PARTITIONED BY (dt string) ROW FORMAT ...; Where visitor_id is assigne

Re: (Self) Joins on NULLable columns takes forever

2010-08-23 Thread Ted Yu
Was there a typo below (v1 -> e1) ? event_log v1 JOIN event_log e2 ON On Mon, Aug 23, 2010 at 1:36 PM, Rajappa Iyer wrote: > Consider the following table (I've omitted things like additional columns > and the serde specification since I think they are mostly irrelevant): > > CREATE TABLE event_

Re: (Self) Joins on NULLable columns takes forever

2010-08-23 Thread Rajappa Iyer
Yep, that was a typo... sorry. It should read "event_log e1" Thanks, Raj On Mon, Aug 23, 2010 at 3:42 PM, Ted Yu wrote: > Was there a typo below (v1 -> e1) ? > > > event_log v1 JOIN event_log e2 ON > > On Mon, Aug 23, 2010 at 1:36 PM, Rajappa Iyer wrote: > >> Consider the following table (I'v

Re: (Self) Joins on NULLable columns takes forever

2010-08-23 Thread Rajappa Iyer
Sure. I did three runs each. The times for the visitor_id query were 124.515, 118.673 and 115.33 seconds; for the user_id query with "e1.user_id is not null" they were 241.201, 252.091 and 238.144 seconds. The slowness seems to be mainly due to the Stage-1 and Stage-2 reduce. FYI, the row counts are as follows:
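For the self-join thread above, one possible contributor to slow reduce stages is key skew: if many rows carry a NULL join key and all of them are sent to the same reducer, that reducer does most of the work. The snippets here do not confirm that this is the cause, and the timings quoted were taken with an "is not null" condition already in the query, so treat the following only as an illustration of the effect. The functions reducer_load and drop_null_keys are hypothetical, None stands in for SQL NULL, and crc32 is used only as a stable stand-in for whatever hash the real partitioner uses.

```python
import zlib
from collections import Counter


def reducer_load(join_keys, num_reducers):
    """Count rows per reducer when rows are partitioned by join key."""
    load = Counter()
    for key in join_keys:
        if key is None:
            # Assumption for this sketch: every NULL key is routed
            # to the same reducer (reducer 0 here).
            partition = 0
        else:
            # crc32 is an illustrative, deterministic hash only.
            partition = zlib.crc32(key.encode("utf-8")) % num_reducers
        load[partition] += 1
    return load


def drop_null_keys(join_keys):
    """Filter out NULL keys before the join, since they cannot match."""
    return [key for key in join_keys if key is not None]
```

Comparing reducer_load(keys, n) with reducer_load(drop_null_keys(keys), n) on a key list that is mostly None shows how much of the load sits on a single reducer before filtering. Whether a filter in the query is applied before the shuffle or only after the join is the detail worth checking in the query plan.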

Job submission delays

2010-08-23 Thread Shrijeet Paliwal
Hello, If anyone has any insight on reducing the hadoop job submission delays as seen by hive, or on possible causes of the delay being high, I would love to hear it. Here is what is happening: I am submitting a very simple hive query - the lag between the time I submit it from command line and the time it a

alter table foo set location fails

2010-08-23 Thread Leo Alekseyev
Hi all, I'm looking at http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL, where there's a code example of the following: ALTER TABLE table_name [partitionSpec] SET LOCATION "new location" However, when I try to run it, I get alter table raw_log_proc_test1_renamed2 set location '/tmp/leo/hivetm

Adding partition to an existing table

2010-08-23 Thread Leo Alekseyev
Hi all, Is it possible to add partitions to an existing table? Basically, I have a table partitioned by date, but it should have another partition field -- that is, the hdfs paths should look something like /hdfs_location/my_table/ds=2010-08-23/chunk=0. I've had no luck with alter table add par