SocketTimeoutException when inserting into HBase via Hive

2012-07-23 Thread Cdy Chen
Hi all, When I use 447 files of 64 MB each as input to an insert into HBase, it throws a SocketTimeoutException. But if I use smaller input, it works well. I guess it is related to the Hadoop configuration, but how should I configure it? Thank you! Best Regards, Chen

Possibility of defining the Output directory programmatically

2012-07-23 Thread Manisha Gayathri
Hi, Is there any way to define the output directory of a Hive query using a Hive UDF? In my UDF, I pass 2 parameters (as follows), and this generates a file-system URL: *getFilePath( 0,testServer );* Can I use the above getFilePath( 0,testServer ) value as the Local Directory

Re: Possibility of defining the Output directory programmatically

2012-07-23 Thread Vinod Singh
The output path in this query is already parameterized: *INSERT OVERWRITE LOCAL DIRECTORY 'file:///${hiveconf:file_name}'* A UDF is not going to be invoked there, though. Thanks, Vinod
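A minimal sketch of the pattern Vinod points at: compute the path outside Hive, then pass it in as a `-hiveconf` substitution variable. The path and the table name `logs` are made up for illustration; the block only builds and prints the invocation rather than running Hive.

```shell
# Build the output path in the shell (where any logic can run),
# then hand it to Hive as a substitution variable.
# OUT_DIR and the table "logs" are hypothetical.
OUT_DIR="/home/user/Desktop/logDir/logs/log_0_testServer_2012_07_22"
QUERY="INSERT OVERWRITE LOCAL DIRECTORY 'file:///\${hiveconf:file_name}' SELECT * FROM logs;"
# The actual invocation would look like this:
CMD="hive -hiveconf file_name=${OUT_DIR} -e \"${QUERY}\""
echo "$CMD"
```

Hive substitutes `${hiveconf:file_name}` at query time, so no UDF call is needed to parameterize the directory.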

Re: Possibility of defining the Output directory programmatically

2012-07-23 Thread Manisha Gayathri
Hi Vinod, Thanks for the prompt reply. Understood your point, and sorry for not providing the complete code segment earlier. I have the getFilePath function, which should return a URL like this: home/user/Desktop/logDir/logs/log_0_testServer_2012_07_22 The defined function works perfectly if I

Re: Possibility of defining the Output directory programmatically

2012-07-23 Thread Vinod Singh
SET commands are handled differently, and UDFs can't be invoked there. IMO you need to pass the directory location value in from outside of Hive. That is how we do it. Thanks, Vinod

Re: Possibility of defining the Output directory programmatically

2012-07-23 Thread Vinod Singh
We generate variables dynamically and then create a final script file by concatenating the variables (SET commands) and the Hive queries. The final script is then executed. You could probably adopt a similar approach. Thanks, Vinod
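A rough sketch of the workflow Vinod describes: emit SET lines computed outside Hive, concatenate them with the query into one script file, and run that file with `hive -f`. All paths and the table name are hypothetical; the `hive -f` step is left commented so the sketch stands alone.

```shell
# Assemble a final script: dynamically computed SET commands first,
# then the query that consumes them via ${hiveconf:...}.
SCRIPT=$(mktemp)
FILE_NAME="/home/user/Desktop/logDir/logs/log_0_testServer_2012_07_22"  # computed dynamically
{
  echo "SET file_name=${FILE_NAME};"
  echo "INSERT OVERWRITE LOCAL DIRECTORY 'file:///\${hiveconf:file_name}' SELECT * FROM logs;"
} > "$SCRIPT"
cat "$SCRIPT"
# Then execute the assembled script:
# hive -f "$SCRIPT"
```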

Re: Possibility of defining the Output directory programmatically

2012-07-23 Thread Manisha Gayathri
Thanks Vinod. I tried concatenating variables, but as far as I can see that is not possible either. With set pqr = concat(foo,bar); set file_name= home/user/Desktop the file_name I get is *NOT* home/user/Desktop/foo_bar; what I get is /home/user/Desktop/concat(foo,bar)
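This behavior follows from SET being a plain key/value assignment: Hive never evaluates functions such as concat() on the right-hand side, so the text is stored verbatim. A sketch of doing the concatenation in the shell instead (variable names are hypothetical):

```shell
# Evaluate string operations in the shell, where they actually run,
# then hand Hive only the finished value.
foo="foo"
bar="bar"
file_name="/home/user/Desktop/${foo}_${bar}"
echo "SET file_name=${file_name};"
```

The emitted SET line can then be prepended to the query script, as in Vinod's script-assembly approach.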

Report Generating tools for Hive

2012-07-23 Thread yogesh.kumar13
Hi All, I am looking for a report-generating tool for Apache Hadoop/Hive. Please suggest some such tools that are easily compatible and work well. I am using Hadoop-0.20.2 and Hive-0.8.1. OS: Mac OS X 10.6.8 Thanks Regards Yogesh Kumar Dhari

Re: Report Generating tools for Hive

2012-07-23 Thread Bejoy KS
Hi Yogesh, I know MicroStrategy and Tableau are used for reporting on top of Hive, but I'm not sure about Mac support for those. Regards Bejoy KS Sent from handheld, please excuse typos.

Re: Report Generating tools for Hive

2012-07-23 Thread Nitin Pawar
MicroStrategy comes with a Linux server, but the BI tool itself is limited to Windows; the same goes for Tableau (though I am not sure whether they have added Mac support). I have used Pentaho and it worked well across Linux, Mac, and Windows. It also has an open-source edition (with fewer features), but that

Re: [ANNOUNCE] New PMC member - Ashutosh Chauhan

2012-07-23 Thread Aniket Mokashi
Congrats Ashutosh! ~Aniket

Structs in Hive

2012-07-23 Thread kulkarni.swar...@gmail.com
Hello, I have a pretty basic question here. I am trying to get structs stored in HBase to be read by Hive. In what format should these structs be written so that they can be read? For instance, if my query has the following struct: s STRUCT&lt;a: STRING, b: STRING&gt; How should I be writing

Re: Structs in Hive

2012-07-23 Thread Edward Capriolo
In your case HBase has a custom SerDe; the Deserializer interface is what turns the value from the input format into something that Hive can understand. The HBase support uses the user-specified table property hbase.columns.mapping as the information for what it should parse out of the HBase Result.
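A hedged sketch of what such a table definition might look like against the Hive 0.8-era HBase handler. The table name, column family `cf`, and qualifier are all made up, and whether the SerDe of that era can deserialize a struct from a single HBase cell depends on its delimiter support, so treat this only as the shape of the DDL. The DDL is written to a file here just to keep the example self-contained.

```shell
# Hypothetical DDL: HBase cell cf:s surfaced in Hive as a struct column s.
# hbase.columns.mapping is the property Edward refers to.
cat > create_hbase_table.hql <<'HQL'
CREATE EXTERNAL TABLE hbase_structs (key STRING, s STRUCT<a:STRING, b:STRING>)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:s")
TBLPROPERTIES ("hbase.table.name" = "my_hbase_table");
HQL
cat create_hbase_table.hql
```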

RE: Report Generating tools for Hive

2012-07-23 Thread yogesh.kumar13
Hello Nitin, Would you please share how to install Pentaho on Ubuntu, and how to use it with Hive? Thanks Regards Yogesh Kumar

Re: Report Generating tools for Hive

2012-07-23 Thread Nitin Pawar
Just download it from the Pentaho site and follow the instructions in the README file; it is straightforward.

Re: Structs in Hive

2012-07-23 Thread kulkarni . swarnim
Cool. Thanks :) Also, I was just curious: what do people generally use to write struct data in Hive tables? I see that there is a STRUCT function that takes parameters and creates structs from them. Can we use a custom class as well? Thanks again. Sent from my iPhone
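A small sketch of the built-in struct constructors; the table and column names (`logs`, `host`, `status`, `msg`) are hypothetical, and the query is written to a file only so the example is self-contained. `struct()` names the fields col1, col2, ... automatically, while `named_struct()` lets you pick the field names.

```shell
cat > struct_example.hql <<'HQL'
-- struct() assigns generic field names (col1, col2, ...);
-- named_struct() takes alternating name/value arguments.
SELECT struct(host, status)              AS s1,
       named_struct('a', host, 'b', msg) AS s2
FROM logs;
HQL
cat struct_example.hql
```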

RE: Report Generating tools for Hive

2012-07-23 Thread yogesh.kumar13
Hey Nitin, I am not finding a way to connect Pentaho to hive --service hiveserver. Please help and suggest. Regards Yogesh Kumar
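A hedged sketch of the usual connection path for that era (HiveServer1): start the Thrift service on the Hive box, then point the BI tool's JDBC connection at it. The host name here is an assumption, and Pentaho's own driver setup should be checked in its documentation; the start command is left commented so the sketch stands alone.

```shell
# Start the Hive Thrift server (HiveServer1) on its default port 10000.
# (Run this on the machine where Hive is installed.)
# hive --service hiveserver -p 10000 &

# The JDBC URL a client such as Pentaho would use against HiveServer1,
# with the pre-HiveServer2 driver org.apache.hadoop.hive.jdbc.HiveDriver:
HIVE_HOST="hive-box.example.com"   # hypothetical host
JDBC_URL="jdbc:hive://${HIVE_HOST}:10000/default"
echo "$JDBC_URL"
```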

Re: Report Generating tools for Hive

2012-07-23 Thread shashwat shriparv
Check out these:
- Pentaho – http://www.pentaho.com/hadoop/ ("Business Intelligence Player Pentaho Embraces Hadoop": http://ostatic.com/blog/business-intelligence-player-pentaho-embraces-hadoop)
- Intellicus – Intellicus to support Hadoop framework for Large

Performance Issues in Hive with S3 and Partitions

2012-07-23 Thread richin.jain
Hi, Sorry, this is an AWS/Hive-specific question. I have two external Hive tables for my custom logs: 1. a flat directory structure on AWS S3, no partitions, files in bz2-compressed format (a few big files); 2. three levels of partitions on AWS S3 (lots of small uncompressed files). I noticed

Re: Performance Issues in Hive with S3 and Partitions

2012-07-23 Thread Igor Tatarinov
Are you using EMR? Have you tried setting hive.optimize.s3.query=true, as mentioned in http://docs.amazonwebservices.com/ElasticMapReduce/latest/DeveloperGuide/emr-hive-version-details.html ? I haven't tried that option myself, so I am curious whether it helps in your scenario. The above page also
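A small sketch of turning that flag on for a single run. The flag name comes from the AWS page linked above and is EMR-specific; the script name is hypothetical, and the block only builds and prints the invocation rather than executing Hive.

```shell
# Pass the EMR-specific S3 optimization as a hiveconf override
# for one invocation (it could also go in hive-site.xml on EMR).
CMD="hive -hiveconf hive.optimize.s3.query=true -f my_query.hql"
echo "$CMD"
```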

Re: Hive query optimization

2012-07-23 Thread Igor Tatarinov
Here are my 2 cents. The parameters you are looking at are quite specific; unless you know what you are doing, it can be hard to set them exactly right, and they shouldn't make that much of a difference anyway. What worked for me is using a single wave of reducers.
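The "single wave" idea is to size mapred.reduce.tasks to roughly the cluster's total reduce slots, so every reducer runs in one pass instead of queuing. A back-of-envelope sketch; the node and per-node slot counts are made up, and the per-node figure would come from mapred.tasktracker.reduce.tasks.maximum on your cluster.

```shell
NODES=10                 # hypothetical worker-node count
REDUCE_SLOTS_PER_NODE=4  # hypothetical reduce slots per tasktracker
REDUCERS=$((NODES * REDUCE_SLOTS_PER_NODE))
# Emit the SET line to prepend to the Hive script:
echo "SET mapred.reduce.tasks=${REDUCERS};"
```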