custom file format

2010-08-09 Thread shimi
Hello, is there a way to load files that are not in a standard key-value format? I have files whose rows are in a more column-oriented format: key name1:value1 name2:value:2 ... and I want to use Hive to run map-reduce jobs on them. Shimi
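One possible way to approach the question above is to flatten each row into plain (key, name, value) lines before handing the data to Hive. The sketch below is only an illustration, not a Hive feature: the function names `explode_row` and `to_tsv` are hypothetical, and it assumes that fields are whitespace-separated and that each pair is split on its first colon.

```python
from typing import Iterable, Iterator, Tuple


def explode_row(line: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (key, name, value) for every name:value pair in one row.

    Assumption: the row looks like "key name1:value1 name2:value2 ...",
    with whitespace between fields.
    """
    fields = line.split()
    if not fields:
        return  # blank line, nothing to emit
    key, pairs = fields[0], fields[1:]
    for pair in pairs:
        # Split on the first colon only, so a value may itself contain colons.
        name, sep, value = pair.partition(":")
        if not sep:
            continue  # skip a malformed token that has no colon
        yield key, name, value


def to_tsv(lines: Iterable[str]) -> Iterator[str]:
    """Turn column-oriented rows into tab-separated key/name/value lines."""
    for line in lines:
        for key, name, value in explode_row(line):
            yield "\t".join((key, name, value))


if __name__ == "__main__":
    sample = ["row1 name1:value1 name2:value2"]
    for out in to_tsv(sample):
        print(out)
```

The flattened output is an ordinary delimited text file with three columns, which is the kind of layout a plain text table can read directly.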

how to call the UDF/UDAF in hive

2010-08-09 Thread lei liu
Hello everyone, could anybody tell me how to call a UDF/UDAF in Hive?

RE: How to merge small files

2010-08-09 Thread Namit Jain
That's right From: lei liu [liulei...@gmail.com] Sent: Sunday, August 08, 2010 7:18 PM To: hive-user@hadoop.apache.org Subject: Re: How to merge small files Thank you for your reply. You mean I should execute the statement below: statement.execute(set

Re: How to merge small files

2010-08-09 Thread lei liu
Could you tell me whether the query is slower if both parameters are set to true? 2010/8/9 Namit Jain nj...@facebook.com That's right From: lei liu [liulei...@gmail.com] Sent: Sunday, August 08, 2010 7:18 PM To: hive-user@hadoop.apache.org Subject:

How to use JDBC client embedded mode

2010-08-09 Thread lei liu
I see the content below on the http://wiki.apache.org/hadoop/Hive/HiveClient page: For embedded mode, uri is just jdbc:hive://. How can I use the JDBC client in embedded mode? Could anybody give me an example?

RE: How to merge small files

2010-08-09 Thread Namit Jain
Yes, it will try to run another map-reduce job to merge the files From: lei liu [liulei...@gmail.com] Sent: Monday, August 09, 2010 8:57 AM To: hive-user@hadoop.apache.org Subject: Re: How to merge small files Could you tell me whether the query is slower

How are nulls represented in data?

2010-08-09 Thread Pradeep Kamath
Hi, What value does hive expect in the data for a column to be treated as null? I tried some permutations on a text data based table but couldn't figure out what the correct representation was. I tried empty string, the string NULL and the string null for a string column and in all three

Re: NullPointerException in GenericUDTFExplode.process()

2010-08-09 Thread Marc Limotte
Hi Paul, No nulls. I ensure that every row has at least one entry (a hyphen) before I split to create the list. Marc On Sun, Aug 8, 2010 at 8:14 PM, Paul Yang py...@facebook.com wrote: Seem like an issue that was patched already – can you check to see if the column that you are calling

Re: How are nulls represented in data?

2010-08-09 Thread Ning Zhang
How it is serialized/deserialized is determined by the specific SerDe. NULL is serialized as \N by LazySimpleSerDe (default SerDe for text). RCFile (ColumnarSerDe) uses the same default parameters as LazySimpleSerDe. Unless I missed something, NULL serialization/deserialization is type independent

Re: Wondering about add jar

2010-08-09 Thread John Sichi
I don't think you can add an unarchived file to the classpath like that (Java wants either directories or jars as classpath entries). Probably you can just put that conf file in its own little jar and add that instead. JVS On Aug 8, 2010, at 12:01 PM, Edward Capriolo wrote: What do you guys
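The suggestion above (put the conf file in its own little jar) can be sketched without the JDK's jar tool, since a jar is a zip archive. This is a minimal illustration, not a tested Hive recipe: the helper name `make_conf_jar` and the file names are hypothetical, and placing the file at the jar root is an assumption about how the file is looked up on the classpath.

```python
import tempfile
import zipfile
from pathlib import Path


def make_conf_jar(conf_path, jar_path):
    """Wrap a single configuration file in a jar (a zip archive).

    Assumption: the file should sit at the root of the jar so that it is
    visible as a top-level classpath resource.
    """
    conf = Path(conf_path)
    with zipfile.ZipFile(jar_path, "w", zipfile.ZIP_DEFLATED) as jar:
        jar.write(conf, arcname=conf.name)
    return Path(jar_path)


if __name__ == "__main__":
    # Demo with throwaway files; real paths would replace these.
    with tempfile.TemporaryDirectory() as tmp:
        conf = Path(tmp) / "my-site.xml"  # hypothetical conf file
        conf.write_text("<configuration/>\n")
        jar = make_conf_jar(conf, Path(tmp) / "my-conf.jar")
        print(zipfile.ZipFile(jar).namelist())
```

The resulting jar could then be passed to Hive's `add jar` command in place of the unarchived file.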

Re: How are nulls represented in data?

2010-08-09 Thread yongqiang he
Yes. In LazySimpleSerDe/SequenceFile/TextFile, \N is used as NULL. (It is a table property: serialization.null.format) In ColumnarSerDe/RCFile, no NULL is stored (zero bytes; the column byte length is zero). But RCFile/ColumnarSerDe also uses this property when serializing to determine if a
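To make the text representation described above concrete, here is a small sketch of writing and reading one delimited row in which NULL is the two-character marker \N. It only mimics the on-disk text and is not Hive's SerDe code: the function names are hypothetical, and using Ctrl-A (\x01) as the field delimiter is an assumption based on Hive's usual default for text tables.

```python
NULL_FORMAT = r"\N"   # backslash followed by N, the marker named in the thread
FIELD_DELIM = "\x01"  # assumption: Ctrl-A as the field delimiter


def encode_row(values, delim=FIELD_DELIM, null_format=NULL_FORMAT):
    """Join one row into a text line, writing None as the NULL marker."""
    return delim.join(null_format if v is None else str(v) for v in values)


def decode_row(line, delim=FIELD_DELIM, null_format=NULL_FORMAT):
    """Split a text line into fields, mapping the NULL marker back to None."""
    return [None if field == null_format else field for field in line.split(delim)]


if __name__ == "__main__":
    line = encode_row(["a", None, ""])
    print(repr(line))
    print(decode_row(line))
```

In this sketch an empty string and the literal strings NULL or null stay ordinary strings; only the exact marker is read back as a missing value.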

Re: Wondering about add jar

2010-08-09 Thread Edward Capriolo
On Mon, Aug 9, 2010 at 2:53 PM, John Sichi jsi...@facebook.com wrote: I don't think you can add an unarchived file to the classpath like that (Java wants either directories or jars as classpath entries). Probably you can just put that conf file in its own little jar and add that instead.

Simulating an auto-incrementing column

2010-08-09 Thread Lars Francke
Hi, I have a problem and I hope someone has an idea on how to solve it. My dataset consists of just very simple key-value pairs of strings coming from PostgreSQL using Sqoop. 1) I need to count how often a key occurs - Easy 2) I need to count how often a key-value pair occurs - Easy I need to

Re: NullPointerException in GenericUDTFExplode.process()

2010-08-09 Thread Marc Limotte
Also wanted to mention that I'm using the Cloudera distribution of Hive (0.5.0+20-2) on CentOS. Marc On Sun, Aug 8, 2010 at 7:33 PM, Marc Limotte mslimo...@gmail.com wrote: Hi, I think I may have run into a Hive bug. And I'm not sure what's causing it or how to work around it. The reduce

RE: How to merge small files

2010-08-09 Thread Bakshi, Ankita
Hi, Sorry to hijack this thread. But I am curious whether there is any other built-in option to merge files in a directory before loading data into the table. I have a directory in the local file system which contains many small files. I want to load it into a single Hive table. I am wondering what

Re: How to merge small files

2010-08-09 Thread Edward Capriolo
Lei, are you still using Hive 0.4.1, or have you upgraded? The merge options mentioned above were probably not present until 0.5.0. Edward On Mon, Aug 9, 2010 at 9:59 PM, Todd Lee ronnietodd...@gmail.com wrote: as long as the files are inside the same directory, hive will treat them as a table.

Errors while fetching data through Hive.

2010-08-09 Thread Adarsh Sharma
Hi all, I am working with Hadoop-0.20.1+HadoopDb+hive. I have two external tables (website_master and master_seed) in Hive whose data is in Postgres, and I am trying to fetch data through Java code in Eclipse. I am able to fetch data of website_master but when I tried to fetch data of master_seed