Re: if query in hive

2011-02-03 Thread Viral Bajaria
http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF Check the conditional functions at the link above; the page has the IF and CASE statement definitions. I am guessing some of them might not work with older versions of Hive, but I'm not too sure. On Thu,

if query in hive

2011-02-03 Thread Amlan Mandal
Actually I need to port some SQL queries to hive QL. Let's say I have a hive table t which has columns mobile_no, cookie, ip, access_id. Let's say I want to count unique users. My definition of unique user = all unique mobile numbers + all unique cookies (if no mobile number is present for them) + al
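
The counting rule described above can be sketched with a CASE expression inside COUNT(DISTINCT ...). This is only a rough illustration, not a tested Hive query: it runs the SQL through Python's built-in sqlite3 module as a stand-in for Hive, the sample rows are made up, and it covers only the two terms that are visible in the message (mobile numbers, then cookies where no mobile number is present), since the rest of the definition is cut off. It also assumes the rule is applied per row: a row is identified by its mobile number if it has one, otherwise by its cookie. In Hive the `||` concatenation would presumably be written with `concat()` instead.

```python
import sqlite3

# Hypothetical sample rows for table t (mobile_no, cookie, ip, access_id).
ROWS = [
    ("555-0001", "c1", "10.0.0.1", 1),
    ("555-0001", "c2", "10.0.0.2", 2),  # same mobile number, counted once
    (None, "c3", "10.0.0.3", 3),        # no mobile number, so the cookie identifies the user
    (None, "c3", "10.0.0.4", 4),        # same cookie, counted once
    (None, "c4", "10.0.0.5", 5),
]

# Each row is mapped to one identity key; the prefix keeps a mobile number
# and a cookie with the same text from collapsing into one user.
# Rows with neither value yield NULL, which COUNT(DISTINCT ...) ignores.
QUERY = """
SELECT COUNT(DISTINCT
    CASE
        WHEN mobile_no IS NOT NULL THEN 'm:' || mobile_no
        WHEN cookie IS NOT NULL THEN 'c:' || cookie
    END)
FROM t
"""


def count_unique_users(rows):
    """Load rows into an in-memory table t and return the unique-user count."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE t (mobile_no TEXT, cookie TEXT, ip TEXT, access_id INTEGER)"
        )
        conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchone()[0]
    finally:
        conn.close()


if __name__ == "__main__":
    print(count_unique_users(ROWS))
```

Further fallbacks in the definition would be additional WHEN branches in the same CASE, each with its own prefix.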

Re: [VOTE] Sponsoring Howl as an Apache Incubator project

2011-02-03 Thread John Sichi
Got it, thanks for the correction. JVS On Feb 3, 2011, at 4:56 PM, Alex Boisvert wrote: > Hi John, > > Just to clarify where I was going with my line of questioning. There's no > Apache policy that prevents dependencies on incubator project, whether it's > releases, snapshots or even home-m

Re: [VOTE] Sponsoring Howl as an Apache Incubator project

2011-02-03 Thread John Sichi
On Feb 3, 2011, at 5:09 PM, Alan Gates wrote: > Are you referring to the serde jar or any particular serde's we are making > use of? Both (see below). JVS [jsichi@dev1066 ~/open/howl/howl/howl/src/java/org/apache/hadoop/hive/howl] ls cli/ common/ data/ mapreduce/ pig/ rcfile/ [jsic

Re: [VOTE] Sponsoring Howl as an Apache Incubator project

2011-02-03 Thread Alan Gates
Are you referring to the serde jar or any particular serde's we are making use of? Alan. On Feb 3, 2011, at 4:30 PM, John Sichi wrote: I forgot about the serde dependencies...can you add those to the Initial Source note in [[HowlProposal]] just for completeness? JVS On Feb 3, 2011, at 3:

Re: [VOTE] Sponsoring Howl as an Apache Incubator project

2011-02-03 Thread Alex Boisvert
Hi John, Just to clarify where I was going with my line of questioning. There's no Apache policy that prevents dependencies on an incubator project, whether it's releases, snapshots or even home-made hacked-together packaging of an incubator project. It's been done before and as long as the incu

Re: [VOTE] Sponsoring Howl as an Apache Incubator project

2011-02-03 Thread John Sichi
I forgot about the serde dependencies...can you add those to the Initial Source note in [[HowlProposal]] just for completeness? JVS On Feb 3, 2011, at 3:11 PM, Alan Gates wrote: > Yes, it adds Input and Output formats for MapReduce and load and store > functions for Pig. In the future it we e

Re: [VOTE] Sponsoring Howl as an Apache Incubator project

2011-02-03 Thread John Sichi
I was going off of what I read in HADOOP-3676 (which lacks a reference as well). But I guess if a release can be made from the incubator, then it's not a blocker. JVS On Feb 3, 2011, at 3:29 PM, Alex Boisvert wrote: > On Thu, Feb 3, 2011 at 11:38 AM, John Sichi wrote: > Besides the fact that

Re: Hive queries consuming 100% cpu

2011-02-03 Thread Vijay
Sorry, I should've given more details. The query was limited by a partition range; I just omitted the WHERE clause in the mail. The table is not that big. For each day, there is one gzipped file. The largest file is about 250MB (close to 2GB uncompressed). I did intend to count and that was just to

Re: [VOTE] Sponsoring Howl as an Apache Incubator project

2011-02-03 Thread Alex Boisvert
On Thu, Feb 3, 2011 at 11:38 AM, John Sichi wrote: > Besides the fact that the refactoring required is significant, I don't > think this is possible to do quickly since: > > 1) Hive (unlike Pig) requires a metastore > > 2) Hive releases can't depend on an incubator project > I'm not sure what yo

Re: [VOTE] Sponsoring Howl as an Apache Incubator project

2011-02-03 Thread Alan Gates
Yes, it adds Input and Output formats for MapReduce and load and store functions for Pig. In the future we expect it will continue to add more layers. Alan. On Feb 3, 2011, at 2:49 PM, John Sichi wrote: But Howl does layer on some additional code, right? https://github.com/

Re: [VOTE] Sponsoring Howl as an Apache Incubator project

2011-02-03 Thread Ashutosh Chauhan
What I am referring to is the metastore/ dir of hive, the part of the hive code which howl cares about most. Other howl code is for additional functionality that Howl provides (none of which lives in the metastore/ dir); it is in the howl/ dir. There are a few build file changes, but they are trivial. Ashutosh On T

Re: [VOTE] Sponsoring Howl as an Apache Incubator project

2011-02-03 Thread John Sichi
But Howl does layer on some additional code, right? https://github.com/yahoo/howl/tree/howl/howl JVS On Feb 3, 2011, at 1:49 PM, Ashutosh Chauhan wrote: > There are none as of today. In the past, whenever we had to have > changes, we do it in a separate branch in Howl and once those get > commi

Re: Hive queries consuming 100% cpu

2011-02-03 Thread Viral Bajaria
Hey Vijay, You can go to the mapred ui, normally it runs on port 50030 of the namenode and see how many map jobs got created for your submitted query. You said that the events table has daily partitions but the example query that you have does not prune the partitions by specifying a WHERE clause

Re: [VOTE] Sponsoring Howl as an Apache Incubator project

2011-02-03 Thread Ashutosh Chauhan
There are none as of today. In the past, whenever we have had to make changes, we do them in a separate branch in Howl, and once those get committed to the hive repo, we pull them over into our trunk and drop the branch. Ashutosh On Thu, Feb 3, 2011 at 13:41, yongqiang he wrote: > I am interested in some numbers

Re: [VOTE] Sponsoring Howl as an Apache Incubator project

2011-02-03 Thread yongqiang he
I am interested in some numbers on the lines of code (or number of files) changed in Howl but not in Hive. Can anyone give some information here? Thanks Yongqiang On Thu, Feb 3, 2011 at 1:15 PM, Jeff Hammerbacher wrote: > Hey, > >> >> If we do go ahead with pulling the metastore ou

Re: [VOTE] Sponsoring Howl as an Apache Incubator project

2011-02-03 Thread Jeff Hammerbacher
Hey, > If we do go ahead with pulling the metastore out of Hive, it might make > most sense for Howl to become its own TLP rather than a subproject. > Yes, I did not read the proposal closely enough. I think an end state as a TLP makes more sense for Howl than as a Pig subproject. I'd really lov

Hive queries consuming 100% cpu

2011-02-03 Thread Vijay
Hi, The simplest of hive queries seem to be consuming 100% cpu. This is with a small 4-node cluster. The machines are pretty beefy (16 cores per machine, tons of RAM, 16 M+R maximum tasks configured, 1GB RAM for mapred.child.java.opts, etc). A simple query like "select count(1) from events" where

Re: [VOTE] Sponsoring Howl as an Apache Incubator project

2011-02-03 Thread John Sichi
Besides the fact that the refactoring required is significant, I don't think this is possible to do quickly since: 1) Hive (unlike Pig) requires a metastore 2) Hive releases can't depend on an incubator project It's worth pointing out that Howl is already using Hive's CLI+DDL (not just metasto

RE: Can't drop table

2011-02-03 Thread Tali K
We have a similar problem with not being able to drop tables, using Hive 0.6 and Hadoop 20.0 along with Postgres. Can you share with us the schema you created? We are wondering if the schema that was automatically created in our case is somehow incomplete. Thanks, Tali Date: Mon, 17 Jan

RE: Please read if you plan to use Hive 0.7.0 on Hadoop 0.20.0

2011-02-03 Thread Severance, Steve
We are not using 0.20 at eBay so we are fine with this. Steve From: Ajo Fod [mailto:ajo@gmail.com] Sent: Monday, January 31, 2011 9:49 PM To: user@hive.apache.org Subject: Re: Please read if you plan to use Hive 0.7.0 on Hadoop 0.20.0 I am new to hive and hadoop and I got the packaged versio

Re: [VOTE] Sponsoring Howl as an Apache Incubator project

2011-02-03 Thread Jay Booth
Food for thought, what if the metastore were moved to Howl more aggressively? It seems like the end state everyone's aiming for is that Hive and Pig share Howl as a metastore layer, which makes all kinds of sense.. would it increase the chances of long term success if you guys just went for it an

Re: [VOTE] Sponsoring Howl as an Apache Incubator project

2011-02-03 Thread Ashutosh Chauhan
+1 On Wed, Feb 2, 2011 at 13:18, Alan Gates wrote: > Howl is a table management system built to provide metadata and storage > management across data processing tools in Hadoop (Pig, Hive, MapReduce, > ...).  You can learn more details at http://wiki.apache.org/pig/Howl.  For > the last six month

Re: [VOTE] Sponsoring Howl as an Apache Incubator project

2011-02-03 Thread Alan Gates
Alan, I see your points. I agree with you and I am +1. (incubator/subproject is not important to me) You mentioned that hive is cautious about checking changes into the meta-store. I would not say we (hive) are cautious. Hive is getting pulled by many people in many directions (this is a goo

Re: [VOTE] Sponsoring Howl as an Apache Incubator project

2011-02-03 Thread Edward Capriolo
On Thu, Feb 3, 2011 at 12:16 AM, Alan Gates wrote: > Edward, > > I understand your concern with having a copy of the metastore code in Howl. >  However, let's separate code from governance.  The reason Howl has a copy > of Hive's metastore is not because we're proposing it for the Incubator, it >

Re: Making Thrift work with Hive in client-server mode

2011-02-03 Thread Jay Ramadorai
Sorry. I had an error in my message below. I start up Derby on the same port that is specified in hive-site. So my derby start looks like: > > nohup $DERBY_HOME/bin/startNetworkServer -h 0.0.0.0 -p & (not ) BTW, all the ports shown here are examples only. On Feb 3, 2011, at 9:22 AM, J

Making Thrift work with Hive in client-server mode

2011-02-03 Thread Jay Ramadorai
Can someone explain how the Thriftserver finds the Hive metastore? I am running with all non-default values and need to know how to connect to Thrift so it finds Hive with the right metastore. I am running Derby in server mode on a non-default port. And my metastore name is non-default. And I w