Re: need help with an error - script used to work and now it does not :-(

2013-05-16 Thread Sanjay Subramanian
:-( Still facing problems with large datasets.
Were you able to solve this, Edward?
Thanks
sanjay

From: Sanjay Subramanian <sanjay.subraman...@wizecommerce.com>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>
Date: Thursday, May 16, 2013 8:25 PM
To: "user@hive.apache.org" <user@hive.apache.org>
Subject: Re: need help with an error - script used to work and now it does not :-(

Thanks Edward. I just checked all instances of the guava jars. Except for the 
last two in the list (marked in red in the original message), all seem to be 
the same version.

/usr/lib/hadoop/client/guava-11.0.2.jar
/usr/lib/hadoop/client-0.20/guava-11.0.2.jar
/usr/lib/hadoop/lib/guava-11.0.2.jar
/usr/lib/hadoop-httpfs/webapps/webhdfs/WEB-INF/lib/guava-11.0.2.jar
/usr/lib/hadoop-hdfs/lib/guava-11.0.2.jar
/usr/lib/oozie/libtools/guava-11.0.2.jar
/usr/lib/hive/lib/guava-11.0.2.jar
/usr/lib/hadoop-0.20-mapreduce/lib/guava-11.0.2.jar
/usr/lib/hbase/lib/guava-11.0.2.jar
/usr/lib/flume-ng/lib/guava-11.0.2.jar
/usr/share/cmf/lib/cdh3/guava-r09-jarjar.jar
/usr/share/cmf/lib/guava-12.0.1.jar
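
A quick way to spot mismatches like the ones in the list above is to group the
jar paths by the version embedded in the file name. The following is a minimal
Python sketch, not something from the thread: the function name
group_by_version and the regular expression are illustrative assumptions, and
the sample paths are a subset of those listed above.

```python
import re
from collections import defaultdict

# Assumed naming scheme: guava-<version>.jar, optionally with a suffix such as
# "-jarjar" (e.g. guava-11.0.2.jar, guava-r09-jarjar.jar).
JAR_PATTERN = re.compile(r"guava-(r?\d+(?:\.\d+)*)(?:-[A-Za-z]+)?\.jar$")


def group_by_version(paths):
    """Map each Guava version string to the jar paths that carry it."""
    versions = defaultdict(list)
    for path in paths:
        match = JAR_PATTERN.search(path)
        if match:
            versions[match.group(1)].append(path)
    return dict(versions)


paths = [
    "/usr/lib/hadoop/lib/guava-11.0.2.jar",
    "/usr/lib/hive/lib/guava-11.0.2.jar",
    "/usr/share/cmf/lib/cdh3/guava-r09-jarjar.jar",
    "/usr/share/cmf/lib/guava-12.0.1.jar",
]

# More than one key in the result means more than one version is installed.
for version, jars in sorted(group_by_version(paths).items()):
    print(version, len(jars))
```

Finding several versions on disk does not by itself show which one a given job
loads; that depends on the classpath order of the process in question.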

But I made a small change to my query (I just removed the text marked in blue) 
that seemed to solve it, at least for the test data set that I had. Now I need 
to run it in production on a day's worth of data.

Will keep you guys posted.


SELECT
h.header_date_donotquery as date_,
h.header_id as impression_id,
h.header_searchsessionid as search_session_id,
h.cached_visitid as visit_id ,
split(h.server_name_donotquery,'[\.]')[0] as server,
h.cached_ip ip,
h.header_adnodeid ad_nodes,
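
For readers unfamiliar with the split(...)[0] expression in the query above:
Hive's split takes a regular expression, and '[\.]' is a character class that
matches a literal dot, so index 0 is the part of the server name before the
first dot. The same logic as a small Python sketch (the function name
server_label and the sample host names are made up for illustration):

```python
import re


def server_label(server_name: str) -> str:
    """Return the part of a host name before the first dot."""
    # r"[\.]" matches a literal dot, mirroring the pattern used in the query.
    return re.split(r"[\.]", server_name)[0]


print(server_label("web01.example.com"))
print(server_label("localhost"))
```

A name without any dot comes back unchanged, since the split then yields a
single element.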


Thanks

sanjay


From: Edward Capriolo <edlinuxg...@gmail.com>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>
Date: Thursday, May 16, 2013 7:51 PM
To: "user@hive.apache.org" <user@hive.apache.org>
Subject: Re: need help with an error - script used to work and now it does not :-(

Ironically, I just got a misleading error like this today. What happened was 
that I upgraded to Hive 0.10. One of my programs was linked to Guava 15, but 
Hive provides Guava 09 on the classpath, which confused things. I also had a 
similar issue with mismatched slf4j and commons-logging.


On Thu, May 16, 2013 at 10:34 PM, Sanjay Subramanian 
<sanjay.subraman...@wizecommerce.com> wrote:

2013-05-16 18:57:21,094 FATAL [IPC Server handler 19 on 40222] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 
attempt_1368666339740_0135_m_000104_1 - exited : java.lang.RuntimeException: 
Error in configuring object
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
at 
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:72)
at 
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:130)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:395)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:152)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:147)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:103)
... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
at 
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:72)
at 
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:130)
at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
... 14 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:103)
... 17 more
Caused by: java.lang.RuntimeException: Ma

Re: need help with an error - script used to work and now it does not :-(

2013-05-16 Thread Sanjay Subramanian
Thanks Edward. I just checked all instances of the guava jars. Except for the 
last two in the list (marked in red in the original message), all seem to be 
the same version.

/usr/lib/hadoop/client/guava-11.0.2.jar
/usr/lib/hadoop/client-0.20/guava-11.0.2.jar
/usr/lib/hadoop/lib/guava-11.0.2.jar
/usr/lib/hadoop-httpfs/webapps/webhdfs/WEB-INF/lib/guava-11.0.2.jar
/usr/lib/hadoop-hdfs/lib/guava-11.0.2.jar
/usr/lib/oozie/libtools/guava-11.0.2.jar
/usr/lib/hive/lib/guava-11.0.2.jar
/usr/lib/hadoop-0.20-mapreduce/lib/guava-11.0.2.jar
/usr/lib/hbase/lib/guava-11.0.2.jar
/usr/lib/flume-ng/lib/guava-11.0.2.jar
/usr/share/cmf/lib/cdh3/guava-r09-jarjar.jar
/usr/share/cmf/lib/guava-12.0.1.jar

But I made a small change to my query (I just removed the text marked in blue) 
that seemed to solve it, at least for the test data set that I had. Now I need 
to run it in production on a day's worth of data.

Will keep you guys posted.


SELECT
h.header_date_donotquery as date_,
h.header_id as impression_id,
h.header_searchsessionid as search_session_id,
h.cached_visitid as visit_id ,
split(h.server_name_donotquery,'[\.]')[0] as server,
h.cached_ip ip,
h.header_adnodeid ad_nodes,


Thanks

sanjay


From: Edward Capriolo <edlinuxg...@gmail.com>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>
Date: Thursday, May 16, 2013 7:51 PM
To: "user@hive.apache.org" <user@hive.apache.org>
Subject: Re: need help with an error - script used to work and now it does not :-(

Ironically, I just got a misleading error like this today. What happened was 
that I upgraded to Hive 0.10. One of my programs was linked to Guava 15, but 
Hive provides Guava 09 on the classpath, which confused things. I also had a 
similar issue with mismatched slf4j and commons-logging.


On Thu, May 16, 2013 at 10:34 PM, Sanjay Subramanian 
<sanjay.subraman...@wizecommerce.com> wrote:

2013-05-16 18:57:21,094 FATAL [IPC Server handler 19 on 40222] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 
attempt_1368666339740_0135_m_000104_1 - exited : java.lang.RuntimeException: 
Error in configuring object
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
at 
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:72)
at 
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:130)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:395)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:152)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:147)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:103)
... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
at 
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:72)
at 
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:130)
at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
... 14 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:103)
... 17 more
Caused by: java.lang.RuntimeException: Map operator initialization failed
at 
org.apache.hadoop.hive.ql.exec.ExecMapper.configure(ExecMapper.java:121)
... 22 more
Caused by: java.lang.RuntimeException: cannot find field header_date from 
[org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@2add5681,
 
org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@295a4523,
 
org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInsp

Re: need help with an error - script used to work and now it does not :-(

2013-05-16 Thread Edward Capriolo
Ironically, I just got a misleading error like this today. What happened was
that I upgraded to Hive 0.10. One of my programs was linked to Guava 15, but
Hive provides Guava 09 on the classpath, which confused things. I also had a
similar issue with mismatched slf4j and commons-logging.


On Thu, May 16, 2013 at 10:34 PM, Sanjay Subramanian <
sanjay.subraman...@wizecommerce.com> wrote:

>   2013-05-16 18:57:21,094 FATAL [IPC Server handler 19 on 40222] 
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 
> attempt_1368666339740_0135_m_000104_1 - exited : java.lang.RuntimeException: 
> Error in configuring object
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:72)
>   at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:130)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:395)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:152)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:147)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:103)
>   ... 9 more
> Caused by: java.lang.RuntimeException: Error in configuring object
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:72)
>   at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:130)
>   at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
>   ... 14 more
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:103)
>   ... 17 more
> Caused by: java.lang.RuntimeException: Map operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.ExecMapper.configure(ExecMapper.java:121)
>   ... 22 more
> Caused by: java.lang.RuntimeException: cannot find field header_date from 
> [org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@2add5681,
>  
> org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@295a4523,
>  
> org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@6571120a,
>  
> org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@6257828d,
>  
> org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@5f3c296b,
>  
> org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@66c360a5,
>  
> org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@24fe2558,
>  
> org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@2945c761,
>  
> org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@2424c672]
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:345)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldRef(UnionStructObjectInspector.java:100)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:57)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initEvaluators(Operator.java:896)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initEvaluatorsAndReturnStruct(Operator.java:922)
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:60)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:433)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:389)
>   at 
> org.apache.hadoop.hive.ql.exec.FilterOperator.initializeOp(FilterOperator.java:78)
>   at org.apache.hadoop.hi

need help with an error - script used to work and now it does not :-(

2013-05-16 Thread Sanjay Subramanian
2013-05-16 18:57:21,094 FATAL [IPC Server handler 19 on 40222] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 
attempt_1368666339740_0135_m_000104_1 - exited : java.lang.RuntimeException: 
Error in configuring object
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
at 
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:72)
at 
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:130)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:395)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:152)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:147)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:103)
... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
at 
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:72)
at 
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:130)
at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
... 14 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:103)
... 17 more
Caused by: java.lang.RuntimeException: Map operator initialization failed
at 
org.apache.hadoop.hive.ql.exec.ExecMapper.configure(ExecMapper.java:121)
... 22 more
Caused by: java.lang.RuntimeException: cannot find field header_date from 
[org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@2add5681,
 
org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@295a4523,
 
org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@6571120a,
 
org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@6257828d,
 
org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@5f3c296b,
 
org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@66c360a5,
 
org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@24fe2558,
 
org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@2945c761,
 
org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@2424c672]
at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:345)
at 
org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldRef(UnionStructObjectInspector.java:100)
at 
org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:57)
at 
org.apache.hadoop.hive.ql.exec.Operator.initEvaluators(Operator.java:896)
at 
org.apache.hadoop.hive.ql.exec.Operator.initEvaluatorsAndReturnStruct(Operator.java:922)
at 
org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:60)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:433)
at 
org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:389)
at 
org.apache.hadoop.hive.ql.exec.FilterOperator.initializeOp(FilterOperator.java:78)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:433)
at 
org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:389)
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:166)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357)
at 
org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(Map
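
The trace above wraps the informative message ("cannot find field header_date")
inside several generic "Error in configuring object" layers. When reading long
task logs, it can help to pull out the innermost "Caused by:" line first. The
following Python sketch is only an illustration: root_cause is a made-up helper,
the sample trace is heavily abbreviated (the "[...]" stands for the field list),
and it assumes each "Caused by:" message sits on a single, unwrapped line.

```python
def root_cause(trace: str) -> str:
    """Return the innermost 'Caused by:' message of a Java stack trace.

    Falls back to the first non-empty line when the trace has no cause chain.
    """
    lines = [line.strip() for line in trace.splitlines() if line.strip()]
    causes = [line for line in lines if line.startswith("Caused by:")]
    if causes:
        # The last "Caused by:" entry is the deepest exception in the chain.
        return causes[-1][len("Caused by:"):].strip()
    return lines[0] if lines else ""


trace = """java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
Caused by: java.lang.reflect.InvocationTargetException
    ... 9 more
Caused by: java.lang.RuntimeException: Map operator initialization failed
    ... 22 more
Caused by: java.lang.RuntimeException: cannot find field header_date from [...]
"""

print(root_cause(trace))
```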

Hive metastore on Oracle

2013-05-16 Thread Raj Hadoop
Hi,

I am looking for suggestions from the hadoop and hive user community on the 
following -

1) How good a choice is Oracle for the Hive Metastore?

In my organization we use only Oracle databases, so we wanted to know whether 
there are any known issues with running the Hive Metastore on Oracle. Please 
advise.


Thanks,
Raj


Hive views

2013-05-16 Thread abhishek
Hi all, 

We created views in Hive, but when we restart the cluster, the views are 
deleted automatically.
I know Hive views don't persist after a cluster restart, but I want to figure 
out a way to persist them. Can someone help me with how to do that?

Regards
Abhishek 

Re: Hive Authorization and Views

2013-05-16 Thread John Omernik
Edward - I agree that Hive and an RDBMS are different animals, so in looking
at the current work around Hive authorization, I get that the user would
still have access to the underlying file system. We have to assume that
permissions are only enforced from a metadata perspective. But given that
authorization is high on the list of questions about Hive in enterprise
adoption of any data warehousing solution, it may provide enough of a control
to pass audit requirements if views could be used as the control. Users can
access the data directly (outside of Hive); in Hive, however, users can't
access the table directly, but can access the view. I need to think it
through some more. Even in an RDBMS, certain users (administrators etc.) are
sometimes able to access the files of the data store, but are still
controlled when accessing the data through the RDBMS. Great discussion, I
love stuff like this. Hive is awesome; it's community discussion that makes
it kick ass (excuse the language) :)



On Thu, May 16, 2013 at 4:19 PM, Sanjay Subramanian <
sanjay.subraman...@wizecommerce.com> wrote:

>  Also we have all external tables to ensure that accidental dropping of
> tables does not delete data. Plus, the good part of the HDFS architecture
> is that data is immutable, which means you cannot update rows. You can move
> partitions or delete/insert data from HDFS, which IMHO is very cool, but may
> not solve all use cases.
> Regards
> sanjay
>
>   From: Edward Capriolo 
> Reply-To: "user@hive.apache.org" 
> Date: Thursday, May 16, 2013 2:05 PM
> To: "user@hive.apache.org" 
> Subject: Re: Hive Authorization and Views
>
>   The largest issue is that the RDBMS security model does not match
> Hive's. Hive/Hadoop has file permissions; RDBMSs have column-level and
> sometimes row-level permissions.
>
>  When you physically have access to the underlying file, row-level
> permissions are not enforceable. The only way to enforce this type of
> security is to force users through a "turnstile" that changes how Hive
> currently works.
>
>
>
>
> On Thu, May 16, 2013 at 4:42 PM, John Omernik  wrote:
>
>> I am curious about the thoughts of the community here; this seems like
>> something many enterprises would drool over with Hive... I am not a coder,
>> so the level of coding involved in something like this is unknown to me.
>>
>>
>> On Sat, May 4, 2013 at 8:31 AM, John Omernik  wrote:
>>
>>> We were doing some tests this past week with Hive authorization. One of
>>> our current use "challenges" is when we have an underlying, well-managed
>>> and partitioned table, and we want to allow access to certain columns in
>>> that table. Our first thoughts went to VIEWs, as that's a common use case
>>> with relational databases (i.e. set up a view with only the columns you
>>> want the user to access) and set the permissions appropriately.
>>>
>>> In testing, and this is not surprising given the "newness" of Hive
>>> Authorization, a VIEW cannot be created so as to allow access to a table
>>> without granting access to the underlying table, defeating the idea of the
>>> view as a tool to manage that access.
>>>
>>> So I wanted to put this to the user group: I've done some JIRA searching
>>> and didn't find anything (I will admit my JIRA search-fu is not stellar),
>>> but is there an option that could be thrown together in Hive that would
>>> allow that use case? Perhaps a configuration setting that would allow
>>> views to execute as a specific user (perhaps a global user, or perhaps a
>>> user specified at view creation). This could allow the "view" to have
>>> access to the underlying table, and since the view, once created, couldn't
>>> be changed by the user, you could grant "read" permissions on the view to
>>> the user or group of users you want to have access.
>>>
>>> I suppose this has challenges, i.e. can a user just create a view to
>>> bypass table-level restrictions? Perhaps if this model were taken, the
>>> privilege for CREATING/MODIFYING views could be created and granted only to
>>> a superuser of some sort. I am really just walking through ideas here, as
>>> this is one of the last stumbling blocks we have with Hive from an
>>> "Enterprise ready" point of view. Heck, if done right, you could almost do
>>> data masking at the view level. You have a column in your source data that
>>> is sensitive, so instead of returning that column you do an MD5 (can we
>>> have a native MD5 function? :) of that column, or you blank that column.
>>> If we put in strong security on the creation and modification of views,
>>> and allow views to execute as a different user that has access to the
>>> source data, you have a powerful way to represent your data to all levels
>>> within your org.
>>>
>>> Also, since I am just brainstorming here, I'd love to hear what others
>>> may be doing around this area. Perhaps the Hive user community can come up
>>> with a strategic plan, while at the same time sharing some shorter-term
>>> workarounds.
>>>
>>>  Thanks!
>>>
>>
>>
>

Re: [ANNOUNCE] Apache Hive 0.11.0 Released

2013-05-16 Thread Dean Wampler
Congratulations!

On Thu, May 16, 2013 at 4:19 PM, Owen O'Malley  wrote:

> The Apache Hive team is proud to announce the release of Apache
> Hive version 0.11.0.
>
> The Apache Hive data warehouse software facilitates querying and
> managing large datasets residing in distributed storage. Built on top
> of Apache Hadoop, it provides:
>
> * Tools to enable easy data extract/transform/load (ETL)
>
> * A mechanism to impose structure on a variety of data formats
>
> * Access to files stored either directly in Apache HDFS or in other
>   data storage systems such as Apache HBase
>
> * Query execution via MapReduce
>
> For Hive release details and downloads, please visit:
> http://hive.apache.org/releases.html
>
> Hive 0.11.0 Release Notes are available here:
>
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12323587&styleName=Html&projectId=12310843
>
> We would like to thank the many contributors who made this release
> possible.
>
> Regards,
>
> The Apache Hive Team
>



-- 
Dean Wampler, Ph.D.
@deanwampler
http://polyglotprogramming.com


Re: Hive Authorization and Views

2013-05-16 Thread Sanjay Subramanian
Also we have all external tables to ensure that accidental dropping of tables 
does not delete data. Plus, the good part of the HDFS architecture is that 
data is immutable, which means you cannot update rows. You can move partitions 
or delete/insert data from HDFS, which IMHO is very cool, but may not solve 
all use cases.
Regards
sanjay

From: Edward Capriolo <edlinuxg...@gmail.com>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>
Date: Thursday, May 16, 2013 2:05 PM
To: "user@hive.apache.org" <user@hive.apache.org>
Subject: Re: Hive Authorization and Views

The largest issue is that the RDBMS security model does not match Hive's. 
Hive/Hadoop has file permissions; RDBMSs have column-level and sometimes 
row-level permissions.

When you physically have access to the underlying file, row-level permissions 
are not enforceable. The only way to enforce this type of security is to force 
users through a "turnstile" that changes how Hive currently works.




On Thu, May 16, 2013 at 4:42 PM, John Omernik <j...@omernik.com> wrote:
I am curious about the thoughts of the community here; this seems like 
something many enterprises would drool over with Hive... I am not a coder, so 
the level of coding involved in something like this is unknown to me.


On Sat, May 4, 2013 at 8:31 AM, John Omernik <j...@omernik.com> wrote:
We were doing some tests this past week with Hive authorization. One of our 
current use "challenges" is when we have an underlying, well-managed and 
partitioned table, and we want to allow access to certain columns in that 
table. Our first thoughts went to VIEWs, as that's a common use case with 
relational databases (i.e. set up a view with only the columns you want the 
user to access) and set the permissions appropriately.

In testing, and this is not surprising given the "newness" of Hive 
Authorization, a VIEW cannot be created so as to allow access to a table 
without granting access to the underlying table, defeating the idea of the view 
as a tool to manage that access.

So I wanted to put this to the user group: I've done some JIRA searching and 
didn't find anything (I will admit my JIRA search-fu is not stellar), but is 
there an option that could be thrown together in Hive that would allow that 
use case? Perhaps a configuration setting that would allow views to execute as 
a specific user (perhaps a global user, or perhaps a user specified at view 
creation). This could allow the "view" to have access to the underlying table, 
and since the view, once created, couldn't be changed by the user, you could 
grant "read" permissions on the view to the user or group of users you want to 
have access.

I suppose this has challenges, i.e. can a user just create a view to bypass 
table-level restrictions? Perhaps if this model were taken, the privilege for 
CREATING/MODIFYING views could be created and granted only to a superuser of 
some sort. I am really just walking through ideas here, as this is one of the 
last stumbling blocks we have with Hive from an "Enterprise ready" point of 
view. Heck, if done right, you could almost do data masking at the view level. 
You have a column in your source data that is sensitive, so instead of 
returning that column you do an MD5 (can we have a native MD5 function? :) of 
that column, or you blank that column. If we put in strong security on the 
creation and modification of views, and allow views to execute as a different 
user that has access to the source data, you have a powerful way to represent 
your data to all levels within your org.
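
The masking idea in the paragraph above (return a hash of a sensitive column,
or blank it, instead of the raw value) can be illustrated outside Hive. The
following is a minimal Python sketch under stated assumptions: mask_row, the
column names, and the sample row are hypothetical, and it only shows the
transformation a masking view would apply, not how Hive would enforce it.

```python
import hashlib

# Hypothetical set of columns that should never be returned in the clear.
SENSITIVE = frozenset({"email"})


def mask_row(row, sensitive=SENSITIVE, blank=False):
    """Return a copy of row with sensitive columns hashed (MD5) or blanked."""
    masked = {}
    for column, value in row.items():
        if column not in sensitive:
            masked[column] = value
        elif blank:
            masked[column] = ""
        else:
            digest = hashlib.md5(str(value).encode("utf-8"))
            masked[column] = digest.hexdigest()
    return masked


row = {"user_id": 42, "email": "a@example.com"}
print(mask_row(row))              # email replaced by a 32-character hex digest
print(mask_row(row, blank=True))  # email replaced by an empty string
```

An unsalted MD5 of a low-entropy value can be reversed by brute force, so a
hash like this hides the value from casual viewing rather than protecting it
strongly.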

Also, since I am just brainstorming here, I'd love to hear what others may be 
doing around this area. Perhaps the Hive user community can come up with a 
strategic plan, while at the same time sharing some shorter-term workarounds.

Thanks!



CONFIDENTIALITY NOTICE
==
This email message and any attachments are for the exclusive use of the 
intended recipient(s) and may contain confidential and privileged information. 
Any unauthorized review, use, disclosure or distribution is prohibited. If you 
are not the intended recipient, please contact the sender by reply email and 
destroy all copies of the original message along with any attachments, from 
your computer system. If you are the intended recipient, please be advised that 
the content of this message is subject to access, review and disclosure by the 
sender's Email System Administrator.


[ANNOUNCE] Apache Hive 0.11.0 Released

2013-05-16 Thread Owen O'Malley
The Apache Hive team is proud to announce the release of Apache
Hive version 0.11.0.

The Apache Hive data warehouse software facilitates querying and
managing large datasets residing in distributed storage. Built on top
of Apache Hadoop, it provides:

* Tools to enable easy data extract/transform/load (ETL)

* A mechanism to impose structure on a variety of data formats

* Access to files stored either directly in Apache HDFS or in other
  data storage systems such as Apache HBase

* Query execution via MapReduce

For Hive release details and downloads, please visit:
http://hive.apache.org/releases.html

Hive 0.11.0 Release Notes are available here:

https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12323587&styleName=Html&projectId=12310843

We would like to thank the many contributors who made this release
possible.

Regards,

The Apache Hive Team


Re: Hive Authorization and Views

2013-05-16 Thread Edward Capriolo
The largest issue is that the RDBMS security model does not match Hive's.
Hive/Hadoop has file permissions; RDBMSs have column-level and sometimes
row-level permissions.

When you physically have access to the underlying file, row-level
permissions are not enforceable. The only way to enforce this type of
security is to force users through a "turnstile" that changes how Hive
currently works.




On Thu, May 16, 2013 at 4:42 PM, John Omernik  wrote:

> I am curious about the thoughts of the community here; this seems like
> something many enterprises would drool over with Hive... I am not a coder,
> so the level of coding involved in something like this is unknown to me.
>
>
> On Sat, May 4, 2013 at 8:31 AM, John Omernik  wrote:
>
>> We were doing some tests this past week with Hive authorization. One of
>> our current use "challenges" is when we have an underlying, well-managed
>> and partitioned table, and we want to allow access to certain columns in
>> that table. Our first thoughts went to VIEWs, as that's a common use case
>> with relational databases (i.e. set up a view with only the columns you
>> want the user to access) and set the permissions appropriately.
>>
>> In testing, and this is not surprising given the "newness" of Hive
>> Authorization, a VIEW cannot be created so as to allow access to a table
>> without granting access to the underlying table, defeating the idea of the
>> view as a tool to manage that access.
>>
>> So I wanted to put this to the user group: I've done some JIRA searching
>> and didn't find anything (I will admit my JIRA search-fu is not stellar),
>> but is there an option that could be thrown together in Hive that would
>> allow that use case? Perhaps a configuration setting that would allow views
>> to execute as a specific user (perhaps a global user, or perhaps a user
>> specified at view creation). This could allow the "view" to have access to
>> the underlying table, and since the view, once created, couldn't be changed
>> by the user, you could grant "read" permissions on the view to the user or
>> group of users you want to have access.
>>
>> I suppose this has challenges, i.e. can a user just create a view to
>> bypass table-level restrictions? Perhaps if this model were taken, the
>> privilege for CREATING/MODIFYING views could be created and granted only to
>> a superuser of some sort. I am really just walking through ideas here, as
>> this is one of the last stumbling blocks we have with Hive from an
>> "Enterprise ready" point of view. Heck, if done right, you could almost do
>> data masking at the view level. You have a column in your source data that
>> is sensitive, so instead of returning that column you do an MD5 (can we
>> have a native MD5 function? :) of that column, or you blank that column.
>> If we put in strong security on the creation and modification of views,
>> and allow views to execute
>> as a different user that has access to source data, you have a powerful way
>> to represent your data to all levels within your org.
>>
>> Also: Since I am just brain storming here, I'd love to hear what others
>> maybe doing around this area. Perhaps the Hive User Community can come up
>> with a strategic plan, while at the same time share some shorter term
>> workarounds.
>>
>> Thanks!
>>
>
>


Re: Hive Authorization and Views

2013-05-16 Thread John Omernik
I am curious about the thoughts of the community here; this seems like
something many enterprises would drool over with Hive... I am not a coder,
so the level of coding involved in something like this is unknown to me.


On Sat, May 4, 2013 at 8:31 AM, John Omernik wrote:

> We were doing some tests this past week with Hive authorization. One of
> our current usage "challenges" is when we have an underlying, well-managed
> and partitioned table, and we want to allow access to only certain columns
> in that table. Our first thoughts went to VIEWs, as that's a common use
> case with relational databases (i.e. set up a view with only the columns
> you want the user to access, and set the permissions appropriately).
>
> In testing, and this is not surprising given the "newness" of Hive
> Authorization, a VIEW cannot be created so as to allow access to a table
> without also granting access to the underlying table, defeating the idea
> of the view as a tool to manage that access.
>
> So I wanted to put it to the user group: I've done some JIRA searching and
> didn't find anything (I will admit my JIRA search-fu is not stellar), but
> is there an option that could be thrown together in Hive that would allow
> that use case? Perhaps a configuration setting that would allow views to
> execute as a specific user (perhaps a global user, or perhaps a user
> specified at view creation). This would give the "view" access to the
> underlying table; since the view is already created and couldn't be
> changed by the user, you could grant "read" permission on the view to the
> user or group of users you want to have access.
>
> I suppose this has challenges, e.g. can a user just create a view to
> bypass table-level restrictions? Perhaps if this model were adopted, a
> privilege for CREATING/MODIFYING views could be created and granted only to
> a superuser of some sort. I am really just walking through ideas here, as
> this is one of the last stumbling blocks we have with Hive from an
> "Enterprise ready" point of view. Heck, if done right, you could almost do
> data masking at the view level. You have a column in your source data that
> is sensitive, so instead of returning that column you return an MD5 of it
> (can we have a native MD5 function? :) or you blank it. If we put strong
> security on the creation and modification of views, and allow views to
> execute as a different user that has access to the source data, you have a
> powerful way to present your data to all levels within your org.
>
> Also, since I am just brainstorming here, I'd love to hear what others
> may be doing in this area. Perhaps the Hive user community can come up
> with a strategic plan while at the same time sharing some shorter-term
> workarounds.
>
> Thanks!
>


Re: Hive Web Interface

2013-05-16 Thread Sanjay Subramanian
Thanks Aniket...

From: Aniket Mokashi <aniket...@gmail.com>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>
Date: Thursday, May 16, 2013 10:18 AM
To: "user@hive.apache.org" <user@hive.apache.org>
Cc: "hive-u...@hadoop.apache.org" <hive-u...@hadoop.apache.org>
Subject: Re: Hive Web Interface

Again, you need to set the value to "lib/hive-hwi-0.9.0.war". Setting value =
'/path/to/lib/hive-hwi-0.9.0.war' will not work.

~Aniket


On Thu, May 16, 2013 at 9:57 AM, Sanjay Subramanian
<sanjay.subraman...@wizecommerce.com> wrote:
1. You will need to set this in hive-site.xml:

<property>
  <name>hive.hwi.war.file</name>
  <value>/path/to/lib/hive-hwi-0.9.0.war</value>
  <description>This sets the path to the HWI war file, relative to ${HIVE_HOME}.</description>
</property>

2. Get this: http://archive.apache.org/dist/hive/hive-0.9.0/hive-0.9.0.tar.gz

3. tar xvf hive-0.9.0.tar.gz

4. The WAR file is at hive-0.9.0/lib/hive-hwi-0.9.0.war

From: Something Something <mailinglist...@gmail.com>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>
Date: Wednesday, May 15, 2013 10:24 PM
To: "hive-u...@hadoop.apache.org" <hive-u...@hadoop.apache.org>
Subject: Hive Web Interface

I have installed Hive locally & I am able to run Hive queries etc.  Now I would 
like to try out Hive Web Interface, but when I try to start the webserver I run 
into this:

./hive --service hwi
13/05/15 22:18:33 INFO hwi.HWIServer: HWI is starting up
13/05/15 22:18:33 WARN conf.HiveConf: hive-site.xml not found on CLASSPATH
13/05/15 22:18:34 FATAL hwi.HWIServer: HWI WAR file not found at 
/lib/hive-hwi-0.9.0.war




--
"...:::Aniket:::... Quetzalco@tl"



Re: Hive Web Interface

2013-05-16 Thread Aniket Mokashi
Again, you need to set the value to "lib/hive-hwi-0.9.0.war". Setting value =
'/path/to/lib/hive-hwi-0.9.0.war' will not work.
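
As a rough way to check a candidate setting before starting HWI, the sketch
below assumes (going by the property's own description) that the value is
looked up relative to ${HIVE_HOME}. It is a standalone Python helper, not part
of Hive, and check_hwi_war is a made-up name.

```python
import os


def check_hwi_war(hive_home, war_value):
    """Return (ok, message) for a candidate hive.hwi.war.file value.

    Assumption: HWI looks for the WAR at <hive_home>/<war_value>, so the
    value must be a relative path such as "lib/hive-hwi-0.9.0.war".
    """
    if os.path.isabs(war_value):
        return False, "value is absolute; use a path relative to HIVE_HOME"
    resolved = os.path.join(hive_home, war_value)
    if not os.path.isfile(resolved):
        return False, "no file at " + resolved
    return True, resolved
```

For example, check_hwi_war(os.environ["HIVE_HOME"], "lib/hive-hwi-0.9.0.war")
should return True together with the resolved path if the WAR is where the
relative setting expects it.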

~Aniket


On Thu, May 16, 2013 at 9:57 AM, Sanjay Subramanian <
sanjay.subraman...@wizecommerce.com> wrote:

> 1. You will need to set this in hive-site.xml:
>
> <property>
>   <name>hive.hwi.war.file</name>
>   <value>/path/to/lib/hive-hwi-0.9.0.war</value>
>   <description>This sets the path to the HWI war file, relative to ${HIVE_HOME}.</description>
> </property>
>
> 2. Get this: http://archive.apache.org/dist/hive/hive-0.9.0/hive-0.9.0.tar.gz
>
> 3. tar xvf hive-0.9.0.tar.gz
>
> 4. The WAR file is at hive-0.9.0/lib/hive-hwi-0.9.0.war
>
>
> From: Something Something
> Reply-To: "user@hive.apache.org"
> Date: Wednesday, May 15, 2013 10:24 PM
> To: "hive-u...@hadoop.apache.org"
> Subject: Hive Web Interface
>
> I have installed Hive locally & I am able to run Hive queries etc.  Now
> I would like to try out Hive Web Interface, but when I try to start the
> webserver I run into this:
>
> ./hive --service hwi
> 13/05/15 22:18:33 INFO hwi.HWIServer: HWI is starting up
> 13/05/15 22:18:33 WARN conf.HiveConf: hive-site.xml not found on CLASSPATH
> 13/05/15 22:18:34 FATAL hwi.HWIServer: HWI WAR file not found at
> /lib/hive-hwi-0.9.0.war
>



-- 
"...:::Aniket:::... Quetzalco@tl"


Re: Hive Web Interface

2013-05-16 Thread Sanjay Subramanian
1. You will need to set this in hive-site.xml:

<property>
  <name>hive.hwi.war.file</name>
  <value>/path/to/lib/hive-hwi-0.9.0.war</value>
  <description>This sets the path to the HWI war file, relative to ${HIVE_HOME}.</description>
</property>

2. Get this: http://archive.apache.org/dist/hive/hive-0.9.0/hive-0.9.0.tar.gz

3. tar xvf hive-0.9.0.tar.gz

4. The WAR file is at hive-0.9.0/lib/hive-hwi-0.9.0.war
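
Steps 2 to 4 can also be scripted. The following is a minimal Python sketch
rather than an official procedure; the function names are made up, and only
the URL and the WAR's path inside the tarball come from the steps above.

```python
import os
import tarfile
import urllib.request

HIVE_TARBALL_URL = "http://archive.apache.org/dist/hive/hive-0.9.0/hive-0.9.0.tar.gz"
WAR_MEMBER = "hive-0.9.0/lib/hive-hwi-0.9.0.war"


def download_tarball(dest_path, url=HIVE_TARBALL_URL):
    """Step 2: fetch the release tarball (needs network access)."""
    urllib.request.urlretrieve(url, dest_path)
    return dest_path


def extract_hwi_war(tarball_path, dest_dir):
    """Steps 3-4: unpack only the HWI WAR and return where it landed."""
    with tarfile.open(tarball_path, "r:gz") as tar:
        member = tar.getmember(WAR_MEMBER)  # raises KeyError if absent
        tar.extract(member, path=dest_dir)
    return os.path.join(dest_dir, WAR_MEMBER)
```

Extracting just the one member avoids unpacking the whole release when only
the WAR file is needed.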

From: Something Something <mailinglist...@gmail.com>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>
Date: Wednesday, May 15, 2013 10:24 PM
To: "hive-u...@hadoop.apache.org" <hive-u...@hadoop.apache.org>
Subject: Hive Web Interface

I have installed Hive locally & I am able to run Hive queries etc.  Now I would 
like to try out Hive Web Interface, but when I try to start the webserver I run 
into this:

./hive --service hwi
13/05/15 22:18:33 INFO hwi.HWIServer: HWI is starting up
13/05/15 22:18:33 WARN conf.HiveConf: hive-site.xml not found on CLASSPATH
13/05/15 22:18:34 FATAL hwi.HWIServer: HWI WAR file not found at 
/lib/hive-hwi-0.9.0.war



Re: Hive Web Interface

2013-05-16 Thread Sanjay Subramanian
+1 agreed…beeswax is way better

From: "kulkarni.swar...@gmail.com" <kulkarni.swar...@gmail.com>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>
Date: Thursday, May 16, 2013 9:21 AM
To: "user@hive.apache.org" <user@hive.apache.org>
Subject: Re: Hive Web Interface

AFAIK Hive HWI has been deprecated and you should be using hue/beeswax for all 
your web interface needs.


On Thu, May 16, 2013 at 11:18 AM, Aniket Mokashi
<aniket...@gmail.com> wrote:
In your hive-site.xml, change the value to "lib/hive-hwi-0.9.0.war" from
"/lib/hive-hwi-0.9.0.war". I guess it's a known issue with HWI.

~Aniket


On Thu, May 16, 2013 at 8:58 AM, Stephen Sprague
<sprag...@gmail.com> wrote:
ok. i'll bite.  you've cut 'n pasted the stderr to us -- but have you any 
further comment on what you did after reading it?  Take that second line for 
instance.  What action would you take after reading that?




On Wed, May 15, 2013 at 10:24 PM, Something Something
<mailinglist...@gmail.com> wrote:
I have installed Hive locally & I am able to run Hive queries etc.  Now I would 
like to try out Hive Web Interface, but when I try to start the webserver I run 
into this:

./hive --service hwi
13/05/15 22:18:33 INFO hwi.HWIServer: HWI is starting up
13/05/15 22:18:33 WARN conf.HiveConf: hive-site.xml not found on CLASSPATH
13/05/15 22:18:34 FATAL hwi.HWIServer: HWI WAR file not found at 
/lib/hive-hwi-0.9.0.war




--
"...:::Aniket:::... Quetzalco@tl"



--
Swarnim



Re: Hive Web Interface

2013-05-16 Thread Aniket Mokashi
In your hive-site.xml, change the value to "lib/hive-hwi-0.9.0.war" from
"/lib/hive-hwi-0.9.0.war". I guess it's a known issue with HWI.

~Aniket


On Thu, May 16, 2013 at 8:58 AM, Stephen Sprague  wrote:

> ok. i'll bite.  you've cut 'n pasted the stderr to us -- but have you any
> further comment on what you did after reading it?  Take that second line
> for instance.  What action would you take after reading that?
>
>
>
>
> On Wed, May 15, 2013 at 10:24 PM, Something Something <
> mailinglist...@gmail.com> wrote:
>
>> I have installed Hive locally & I am able to run Hive queries etc.  Now I
>> would like to try out Hive Web Interface, but when I try to start the
>> webserver I run into this:
>>
>> ./hive --service hwi
>> 13/05/15 22:18:33 INFO hwi.HWIServer: HWI is starting up
>> 13/05/15 22:18:33 WARN conf.HiveConf: hive-site.xml not found on CLASSPATH
>> 13/05/15 22:18:34 FATAL hwi.HWIServer: HWI WAR file not found at
>> /lib/hive-hwi-0.9.0.war
>>
>
>


-- 
"...:::Aniket:::... Quetzalco@tl"


Re: Hive Web Interface

2013-05-16 Thread kulkarni.swar...@gmail.com
AFAIK Hive HWI has been deprecated and you should be using hue/beeswax for
all your web interface needs.


On Thu, May 16, 2013 at 11:18 AM, Aniket Mokashi wrote:

> In your hive-site.xml, change the value to "lib/hive-hwi-0.9.0.war" from
> "/lib/hive-hwi-0.9.0.war". I guess it's a known issue with HWI.
>
> ~Aniket
>
>
> On Thu, May 16, 2013 at 8:58 AM, Stephen Sprague wrote:
>
>> ok. i'll bite.  you've cut 'n pasted the stderr to us -- but have you any
>> further comment on what you did after reading it?  Take that second line
>> for instance.  What action would you take after reading that?
>>
>>
>>
>>
>> On Wed, May 15, 2013 at 10:24 PM, Something Something <
>> mailinglist...@gmail.com> wrote:
>>
>>> I have installed Hive locally & I am able to run Hive queries etc.  Now
>>> I would like to try out Hive Web Interface, but when I try to start the
>>> webserver I run into this:
>>>
>>> ./hive --service hwi
>>> 13/05/15 22:18:33 INFO hwi.HWIServer: HWI is starting up
>>> 13/05/15 22:18:33 WARN conf.HiveConf: hive-site.xml not found on
>>> CLASSPATH
>>> 13/05/15 22:18:34 FATAL hwi.HWIServer: HWI WAR file not found at
>>> /lib/hive-hwi-0.9.0.war
>>>
>>
>>
>
>
> --
> "...:::Aniket:::... Quetzalco@tl"
>



-- 
Swarnim


Re: Hive Web Interface

2013-05-16 Thread Stephen Sprague
ok. i'll bite.  you've cut 'n pasted the stderr to us -- but have you any
further comment on what you did after reading it?  Take that second line
for instance.  What action would you take after reading that?




On Wed, May 15, 2013 at 10:24 PM, Something Something <
mailinglist...@gmail.com> wrote:

> I have installed Hive locally & I am able to run Hive queries etc.  Now I
> would like to try out Hive Web Interface, but when I try to start the
> webserver I run into this:
>
> ./hive --service hwi
> 13/05/15 22:18:33 INFO hwi.HWIServer: HWI is starting up
> 13/05/15 22:18:33 WARN conf.HiveConf: hive-site.xml not found on CLASSPATH
> 13/05/15 22:18:34 FATAL hwi.HWIServer: HWI WAR file not found at
> /lib/hive-hwi-0.9.0.war
>


RE: Filtering

2013-05-16 Thread Peter Marron
>>On Wed, May 15, 2013 at 3:38 AM, Peter Marron 
>> wrote:
…
>I've started doing similar work for the ORC reader.

I guess that I’m glad that I’m not completely alone here.

>>
>>Firstly although that page mentions InputFormat there doesn’t seem to be any 
>>way (that I can find)
>>to perform filter passing to InputFormats and so I gave up on that approach.
>>
>There is. You just need to set  hive.optimize.index.filter to true. See 
>https://issues.apache.org/jira/browse/HIVE-4242.

This is a little confusing. When I look through the code for uses of this
configuration setting, I see that it’s effectively used in two places.
Firstly, it’s used on line 55 of PhysicalOptimizer.java to add an
“IndexWhereResolver”.
Secondly, it’s used on line 766 of OpProcFactory.java to set a filter
expression.

But I don’t see any point where the predicate is passed to the InputFormat
class. I guess that you’re saying that there’s some way for the InputFormat
to retrieve the predicate once it’s been stored, but it’s not clear to me
how I do that.

>>
>>That said, we really need to create a better interface that allows 
>>inputformats to negotiate what parts of the predicate they can process.

Ah, yes, sorry. I really want to be able to remove part of the predicate and
subsume the filtering into the InputFormat class.
There’s little point in me going down this route if I can’t do that.

>>
>>-- Owen
>>

Thanks for prodding me into looking at the code, because now I see a big
problem.

To recap, what I really want to do is to be able to effect filtering in the
case where I do a
select * from table;
query. This is the only query that I’m interested in because it seems to run
without any Map/Reduce overhead (either locally or in the cluster); it’s
effectively just performing some HDFS calls, and that’s what I desire.

What I really want to be able to do is to issue a query like this:
select * from table where 
where I filter out the predicate and do the filtering in the InputFormat, and
then Hive effectively sees the query
select * from table;
and runs it directly (no Map/Reduce) and I’m a happy bunny.
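
The desired flow could be modelled roughly as follows. This is a toy Python
sketch of the negotiation idea, not Hive's InputFormat API: the class, the
method names and the (column, operator, literal) predicate shape are all
invented for illustration. The reader keeps the conjuncts it can evaluate and
hands back the rest; an empty residual is the case where the engine is left
with a plain "select * from table".

```python
# Comparison operators the toy reader knows how to evaluate.
OPS = {
    "=": lambda a, b: a == b,
    "<": lambda a, b: a < b,
    ">": lambda a, b: a > b,
}


class FilteringReader:
    """Toy stand-in for an input format that can filter on some columns."""

    def __init__(self, rows, filterable_columns):
        self.rows = list(rows)
        self.filterable = set(filterable_columns)
        self.accepted = []

    def negotiate(self, conjuncts):
        """Keep the conjuncts this reader can evaluate; return the rest."""
        residual = []
        for conjunct in conjuncts:
            column, op, _literal = conjunct
            if column in self.filterable and op in OPS:
                self.accepted.append(conjunct)
            else:
                residual.append(conjunct)
        return residual

    def read(self):
        """Yield only the rows that satisfy every accepted conjunct."""
        for row in self.rows:
            if all(OPS[op](row[col], lit) for col, op, lit in self.accepted):
                yield row
```

Whatever comes back from negotiate() is what the engine would still have to
evaluate itself.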

Now, as I say, I can’t see any way to effect this in the InputFormat directly.
If I use a storage handler then I am in “non-native table” territory and I
can’t LOAD my tables with data.

However, I have just noticed that line 111 of IndexWhereProcessor.java
seems to suggest that indexes are only ever used when the query is going
to run Map/Reduce. Is this so? So I seem to be in the position where I
can’t use InputFormat, StorageHandler or Indexes. What can I do?

Is there any way to filter the query without having to run Map/Reduce?

Any suggestions welcomed.

Peter Marron
Trillium Software UK Limited

Tel : +44 (0) 118 940 7609
Fax : +44 (0) 118 940 7699
E: peter.mar...@trilliumsoftware.com


Re: Partitioning an external hbase table

2013-05-16 Thread kulkarni.swar...@gmail.com
Unfortunately I don't think there is a clean way to achieve that (at least
not one that I know of). Your option at this point is to run your queries
with a WHERE clause so that, behind the scenes, the predicate gets converted
to a range scan and restricts the amount of data that is being scanned.
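
As a rough illustration of why that helps: HBase keeps rows sorted by row key,
so a predicate on the key can be turned into a bounded scan instead of a pass
over the whole table. The Python snippet below is only a toy model of that
idea (a sorted list standing in for the table; range_scan is a made-up
helper), not the Hive or HBase API.

```python
import bisect


def range_scan(sorted_keys, start_key, stop_key):
    """Return the keys in [start_key, stop_key) without touching the rest."""
    lo = bisect.bisect_left(sorted_keys, start_key)
    hi = bisect.bisect_left(sorted_keys, stop_key)
    return sorted_keys[lo:hi]


# Stand-in for a table of 100,000 rows, sorted by row key.
ROW_KEYS = ["row%06d" % i for i in range(100000)]

# Analogue of: WHERE key >= 'row000100' AND key < 'row000110'
hits = range_scan(ROW_KEYS, "row000100", "row000110")
```

Only the slice between the two bounds is visited; without a key predicate
every row would have to be read.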


On Wed, May 15, 2013 at 8:22 PM, MailingList
wrote:

> Is it possible to define partitions for an external table backed by HBase?
> If so, what is the proper syntax?
>
> I already have an external table backed by HBase and I'm finding that for
> even simple SELECT queries the load isn't getting evenly distributed
> across the map tasks.  Some tasks see as few as a few hundred map input
> records while others receive more than a million.
>
>


-- 
Swarnim