why apache hive 0.10 document not found?

2013-03-05 Thread 周梦想
Since version 0.8.0, the release documentation cannot be found:

http://hive.apache.org/docs/r0.10.0/

Not Found

The requested URL /docs/r0.10.0/ was not found on this server.
--
Apache/2.4.4 (Unix) OpenSSL/1.0.0g Server at hive.apache.org Port 80


Re: Not able to use the timestamp columns

2013-03-05 Thread Morgan Reece
It looks like your row is of this format:

2415022|OKJNECAA|1900-01-02 02:00:21.0|0|1|1|1900|1|1|2|1|1900|1|1|Monday|1900Q1|N|N|Y|2415021|2415020|2414657|2414930|N|N|N|N|N|

Your timestamp is in the third field; however, your table only has a
single column.  Hive reads the fields from left to right, so you aren't
accessing the correct field.  Further, since the first field isn't in
timestamp format, it throws this exception: Timestamp
format must be yyyy-mm-dd hh:mm:ss[.fffffffff]

One quick fix might be to create your table like this:

create external table date_ts
(
field0 string,
field1 string,
d_date timestamp
)
row format delimited fields terminated by '|'
location '/hive/tpcds/date_ts';

Then run your query like this:

hive -e "select d_date from date_ts"

Hope this helps!  :)



On Tue, Mar 5, 2013 at 4:58 PM, Dileep Kumar wrote:

> Hi All,
>
> I am looking for some help in using timestamp column and not sure why I am
> getting this error:
> Here are how I created the tables and how I am querying it
> --hdfs dfs -mkdir /hive/tpcds/date_ts
>
> create external table date_ts
> (
> d_date timestamp
> )
> row format delimited fields terminated by '|'
> location '/hive/tpcds/date_ts';
> hive -e "select * from date_ts"
> Logging initialized using configuration in
> file:/etc/hive/conf.dist/hive-log4j.properties
> Hive history
> file=/tmp/cloudera/hive_job_log_cloudera_201303052251_950655265.txt
> OK
> Failed with exception
> java.io.IOException:java.lang.IllegalArgumentException: Timestamp format
> must be yyyy-mm-dd hh:mm:ss[.fffffffff]
> Time taken: 3.556 seconds
> [cloudera@localhost tmp-work]$ hdfs dfs -cat /hive/tpcds/date_ts/*
> 2415022|OKJNECAA|1900-01-02
> 02:00:21.0|0|1|1|1900|1|1|2|1|1900|1|1|Monday|1900Q1|N|N|Y|2415021|2415020|2414657|2414930|N|N|N|N|N|
>
>


Re: Hive insert into RCFILE issue with timestamp columns

2013-03-05 Thread Mark Grover
Dileep,
Can you use a more contemporary timestamp? Something after Jan 1, 1970
GMT, say Jan 1st, 2013?
Let us know what you see.
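
For example, a made-up single-line test file, keeping the same '|' layout
but with the timestamp field changed to something after the epoch:

2415022|OKJNECAA|2013-01-01 00:00:01.0|0|1|1|1900|1|1|2|1|1900|1|1|Monday|1900Q1|N|N|Y|2415021|2415020|2414657|2414930|N|N|N|N|N|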

On Tue, Mar 5, 2013 at 2:56 PM, Dileep Kumar  wrote:
> --hdfs dfs -mkdir /hive/tpcds/date_ts
>
> create external table date_ts
> (
> d_date timestamp
> )
> row format delimited fields terminated by '|'
> location '/hive/tpcds/date_ts';
>
> [cloudera@localhost tmp-work]$ hive -e "select * from date_ts"
> Logging initialized using configuration in
> file:/etc/hive/conf.dist/hive-log4j.properties
> Hive history
> file=/tmp/cloudera/hive_job_log_cloudera_201303052251_950655265.txt
> OK
> Failed with exception
> java.io.IOException:java.lang.IllegalArgumentException: Timestamp format
> must be yyyy-mm-dd hh:mm:ss[.fffffffff]
> Time taken: 3.556 seconds
> [cloudera@localhost tmp-work]$ hdfs dfs -cat /hive/tpcds/date_ts/*
> 2415022|OKJNECAA|1900-01-02
> 02:00:21.0|0|1|1|1900|1|1|2|1|1900|1|1|Monday|1900Q1|N|N|Y|2415021|2415020|2414657|2414930|N|N|N|N|N|
>
>
>
>
>
> On Mon, Mar 4, 2013 at 6:00 PM, Dileep Kumar 
> wrote:
>>
>> No.
>> Here are the errors:
>> Task with the most failures(4):
>> -
>> Task ID:
>>   task_1361599885844_0013_m_00
>>
>> URL:
>>
>> http://localhost.localdomain:50030/taskdetails.jsp?jobid=job_1361599885844_0013&tipid=task_1361599885844_0013_m_00
>> -
>> Diagnostic Messages for this Task:
>> Error: java.lang.RuntimeException:
>> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while
>> processing row
>> {"d_date_sk":2415022,"d_date_id":"OKJNECAA","d_date":"1969-12-31
>> 19:00:00","d_month_seq":0,"d_week_seq":1,"d_quarter_seq":1,"d_year":1900,"d_dow":1,"d_moy":1,"d_dom":2,"d_qoy":1,"d_fy_year":1900,"d_fy_quarter_seq":1,"d_fy_week_seq":1,"d_day_name":"Monday","d_quarter_name":"1900Q1","d_holiday":"N","d_weekend":"N","d_following_holiday":"Y","d_first_dom":2415021,"d_last_dom":2415020,"d_same_day_ly":2414657,"d_same_day_lq":2414930,"d_current_day":"N","d_current_week":"N","d_current_month":"N","d_current_quarter":"N","d_current_year":"N"}
>> at
>> org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
>> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:399)
>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
>> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:152)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at javax.security.auth.Subject.doAs(Subject.java:396)
>> at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
>> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:147)
>> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime
>> Error while processing row
>> {"d_date_sk":2415022,"d_date_id":"OKJNECAA","d_date":"1969-12-31
>> 19:00:00","d_month_seq":0,"d_week_seq":1,"d_quarter_seq":1,"d_year":1900,"d_dow":1,"d_moy":1,"d_dom":2,"d_qoy":1,"d_fy_year":1900,"d_fy_quarter_seq":1,"d_fy_week_seq":1,"d_day_name":"Monday","d_quarter_name":"1900Q1","d_holiday":"N","d_weekend":"N","d_following_holiday":"Y","d_first_dom":2415021,"d_last_dom":2415020,"d_same_day_ly":2414657,"d_same_day_lq":2414930,"d_current_day":"N","d_current_week":"N","d_current_month":"N","d_current_quarter":"N","d_current_year":"N"}
>> at
>> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:548)
>> at
>> org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
>> ... 8 more
>> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error
>> evaluating d_date
>> at
>> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:80)
>> at
>> org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
>> at
>> org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
>> at
>> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:83)
>> at
>> org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
>> at
>> org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
>> at
>> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:529)
>> ... 9 more
>> Caused by: java.lang.IllegalArgumentException: Timestamp format must be
>> yyyy-mm-dd hh:mm:ss[.fffffffff]
>> at java.sql.Timestamp.valueOf(Timestamp.java:185)
>> at
>> org.apache.hadoop.hive.serde2.lazy.LazyTimestamp.init(LazyTimestamp.java:74)
>> at
>> org.apache.hadoop.hive.serde2.lazy.LazyStruct.uncheckedGetField(LazyStruct.java:219)
>> at
>> org.apache.hadoop.hive.serde2.lazy.LazyStruct.getField(LazyStruct.java:192)
>> at
>> org.apache.hadoop.hive.serde2.lazy.objectinspector.LazySimpleStructObjectInspector.getStructFieldData(LazySimpleStructObjectInspector.java:1

Not able to use the timestamp columns

2013-03-05 Thread Dileep Kumar
Hi All,

I am looking for some help in using timestamp column and not sure why I am
getting this error:
Here are how I created the tables and how I am querying it
--hdfs dfs -mkdir /hive/tpcds/date_ts

create external table date_ts
(
d_date timestamp
)
row format delimited fields terminated by '|'
location '/hive/tpcds/date_ts';
hive -e "select * from date_ts"
Logging initialized using configuration in
file:/etc/hive/conf.dist/hive-log4j.properties
Hive history
file=/tmp/cloudera/hive_job_log_cloudera_201303052251_950655265.txt
OK
Failed with exception
java.io.IOException:java.lang.IllegalArgumentException: Timestamp format
must be yyyy-mm-dd hh:mm:ss[.fffffffff]
Time taken: 3.556 seconds
[cloudera@localhost tmp-work]$ hdfs dfs -cat /hive/tpcds/date_ts/*
2415022|OKJNECAA|1900-01-02
02:00:21.0|0|1|1|1900|1|1|2|1|1900|1|1|Monday|1900Q1|N|N|Y|2415021|2415020|2414657|2414930|N|N|N|N|N|


Re: Hive insert into RCFILE issue with timestamp columns

2013-03-05 Thread Dileep Kumar
--hdfs dfs -mkdir /hive/tpcds/date_ts

create external table date_ts
(
d_date timestamp
)
row format delimited fields terminated by '|'
location '/hive/tpcds/date_ts';

[cloudera@localhost tmp-work]$ hive -e "select * from date_ts"
Logging initialized using configuration in
file:/etc/hive/conf.dist/hive-log4j.properties
Hive history
file=/tmp/cloudera/hive_job_log_cloudera_201303052251_950655265.txt
OK
Failed with exception
java.io.IOException:java.lang.IllegalArgumentException: Timestamp format
must be yyyy-mm-dd hh:mm:ss[.fffffffff]
Time taken: 3.556 seconds
[cloudera@localhost tmp-work]$ hdfs dfs -cat /hive/tpcds/date_ts/*
2415022|OKJNECAA|1900-01-02
02:00:21.0|0|1|1|1900|1|1|2|1|1900|1|1|Monday|1900Q1|N|N|Y|2415021|2415020|2414657|2414930|N|N|N|N|N|





On Mon, Mar 4, 2013 at 6:00 PM, Dileep Kumar wrote:

> No.
> Here are the errors:
> Task with the most failures(4):
> -
> Task ID:
>   task_1361599885844_0013_m_00
>
> URL:
>
> http://localhost.localdomain:50030/taskdetails.jsp?jobid=job_1361599885844_0013&tipid=task_1361599885844_0013_m_00
> -
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException:
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while
> processing row
> {"d_date_sk":2415022,"d_date_id":"OKJNECAA","d_date":"1969-12-31
> 19:00:00","d_month_seq":0,"d_week_seq":1,"d_quarter_seq":1,"d_year":1900,"d_dow":1,"d_moy":1,"d_dom":2,"d_qoy":1,"d_fy_year":1900,"d_fy_quarter_seq":1,"d_fy_week_seq":1,"d_day_name":"Monday","d_quarter_name":"1900Q1","d_holiday":"N","d_weekend":"N","d_following_holiday":"Y","d_first_dom":2415021,"d_last_dom":2415020,"d_same_day_ly":2414657,"d_same_day_lq":2414930,"d_current_day":"N","d_current_week":"N","d_current_month":"N","d_current_quarter":"N","d_current_year":"N"}
> at
> org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:399)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:152)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:147)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime
> Error while processing row
> {"d_date_sk":2415022,"d_date_id":"OKJNECAA","d_date":"1969-12-31
> 19:00:00","d_month_seq":0,"d_week_seq":1,"d_quarter_seq":1,"d_year":1900,"d_dow":1,"d_moy":1,"d_dom":2,"d_qoy":1,"d_fy_year":1900,"d_fy_quarter_seq":1,"d_fy_week_seq":1,"d_day_name":"Monday","d_quarter_name":"1900Q1","d_holiday":"N","d_weekend":"N","d_following_holiday":"Y","d_first_dom":2415021,"d_last_dom":2415020,"d_same_day_ly":2414657,"d_same_day_lq":2414930,"d_current_day":"N","d_current_week":"N","d_current_month":"N","d_current_quarter":"N","d_current_year":"N"}
> at
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:548)
> at
> org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
> ... 8 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error
> evaluating d_date
> at
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:80)
> at
> org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
> at
> org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
> at
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:83)
> at
> org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
> at
> org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
> at
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:529)
> ... 9 more
> Caused by: java.lang.IllegalArgumentException: Timestamp format must be
> yyyy-mm-dd hh:mm:ss[.fffffffff]
> at java.sql.Timestamp.valueOf(Timestamp.java:185)
> at
> org.apache.hadoop.hive.serde2.lazy.LazyTimestamp.init(LazyTimestamp.java:74)
> at
> org.apache.hadoop.hive.serde2.lazy.LazyStruct.uncheckedGetField(LazyStruct.java:219)
> at
> org.apache.hadoop.hive.serde2.lazy.LazyStruct.getField(LazyStruct.java:192)
> at
> org.apache.hadoop.hive.serde2.lazy.objectinspector.LazySimpleStructObjectInspector.getStructFieldData(LazySimpleStructObjectInspector.java:188)
> at
> org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.evaluate(ExprNodeColumnEvaluator.java:98)
> at
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:76)
> ... 15 more
>
>
> FAILED: Execution Error, return code 2 from
> org.apache.hadoop

Re: Hive sample test

2013-03-05 Thread Dean Wampler
Nice, yeah, that would do it.

On Tue, Mar 5, 2013 at 1:26 PM, Mark Grover wrote:

> I typically change my query to query from a limited version of the whole
> table.
>
> Change
>
> select really_expensive_select_clause
> from
> really_big_table
> where
> something=something
> group by something=something
>
> to
>
> select really_expensive_select_clause
> from
> (
> select
> *
> from
> really_big_table
> limit 100
> )t
> where
> something=something
> group by something=something
>
>
> On Tue, Mar 5, 2013 at 10:57 AM, Dean Wampler
>  wrote:
> > Unfortunately, it will still go through the whole thing, then just limit
> the
> > output. However, there's a flag that I think only works in more recent
> Hive
> > releases:
> >
> > set hive.limit.optimize.enable=true
> >
> > This is supposed to apply limiting earlier in the data stream, so it will
> > give different results than limiting just the output.
> >
> > Like Chuck said, you might consider sampling, but unless your table is
> > organized into buckets, you'll at least scan the whole table, but maybe
> not
> > do all computation over it ??
> >
> > Also, if you have a small sample data set:
> >
> > set hive.exec.mode.local.auto=true
> >
> > will cause Hive to bypass the Job and Task Trackers, calling APIs
> directly,
> > when it can do the whole thing in a single process. Not "lightning fast",
> > but faster.
> >
> > dean
> >
> > On Tue, Mar 5, 2013 at 12:48 PM, Joey D'Antoni 
> wrote:
> >>
> >> Just add a limit 1 to the end of your query.
> >>
> >>
> >>
> >>
> >> On Mar 5, 2013, at 1:45 PM, Kyle B  wrote:
> >>
> >> Hello,
> >>
> >> I was wondering if there is a way to quick-verify a Hive query before it
> >> is run against a big dataset? The tables I am querying against have
> millions
> >> of records, and I'd like to verify my Hive query before I run it
> against all
> >> records.
> >>
> >> Is there a way to test the query against a small subset of the data,
> >> without going into full MapReduce? As silly as this sounds, is there a
> way
> >> to MapReduce without the overhead of MapReduce? That way I can check my
> >> query is doing what I want before I run it against all records.
> >>
> >> Thanks,
> >>
> >> -Kyle
> >
> >
> >
> >
> > --
> > Dean Wampler, Ph.D.
> > thinkbiganalytics.com
> > +1-312-339-1330
> >
>



-- 
*Dean Wampler, Ph.D.*
thinkbiganalytics.com
+1-312-339-1330


Re: Hive sample test

2013-03-05 Thread Mark Grover
I typically change my query to query from a limited version of the whole table.

Change

select really_expensive_select_clause
from
really_big_table
where
something=something
group by something=something

to

select really_expensive_select_clause
from
(
select
*
from
really_big_table
limit 100
)t
where
something=something
group by something=something


On Tue, Mar 5, 2013 at 10:57 AM, Dean Wampler
 wrote:
> Unfortunately, it will still go through the whole thing, then just limit the
> output. However, there's a flag that I think only works in more recent Hive
> releases:
>
> set hive.limit.optimize.enable=true
>
> This is supposed to apply limiting earlier in the data stream, so it will
> give different results than limiting just the output.
>
> Like Chuck said, you might consider sampling, but unless your table is
> organized into buckets, you'll at least scan the whole table, but maybe not
> do all computation over it ??
>
> Also, if you have a small sample data set:
>
> set hive.exec.mode.local.auto=true
>
> will cause Hive to bypass the Job and Task Trackers, calling APIs directly,
> when it can do the whole thing in a single process. Not "lightning fast",
> but faster.
>
> dean
>
> On Tue, Mar 5, 2013 at 12:48 PM, Joey D'Antoni  wrote:
>>
>> Just add a limit 1 to the end of your query.
>>
>>
>>
>>
>> On Mar 5, 2013, at 1:45 PM, Kyle B  wrote:
>>
>> Hello,
>>
>> I was wondering if there is a way to quick-verify a Hive query before it
>> is run against a big dataset? The tables I am querying against have millions
>> of records, and I'd like to verify my Hive query before I run it against all
>> records.
>>
>> Is there a way to test the query against a small subset of the data,
>> without going into full MapReduce? As silly as this sounds, is there a way
>> to MapReduce without the overhead of MapReduce? That way I can check my
>> query is doing what I want before I run it against all records.
>>
>> Thanks,
>>
>> -Kyle
>
>
>
>
> --
> Dean Wampler, Ph.D.
> thinkbiganalytics.com
> +1-312-339-1330
>


Re: Hive sample test

2013-03-05 Thread Dean Wampler
Unfortunately, it will still go through the whole thing, then just limit
the output. However, there's a flag that I think only works in more recent
Hive releases:

set hive.limit.optimize.enable=true

This is supposed to apply limiting earlier in the data stream, so it will
give different results than limiting just the output.

Like Chuck said, you might consider sampling, but unless your table is
organized into buckets, you'll at least scan the whole table, but maybe not
do all computation over it ??

Also, if you have a small sample data set:

set hive.exec.mode.local.auto=true

will cause Hive to bypass the Job and Task Trackers, calling APIs directly,
when it can do the whole thing in a single process. Not "lightning fast",
but faster.
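
For example, a rough sketch of trying both settings together in one CLI
session (the table and column names here are just placeholders):

set hive.limit.optimize.enable=true;
set hive.exec.mode.local.auto=true;
select some_col, count(*) from some_big_table where some_col = 'x' group by some_col limit 10;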

dean

On Tue, Mar 5, 2013 at 12:48 PM, Joey D'Antoni  wrote:

> Just add a limit 1 to the end of your query.
>
>
>
>
> On Mar 5, 2013, at 1:45 PM, Kyle B  wrote:
>
> Hello,
>
> I was wondering if there is a way to quick-verify a Hive query before it
> is run against a big dataset? The tables I am querying against have
> millions of records, and I'd like to verify my Hive query before I run it
> against all records.
>
> Is there a way to test the query against a small subset of the data,
> without going into full MapReduce? As silly as this sounds, is there a way
> to MapReduce without the overhead of MapReduce? That way I can check my
> query is doing what I want before I run it against all records.
>
> Thanks,
>
> -Kyle
>
>


-- 
*Dean Wampler, Ph.D.*
thinkbiganalytics.com
+1-312-339-1330


RE: Hive sample test

2013-03-05 Thread Connell, Chuck
Using the Hive sampling feature would also help. This is exactly what that 
feature is designed for.
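
For example (just a sketch; the table and column names are made up), block
sampling on a recent Hive release reads only a fraction of the input:

select col1, count(*) from really_big_table tablesample(1 percent) group by col1;

and if the table is bucketed, bucket sampling works as well:

select * from really_big_table tablesample(bucket 1 out of 32 on user_id);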

Chuck


From: Kyle B [mailto:kbi...@gmail.com]
Sent: Tuesday, March 05, 2013 1:45 PM
To: user@hive.apache.org
Subject: Hive sample test


Hello,

I was wondering if there is a way to quick-verify a Hive query before it is run 
against a big dataset? The tables I am querying against have millions of 
records, and I'd like to verify my Hive query before I run it against all 
records.

Is there a way to test the query against a small subset of the data, without 
going into full MapReduce? As silly as this sounds, is there a way to MapReduce 
without the overhead of MapReduce? That way I can check my query is doing what 
I want before I run it against all records.

Thanks,

-Kyle


Re: Hive sample test

2013-03-05 Thread Joey D'Antoni
Just add a limit 1 to the end of your query.




On Mar 5, 2013, at 1:45 PM, Kyle B  wrote:

> Hello,
> 
> I was wondering if there is a way to quick-verify a Hive query before it is 
> run against a big dataset? The tables I am querying against have millions of 
> records, and I'd like to verify my Hive query before I run it against all 
> records.
> 
> Is there a way to test the query against a small subset of the data, without 
> going into full MapReduce? As silly as this sounds, is there a way to 
> MapReduce without the overhead of MapReduce? That way I can check my query is 
> doing what I want before I run it against all records.
> 
> Thanks,
> 
> -Kyle


Hive sample test

2013-03-05 Thread Kyle B
Hello,

I was wondering if there is a way to quick-verify a Hive query before it is
run against a big dataset? The tables I am querying against have millions
of records, and I'd like to verify my Hive query before I run it against
all records.

Is there a way to test the query against a small subset of the data,
without going into full MapReduce? As silly as this sounds, is there a way
to MapReduce without the overhead of MapReduce? That way I can check my
query is doing what I want before I run it against all records.

Thanks,

-Kyle


Re: show tables in bin does not display the tables

2013-03-05 Thread Mark Grover
Sai,
This is because you are using the default embedded derby database as
metastore. When using the embedded derby metastore, the metadata is
stored in a relative location.

See the value of javax.jdo.option.ConnectionURL. By default, its value
is jdbc:derby:;databaseName=metastore_db;create=true
metastore_db is the directory that gets created to store the metadata.
If you put an absolute path there instead, e.g.
jdbc:derby:;databaseName=/a/path/that/exists/metastore_db;create=true
that would ensure that the same metadata is shared.
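
For example, this is roughly what is happening on your machine: each working
directory from which you start hive gets its own metastore_db, e.g.

cd ~/work/hive-0.10.0-bin/bin && ./hive -e 'show tables'   # uses ./bin/metastore_db
cd ~/work/hive-0.10.0-bin && bin/hive -e 'show tables'     # uses ./metastore_db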

I, however, would recommend moving away from the embedded Derby metastore
and using MySQL or PostgreSQL for the metastore instead. Googling should
give you some nice articles on how to do that.

Mark


On Tue, Mar 5, 2013 at 3:48 AM, Sai Sai  wrote:
> Hello
>
> I have noticed that when I execute the following command from the hive shell in
> different folders it behaves in different ways, and was wondering if this is right:
>
> show tables;
>
> from the bin folder under my hive install folder it just shows tab_name:
> 
> myUser@ubuntu:~/work/hive-0.10.0-bin/bin$ ./hive
>
> hive> show tables;
>
> OK
> tab_name
> Time taken: 5.268 seconds
> 
>
> But when I execute the same command from my install folder:
>
> 
> myUser@ubuntu:~/work/hive-0.10.0-bin/bin$ cd ..
>
> hive> show tables;
>
> OK
> tab_name
> employees
> sample_pages
> Time taken: 13.547 seconds
> 
>
> Please let me know.
> Thanks
> Sai


Re: Location of external table in hdfs

2013-03-05 Thread bharath vissapragada
When you create an external table, the original data ('/tmp/states' in
this case) is NOT copied to the warehouse folder (or in fact any other
folder for that matter). So you can find it in '/tmp/states' itself.
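
If you want to double-check from the Hive CLI, something like this should
print a Location: line pointing at /tmp/states on HDFS:

describe formatted states;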

On Tue, Mar 5, 2013 at 10:26 PM, Sai Sai  wrote:
> I have created an external table like below and wondering where (folder) in
> hdfs i can find this:
>
> CREATE EXTERNAL TABLE states(abbreviation string, full_name string) ROW
> FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION '/tmp/states' ;
>
> Any help is really appreciated.
> Thanks
> Sai


Re: Location of external table in hdfs

2013-03-05 Thread Sai Sai
Thanks, I figured it out; it is in /tmp/states.
Thanks for your attention.





 From: Sai Sai 
To: "user@hive.apache.org"  
Sent: Tuesday, 5 March 2013 8:56 AM
Subject: Re: Location of external table in hdfs
 

I have created an external table like below and wondering where (folder) in 
hdfs i can find this:

CREATE EXTERNAL TABLE states(abbreviation string, full_name string) ROW FORMAT 
DELIMITED FIELDS TERMINATED BY '\t' LOCATION '/tmp/states' ;

Any help is really appreciated.

Thanks
Sai

Re: Location of external table in hdfs

2013-03-05 Thread Dean Wampler
/tmp/states in HDFS.

On Tue, Mar 5, 2013 at 10:56 AM, Sai Sai  wrote:

> I have created an external table like below and wondering where (folder)
> in hdfs i can find this:
>
> CREATE EXTERNAL TABLE states(abbreviation string, full_name string) ROW
> FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION '/tmp/states' ;
>
> Any help is really appreciated.
> Thanks
> Sai
>



-- 
*Dean Wampler, Ph.D.*
thinkbiganalytics.com
+1-312-339-1330


Re: Location of external table in hdfs

2013-03-05 Thread Sai Sai
I have created an external table like below and wondering where (folder) in 
hdfs i can find this:

CREATE EXTERNAL TABLE states(abbreviation string, full_name string) ROW FORMAT 
DELIMITED FIELDS TERMINATED BY '\t' LOCATION '/tmp/states' ;

Any help is really appreciated.

Thanks
Sai


Re: Error while exporting table data from hive to Oracle through Sqoop

2013-03-05 Thread Dean Wampler
From the exceptions near the bottom, it looks like you're inserting data
that doesn't have unique keys, so it could be a data problem.
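
One way to check that hypothesis before exporting (a sketch; it assumes the
Hive table is called bttn and that BTTN_ID is the primary key on the Oracle
side, both of which are guesses) is to look for duplicate key values:

select bttn_id, count(*) from bttn group by bttn_id having count(*) > 1 limit 10;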

On Tue, Mar 5, 2013 at 7:54 AM, Ajit Kumar Shreevastava <
ajit.shreevast...@hcl.com> wrote:

>  Hi All,
>
>
> I am facing following issue while exporting table from hive to Oracle.
> Importing table from Oracle to Hive and HDFS is working fine. Please let me
> know where I lag. I am pasting my screen output here.
>
>
>
> [hadoop@NHCLT-PC44-2 sqoop-oper]$ sqoop export --connect
> jdbc:oracle:thin:@10.99.42.11:1521/clouddb --username HDFSUSER  --table
> BTTN_BKP --export-dir  /home/hadoop/user/hive/warehouse/bttn  -P --verbose
> -m 1  --input-fields-terminated-by '\001'
>
> Warning: /usr/lib/hbase does not exist! HBase imports will fail.
>
> Please set $HBASE_HOME to the root of your HBase installation.
>
> 13/03/05 19:20:11 DEBUG tool.BaseSqoopTool: Enabled debug logging.
>
> Enter password:
>
> 13/03/05 19:20:16 DEBUG sqoop.ConnFactory: Loaded manager factory:
> com.cloudera.sqoop.manager.DefaultManagerFactory
>
> 13/03/05 19:20:16 DEBUG sqoop.ConnFactory: Trying ManagerFactory:
> com.cloudera.sqoop.manager.DefaultManagerFactory
>
> 13/03/05 19:20:16 DEBUG manager.DefaultManagerFactory: Trying with scheme:
> jdbc:oracle:thin:@10.99.42.11
>
> 13/03/05 19:20:16 DEBUG manager.OracleManager$ConnCache: Instantiated new
> connection cache.
>
> 13/03/05 19:20:16 INFO manager.SqlManager: Using default fetchSize of 1000
> 
>
> 13/03/05 19:20:16 DEBUG sqoop.ConnFactory: Instantiated ConnManager
> org.apache.sqoop.manager.OracleManager@2abe0e27
>
> 13/03/05 19:20:16 INFO tool.CodeGenTool: Beginning code generation
>
> 13/03/05 19:20:16 DEBUG manager.OracleManager: Using column names query:
> SELECT t.* FROM BTTN_BKP t WHERE 1=0
>
> 13/03/05 19:20:16 DEBUG manager.OracleManager: Creating a new connection
> for jdbc:oracle:thin:@10.99.42.11:1521/clouddb, using username: HDFSUSER
>
> 13/03/05 19:20:16 DEBUG manager.OracleManager: No connection paramenters
> specified. Using regular API for making connection.
>
> 13/03/05 19:20:16 INFO manager.OracleManager: Time zone has been set to GMT
> 
>
> 13/03/05 19:20:16 DEBUG manager.SqlManager: Using fetchSize for next
> query: 1000
>
> 13/03/05 19:20:16 INFO manager.SqlManager: Executing SQL statement: SELECT
> t.* FROM BTTN_BKP t WHERE 1=0
>
> 13/03/05 19:20:16 DEBUG manager.OracleManager$ConnCache: Caching released
> connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb/HDFSUSER
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter: selected columns:
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   BTTN_ID
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   DATA_INST_ID
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   SCR_ID
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   BTTN_NU
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   CAT
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   WDTH
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   HGHT
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   KEY_SCAN
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   KEY_SHFT
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   FRGND_CPTN_COLR
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   FRGND_CPTN_COLR_PRSD
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   BKGD_CPTN_COLR
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   BKGD_CPTN_COLR_PRSD
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   BLM_FL
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   LCLZ_FL
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   MENU_ITEM_NU
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   BTTN_ASGN_LVL_ID
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   ON_ATVT
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   ON_CLIK
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   ENBL_FL
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   BLM_SET_ID
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   BTTN_ASGN_LVL_NAME
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   MKT_ID
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   CRTE_TS
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   CRTE_USER_ID
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   UPDT_TS
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   UPDT_USER_ID
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   DEL_TS
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   DEL_USER_ID
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   DLTD_FL
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   MENU_ITEM_NA
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   PRD_CD
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   BLM_SET_NA
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   SOUND_FILE_ID
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   IS_DYNMC_BTTN
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   FRGND_CPTN_COLR_ID
>
> 13/03/05 19:20:16 DEBUG orm.ClassWriter:   FRGND_CPTN_COLR_PRSD_ID
>
> 13/03/05 19:20:16 DEBU

Error while exporting table data from hive to Oracle through Sqoop

2013-03-05 Thread Ajit Kumar Shreevastava
Hi All,

I am facing following issue while exporting table from hive to Oracle. 
Importing table from Oracle to Hive and HDFS is working fine. Please let me 
know where I lag. I am pasting my screen output here.


[hadoop@NHCLT-PC44-2 sqoop-oper]$ sqoop export --connect 
jdbc:oracle:thin:@10.99.42.11:1521/clouddb --username HDFSUSER  --table 
BTTN_BKP --export-dir  /home/hadoop/user/hive/warehouse/bttn  -P --verbose  -m 
1  --input-fields-terminated-by '\001'
Warning: /usr/lib/hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
13/03/05 19:20:11 DEBUG tool.BaseSqoopTool: Enabled debug logging.
Enter password:
13/03/05 19:20:16 DEBUG sqoop.ConnFactory: Loaded manager factory: 
com.cloudera.sqoop.manager.DefaultManagerFactory
13/03/05 19:20:16 DEBUG sqoop.ConnFactory: Trying ManagerFactory: 
com.cloudera.sqoop.manager.DefaultManagerFactory
13/03/05 19:20:16 DEBUG manager.DefaultManagerFactory: Trying with scheme: 
jdbc:oracle:thin:@10.99.42.11
13/03/05 19:20:16 DEBUG manager.OracleManager$ConnCache: Instantiated new 
connection cache.
13/03/05 19:20:16 INFO manager.SqlManager: Using default fetchSize of 1000
13/03/05 19:20:16 DEBUG sqoop.ConnFactory: Instantiated ConnManager 
org.apache.sqoop.manager.OracleManager@2abe0e27
13/03/05 19:20:16 INFO tool.CodeGenTool: Beginning code generation
13/03/05 19:20:16 DEBUG manager.OracleManager: Using column names query: SELECT 
t.* FROM BTTN_BKP t WHERE 1=0
13/03/05 19:20:16 DEBUG manager.OracleManager: Creating a new connection for 
jdbc:oracle:thin:@10.99.42.11:1521/clouddb, using username: HDFSUSER
13/03/05 19:20:16 DEBUG manager.OracleManager: No connection paramenters 
specified. Using regular API for making connection.
13/03/05 19:20:16 INFO manager.OracleManager: Time zone has been set to GMT
13/03/05 19:20:16 DEBUG manager.SqlManager: Using fetchSize for next query: 1000
13/03/05 19:20:16 INFO manager.SqlManager: Executing SQL statement: SELECT t.* 
FROM BTTN_BKP t WHERE 1=0
13/03/05 19:20:16 DEBUG manager.OracleManager$ConnCache: Caching released 
connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb/HDFSUSER
13/03/05 19:20:16 DEBUG orm.ClassWriter: selected columns:
13/03/05 19:20:16 DEBUG orm.ClassWriter:   BTTN_ID
13/03/05 19:20:16 DEBUG orm.ClassWriter:   DATA_INST_ID
13/03/05 19:20:16 DEBUG orm.ClassWriter:   SCR_ID
13/03/05 19:20:16 DEBUG orm.ClassWriter:   BTTN_NU
13/03/05 19:20:16 DEBUG orm.ClassWriter:   CAT
13/03/05 19:20:16 DEBUG orm.ClassWriter:   WDTH
13/03/05 19:20:16 DEBUG orm.ClassWriter:   HGHT
13/03/05 19:20:16 DEBUG orm.ClassWriter:   KEY_SCAN
13/03/05 19:20:16 DEBUG orm.ClassWriter:   KEY_SHFT
13/03/05 19:20:16 DEBUG orm.ClassWriter:   FRGND_CPTN_COLR
13/03/05 19:20:16 DEBUG orm.ClassWriter:   FRGND_CPTN_COLR_PRSD
13/03/05 19:20:16 DEBUG orm.ClassWriter:   BKGD_CPTN_COLR
13/03/05 19:20:16 DEBUG orm.ClassWriter:   BKGD_CPTN_COLR_PRSD
13/03/05 19:20:16 DEBUG orm.ClassWriter:   BLM_FL
13/03/05 19:20:16 DEBUG orm.ClassWriter:   LCLZ_FL
13/03/05 19:20:16 DEBUG orm.ClassWriter:   MENU_ITEM_NU
13/03/05 19:20:16 DEBUG orm.ClassWriter:   BTTN_ASGN_LVL_ID
13/03/05 19:20:16 DEBUG orm.ClassWriter:   ON_ATVT
13/03/05 19:20:16 DEBUG orm.ClassWriter:   ON_CLIK
13/03/05 19:20:16 DEBUG orm.ClassWriter:   ENBL_FL
13/03/05 19:20:16 DEBUG orm.ClassWriter:   BLM_SET_ID
13/03/05 19:20:16 DEBUG orm.ClassWriter:   BTTN_ASGN_LVL_NAME
13/03/05 19:20:16 DEBUG orm.ClassWriter:   MKT_ID
13/03/05 19:20:16 DEBUG orm.ClassWriter:   CRTE_TS
13/03/05 19:20:16 DEBUG orm.ClassWriter:   CRTE_USER_ID
13/03/05 19:20:16 DEBUG orm.ClassWriter:   UPDT_TS
13/03/05 19:20:16 DEBUG orm.ClassWriter:   UPDT_USER_ID
13/03/05 19:20:16 DEBUG orm.ClassWriter:   DEL_TS
13/03/05 19:20:16 DEBUG orm.ClassWriter:   DEL_USER_ID
13/03/05 19:20:16 DEBUG orm.ClassWriter:   DLTD_FL
13/03/05 19:20:16 DEBUG orm.ClassWriter:   MENU_ITEM_NA
13/03/05 19:20:16 DEBUG orm.ClassWriter:   PRD_CD
13/03/05 19:20:16 DEBUG orm.ClassWriter:   BLM_SET_NA
13/03/05 19:20:16 DEBUG orm.ClassWriter:   SOUND_FILE_ID
13/03/05 19:20:16 DEBUG orm.ClassWriter:   IS_DYNMC_BTTN
13/03/05 19:20:16 DEBUG orm.ClassWriter:   FRGND_CPTN_COLR_ID
13/03/05 19:20:16 DEBUG orm.ClassWriter:   FRGND_CPTN_COLR_PRSD_ID
13/03/05 19:20:16 DEBUG orm.ClassWriter:   BKGD_CPTN_COLR_ID
13/03/05 19:20:16 DEBUG orm.ClassWriter:   BKGD_CPTN_COLR_PRSD_ID
13/03/05 19:20:16 DEBUG orm.ClassWriter: Writing source file: 
/tmp/sqoop-hadoop/compile/8d22103beede09e961b64d0ff8e61e7e/BTTN_BKP.java
13/03/05 19:20:16 DEBUG orm.ClassWriter: Table name: BTTN_BKP
13/03/05 19:20:16 DEBUG orm.ClassWriter: Columns: BTTN_ID:2, DATA_INST_ID:2, 
SCR_ID:2, BTTN_NU:2, CAT:2, WDTH:2, HGHT:2, KEY_SCAN:2, KEY_SHFT:2, 
FRGND_CPTN_COLR:12, FRGND_CPTN_COLR_PRSD:12, BKGD_CPTN_COLR:12, 
BKGD_CPTN_COLR_PRSD:12, BLM_FL:2, LCLZ_FL:2, MENU_ITEM_NU:2, 
BTTN_ASGN_LVL_ID:2, ON_ATVT:2, ON_CLIK:2, ENBL_FL:2, BLM_SET_ID:2, 
BTTN_ASGN_LVL_NAME:12, MKT_ID:2, CRTE_TS:93, CRTE_USER_ID:12, UPDT_TS:93, 
UPDT_USER_ID:12, DEL_TS:93, DEL_USE

Re: Done SemanticException Line 1:17 issue

2013-03-05 Thread Sai Sai


Thanks for your help Nitin.
I have restarted my VM and tried again and it appears to work.

Thanks again.
Sai



 From: Sai Sai 
To: "user@hive.apache.org"  
Sent: Tuesday, 5 March 2013 4:42 AM
Subject: Re: SemanticException Line 1:17 issue
 

Thanks for your help Nitin, here is what it displays:

satish@ubuntu:~/work/hadoop-1.0.4/bin$ $HADOOP_HOME/bin/hadoop dfs -ls /tmp/


Warning: $HADOOP_HOME is deprecated.
Found 3 items

drwxr-xr-x   - satish supergroup  0 2013-03-05 04:12 /tmp/hive-satish
-rw-r--r--   1 satish supergroup    654 2013-03-04 02:41 /tmp/states.txt
drwxr-xr-x   - satish supergroup  0 2013-02-16 00:46 
/tmp/temp-1850940621

**
I have done a search for the file states.txt and it shows up in 3 places; 2 of
them refer to
proc/2693/cwd

but none of them refer to the /tmp folder.

Please let me know if you have any other suggestions.
In the meantime i will try with the [LOCAL] file and let you know.

Thanks
Sai




 From: Nitin Pawar 
To: user@hive.apache.org; Sai Sai  
Sent: Tuesday, 5 March 2013 4:24 AM
Subject: Re: SemanticException Line 1:17 issue
 

it exists but where? on your hdfs or local linux filesystem ?  so if you are 
checking the file with ls -l /tmp/ then add word local

ls can you provide output of $HADOOP_HOME/bin/hadoop dfs -ls /tmp/ 


LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename
If the keyword LOCAL is specified, then:
* the load command will look for filepath in the local file system. If 
a relative path is specified - it will be interpreted relative to the current 
directory of the user



On Tue, Mar 5, 2013 at 5:48 PM, Sai Sai  wrote:

Yes Nitin it exists... but still getting the same issue.
>
>
>
>
>
> From: Nitin Pawar 
>To: user@hive.apache.org; Sai Sai  
>Sent: Tuesday, 5 March 2013 4:14 AM
>Subject: Re: SemanticException Line 1:17 issue
> 
>
>
>this file /tmp/o_small.tsv is on your local filesystem or hdfs? 
>
>
>
>On Tue, Mar 5, 2013 at 5:39 PM, Sai Sai  wrote:
>
>Hello
>>
>>
>>I have been stuck on this issue for quite some time and was wondering if 
>>anyone sees any problem with this that i am not seeing:
>>
>>
>>I have verified the file exists here and have also manually pasted the file 
>>into the tmp folder but still running into the same issue.
>>
>>
>>I am also wondering which folder this maps to in my local drive:
>>hdfs://ubuntu:9000/
>>
>>
>>***
>>
>>
>>hive> LOAD DATA INPATH '/tmp/o_small.tsv' OVERWRITE INTO TABLE odata ;
>>FAILED: SemanticException Line 1:17 Invalid path ''/tmp/o_small.tsv'': No 
>>files matching path hdfs://ubuntu:9000/tmp/o_small.tsv
>>
>>
>>***
>>I have verified the file exists here and have also manually pasted the file 
>>here but still running into the same issue.
>>Please let me know if u have any suggestions will be really appreciated.
>>ThanksSai
>>
>
>
>
>-- 
>Nitin Pawar
>
>
>


-- 
Nitin Pawar

Re: SemanticException Line 1:17 issue

2013-03-05 Thread Nitin Pawar
this file /tmp/o_small.tsv looks like it exists on your local filesystem

try load data local inpath

it should work
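
i.e. something like this, keeping your table and path (assuming the file
really is sitting on the local Linux filesystem):

LOAD DATA LOCAL INPATH '/tmp/o_small.tsv' OVERWRITE INTO TABLE odata;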


On Tue, Mar 5, 2013 at 6:12 PM, Sai Sai  wrote:

> Thanks for your help Nitin, here is what it displays:
>
> satish@ubuntu:~/work/hadoop-1.0.4/bin$ $HADOOP_HOME/bin/hadoop dfs -ls
> /tmp/
>
> Warning: $HADOOP_HOME is deprecated.
> Found 3 items
>
> drwxr-xr-x   - satish supergroup  0 2013-03-05 04:12
> /tmp/hive-satish
> -rw-r--r--   1 satish supergroup654 2013-03-04 02:41
> /tmp/states.txt
> drwxr-xr-x   - satish supergroup  0 2013-02-16 00:46
> /tmp/temp-1850940621
>
> **
> I have done a search for the file states.txt and it refers to 3 places 2
> of em refer to
> proc/2693/cwd
>
> but none of them refer to tmp folder.
>
> Please let me know if you have any other suggestions.
> In the meantime i will try with the [LOCAL] file and let you know.
> Thanks
> Sai
>
>   --
> *From:* Nitin Pawar 
> *To:* user@hive.apache.org; Sai Sai 
> *Sent:* Tuesday, 5 March 2013 4:24 AM
>
> *Subject:* Re: SemanticException Line 1:17 issue
>
> it exists but where? on your hdfs or local linux filesystem ?  so if you
> are checking the file with ls -l /tmp/ then add word local
>
> ls can you provide output of $HADOOP_HOME/bin/hadoop dfs -ls /tmp/
>
> LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename
>
> If the keyword LOCAL is specified, then:
>
>- the load command will look for *filepath* in the local file system. If a 
> relative path is specified - it will be interpreted relative to the current 
> directory of the user
>
>
>
> On Tue, Mar 5, 2013 at 5:48 PM, Sai Sai  wrote:
>
> Yes Nitin it exists... but still getting the same issue.
>
>--
> *From:* Nitin Pawar 
> *To:* user@hive.apache.org; Sai Sai 
> *Sent:* Tuesday, 5 March 2013 4:14 AM
> *Subject:* Re: SemanticException Line 1:17 issue
>
> this file /tmp/o_small.tsv is on your local filesystem or hdfs?
>
>
> On Tue, Mar 5, 2013 at 5:39 PM, Sai Sai  wrote:
>
> Hello
>
> I have been stuck on this issue for quite some time and was wondering if
> anyone sees any problem with this that i am not seeing:
>
> I have verified the file exists here and have also manually pasted the
> file into the tmp folder but still running into the same issue.
>
> I am also wondering which folder this maps to in my local drive:
> hdfs://ubuntu:9000/
>
> ***
>
> hive> LOAD DATA INPATH '/tmp/o_small.tsv' OVERWRITE INTO TABLE odata ;
> FAILED: SemanticException Line 1:17 Invalid path ''/tmp/o_small.tsv'': No
> files matching path hdfs://ubuntu:9000/tmp/o_small.tsv
>
> ***
> I have verified the file exists here and have also manually pasted the
> file here but still running into the same issue.
> Please let me know if u have any suggestions will be really appreciated.
> Thanks
> Sai
>
>
>
>
> --
> Nitin Pawar
>
>
>
>
>
> --
> Nitin Pawar
>
>
>


-- 
Nitin Pawar


Re: SemanticException Line 1:17 issue

2013-03-05 Thread Sai Sai
Thanks for your help Nitin, here is what it displays:

satish@ubuntu:~/work/hadoop-1.0.4/bin$ $HADOOP_HOME/bin/hadoop dfs -ls /tmp/


Warning: $HADOOP_HOME is deprecated.
Found 3 items

drwxr-xr-x   - satish supergroup  0 2013-03-05 04:12 /tmp/hive-satish
-rw-r--r--   1 satish supergroup    654 2013-03-04 02:41 /tmp/states.txt
drwxr-xr-x   - satish supergroup  0 2013-02-16 00:46 
/tmp/temp-1850940621

**
I have done a search for the file states.txt and it shows up in 3 places; 2 of
them refer to
proc/2693/cwd

but none of them refer to the /tmp folder.

Please let me know if you have any other suggestions.
In the meantime i will try with the [LOCAL] file and let you know.

Thanks
Sai




 From: Nitin Pawar 
To: user@hive.apache.org; Sai Sai  
Sent: Tuesday, 5 March 2013 4:24 AM
Subject: Re: SemanticException Line 1:17 issue
 

it exists but where? on your hdfs or local linux filesystem ?  so if you are 
checking the file with ls -l /tmp/ then add word local

ls can you provide output of $HADOOP_HOME/bin/hadoop dfs -ls /tmp/ 


LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename
If the keyword LOCAL is specified, then:
* the load command will look for filepath in the local file system. If 
a relative path is specified - it will be interpreted relative to the current 
directory of the user



On Tue, Mar 5, 2013 at 5:48 PM, Sai Sai  wrote:

Yes Nitin it exists... but still getting the same issue.
>
>
>
>
>
> From: Nitin Pawar 
>To: user@hive.apache.org; Sai Sai  
>Sent: Tuesday, 5 March 2013 4:14 AM
>Subject: Re: SemanticException Line 1:17 issue
> 
>
>
>this file /tmp/o_small.tsv is on your local filesystem or hdfs? 
>
>
>
>On Tue, Mar 5, 2013 at 5:39 PM, Sai Sai  wrote:
>
>Hello
>>
>>
>>I have been stuck on this issue for quite some time and was wondering if 
>>anyone sees any problem with this that i am not seeing:
>>
>>
>>I have verified the file exists here and have also manually pasted the file 
>>into the tmp folder but still running into the same issue.
>>
>>
>>I am also wondering which folder this maps to in my local drive:
>>hdfs://ubuntu:9000/
>>
>>
>>***
>>
>>
>>hive> LOAD DATA INPATH '/tmp/o_small.tsv' OVERWRITE INTO TABLE odata ;
>>FAILED: SemanticException Line 1:17 Invalid path ''/tmp/o_small.tsv'': No 
>>files matching path hdfs://ubuntu:9000/tmp/o_small.tsv
>>
>>
>>***
>>I have verified the file exists here and have also manually pasted the file 
>>here but still running into the same issue.
>>Please let me know if u have any suggestions will be really appreciated.
>>ThanksSai
>>
>
>
>
>-- 
>Nitin Pawar
>
>
>


-- 
Nitin Pawar

Re: Get the job id for a hive query

2013-03-05 Thread Nitin Pawar
A select statement without a where clause is just an HDFS cat command. It will
not run MapReduce for that.
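
For example (a rough sketch, with a made-up table):

select * from some_table;               -- plain fetch of the files, no MapReduce job
select * from some_table where id = 1;  -- compiled into a MapReduce job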



On Tue, Mar 5, 2013 at 5:48 PM, Tim Bittersohl  wrote:

> Ok, it works.
>
> For testing, I fired a create table command and a select without a where
> clause. Both don’t result in MapReduce jobs... with a where clause, there
> is a job created now.
>
>
> Thanks
>
>
>
> *Von:* Nitin Pawar [mailto:nitinpawar...@gmail.com]
> *Gesendet:* Dienstag, 5. März 2013 12:48
>
> *An:* user@hive.apache.org
> *Betreff:* Re: Get the job id for a hive query
>
>
> if the job is submitted to hadoop, it will come up on jobtracker. 
>
> Unless you are slow on tracking and your job history retaining is 0, each
> hive query submitted to jobtracker will be there on jobhistory 
>
>
> On Tue, Mar 5, 2013 at 4:58 PM, Tim Bittersohl  wrote:
> 
>
> Hi,
>
>  
>
> I do have the following problem monitoring my Hive queries in Hadoop.
>
>  
>
> I create a Server using the Hive library which connects to a Hadoop
> cluster (file system, job tracker and Hive metastore are set up on this
> cluster). The needed parameters for the Hive server I've set in the
> configuration.
>
> Commands sent to this server are executed, that works.
>
> The problem now is that the MapReduce jobs created by Hive don't appear in
> the job tracker's job list. The list itself works, showing me other
> commands I executed not using Hive. (I use the JobClient Java class to get
> the list)
>
>  
>
> Why are the Hive query jobs not in the list? How can I track them in
> Hadoop?
>
>  
>
>  
>
> Thanks
>
>  
>
>  
>
> *Von:* Nitin Pawar [mailto:nitinpawar...@gmail.com]
> *Gesendet:* Donnerstag, 28. Februar 2013 16:13
>
>
> *An:* user@hive.apache.org
> *Betreff:* Re: Get the job id for a hive query
>
>  
>
> you can set this property mapred.job.name and this should set the name
> for the job 
>
>  
>
> On Thu, Feb 28, 2013 at 8:26 PM, Tim Bittersohl 
> wrote:
>
> Thanks for the response,
>
>  
>
> I also found no way to access the job id via java thrift client, all I can
> get is a query ID by the query planner.
>
>  
>
> How to set the name of a job where a Hive query is fired with, so I can
> find it in the job tracker later?
>
>  
>
> Tim Bittersohl 
>
> Software Engineer 
>
>
> [image: http://www.innoplexia.de/ci/logo/inno_logo_links%20200x80.png]
>
> Innoplexia GmbH
> Mannheimer Str. 175 
>
> 69123 Heidelberg
>
> Tel.: +49 (0) 6221 7198033
> Fax: +49 (0) 6221 7198034
> Web: www.innoplexia.com
>
> Sitz: 69123 Heidelberg, Mannheimer Str. 175 - Steuernummer 32494/62606 -
> USt. IdNr.: DE 272 871 728 - Geschäftsführer: Christian Schneider, Walery
> Strauch, Prof. Dr. Herbert Schuster 
>
>  
>
> *Von:* Nitin Pawar [mailto:nitinpawar...@gmail.com]
> *Gesendet:* Donnerstag, 28. Februar 2013 15:46
>
>
> *An:* user@hive.apache.org
> *Betreff:* Re: Get the job id for a hive query
>
>  
>
> With thrift Client I don't think you can get the jobID from hadoop (I may
> very well be wrong in this)
>
>  
>
> the other way around this was to have a separate name for each job you
> fire through hive and then directly query jobtracker for the same 
>
>  
>
> On Thu, Feb 28, 2013 at 5:18 PM, Tim Bittersohl 
> wrote:
>
> I use java and there HiveClient of the hive library (Version 0.10.0).
>
>  
>
> Tim Bittersohl 
>
> Software Engineer 
>
>
> [image: http://www.innoplexia.de/ci/logo/inno_logo_links%20200x80.png]
>
> Innoplexia GmbH
> Mannheimer Str. 175 
>
> 69123 Heidelberg
>
> Tel.: +49 (0) 6221 7198033
> Fax: +49 (0) 6221 7198034
> Web: www.innoplexia.com
>
> Sitz: 69123 Heidelberg, Mannheimer Str. 175 - Steuernummer 32494/62606 -
> USt. IdNr.: DE 272 871 728 - Geschäftsführer: Christian Schneider, Walery
> Strauch, Prof. Dr. Herbert Schuster 
>
>  
>
> *Von:* Nitin Pawar [mailto:nitinpawar...@gmail.com]
> *Gesendet:* Donnerstag, 28. Februar 2013 12:43
> *An:* user@hive.apache.org
>
>
> *Betreff:* Re: Get the job id for a hive query
>
>  
>
> how are you running your hive queries? using hive cli or hive jdbc client?
> 
>
>  
>
> if you are using hive cli then whenever you fire a query and assuming it
> is syntactically correct and its not select * from table operation 
>
> hive cli shows  the job ID and an URL which points to the job tracker url
> for the job it submitted  
>
>  
>
> On Thu, Feb 28, 2013 at 5:09 PM, Tim Bittersohl 
> wrote:
>
> I’m trying to get the job id of the job created with a Hive query.
>
> At the moment I can get the cluster status from the HiveClient, but I don't
> find any job id in there... 
>
>  
>
> Tim Bittersohl 
>
> Software Engineer 
>
>
> [image: http://www.innoplexia.de/ci/logo/inno_logo_links%20200x80.png]
>
> Innoplexia GmbH
> Mannheimer Str. 

Re: SemanticException Line 1:17 issue

2013-03-05 Thread Nitin Pawar
It exists, but where? On your HDFS or your local Linux filesystem? If you
are checking the file with ls -l /tmp/, then it is on the local filesystem,
so add the word LOCAL.

Also, can you provide the output of $HADOOP_HOME/bin/hadoop dfs -ls /tmp/

LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename

If the keyword LOCAL is specified, then:

   - the load command will look for *filepath* in the local file
system. If a relative path is specified - it will be interpreted
relative to the current directory of the user



On Tue, Mar 5, 2013 at 5:48 PM, Sai Sai  wrote:

> Yes Nitin it exists... but still getting the same issue.
>
>   --
> *From:* Nitin Pawar 
> *To:* user@hive.apache.org; Sai Sai 
> *Sent:* Tuesday, 5 March 2013 4:14 AM
> *Subject:* Re: SemanticException Line 1:17 issue
>
> this file /tmp/o_small.tsv is on your local filesystem or hdfs?
>
>
> On Tue, Mar 5, 2013 at 5:39 PM, Sai Sai  wrote:
>
> Hello
>
> I have been stuck on this issue for quite some time and was wondering if
> anyone sees any problem with this that i am not seeing:
>
> I have verified the file exists here and have also manually pasted the
> file into the tmp folder but still running into the same issue.
>
> I am also wondering which folder this maps to in my local drive:
> hdfs://ubuntu:9000/
>
> ***
>
> hive> LOAD DATA INPATH '/tmp/o_small.tsv' OVERWRITE INTO TABLE odata ;
> FAILED: SemanticException Line 1:17 Invalid path ''/tmp/o_small.tsv'': No
> files matching path hdfs://ubuntu:9000/tmp/o_small.tsv
>
> ***
> I have verified the file exists here and have also manually pasted the
> file here but still running into the same issue.
> Please let me know if u have any suggestions will be really appreciated.
> Thanks
> Sai
>
>
>
>
> --
> Nitin Pawar
>
>
>


-- 
Nitin Pawar


AW: Get the job id for a hive query

2013-03-05 Thread Tim Bittersohl
Ok, it works.

For testing, I fired a create table command and a select without a where
clause. Neither results in a MapReduce job... with a where clause, there is
a job created now.

 

Thanks

 

 

Von: Nitin Pawar [mailto:nitinpawar...@gmail.com] 
Gesendet: Dienstag, 5. März 2013 12:48
An: user@hive.apache.org
Betreff: Re: Get the job id for a hive query

 

if the job is submitted to hadoop, it will come up on jobtracker. 

Unless you are slow on tracking and your job history retaining is 0, each
hive query submitted to jobtracker will be there on jobhistory 

 

On Tue, Mar 5, 2013 at 4:58 PM, Tim Bittersohl  wrote:

Hi,

 

I do have the following problem monitoring my Hive queries in Hadoop.

 

I create a Server using the Hive library which connects to a Hadoop cluster
(file system, job tracker and Hive metastore are set up on this cluster).
The needed parameters for the Hive server I've set in the configuration.

Commands sent to this server are executed, that works.

The problem now is that the MapReduce jobs created by Hive don't appear in
the job tracker's job list. The list itself works, showing me other commands
I executed not using Hive. (I use the JobClient Java class to get the list)

 

Why are the Hive query jobs not in the list? How can I track them in Hadoop?

 

 

Thanks

 

 

Von: Nitin Pawar [mailto:nitinpawar...@gmail.com] 
Gesendet: Donnerstag, 28. Februar 2013 16:13


An: user@hive.apache.org
Betreff: Re: Get the job id for a hive query

 

you can set this property mapred.job.name and this should set the name for
the job 
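
For example, a minimal sketch from the Hive CLI (the job name itself is made up):

set mapred.job.name=my_hive_query_test;
select count(*) from some_table where some_col = 'x';

The job should then show up under that name in the jobtracker's job list.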

 

On Thu, Feb 28, 2013 at 8:26 PM, Tim Bittersohl  wrote:

Thanks for the response,

 

I also found no way to access the job id via java thrift client, all I can
get is a query ID by the query planner.

 

How to set the name of a job where a Hive query is fired with, so I can find
it in the job tracker later?

 

Tim Bittersohl 

Software Engineer 


http://www.innoplexia.de/ci/logo/inno_logo_links%20200x80.png

Innoplexia GmbH
Mannheimer Str. 175 

69123 Heidelberg 

Tel.: +49 (0) 6221 7198033   
Fax: +49 (0) 6221 7198034   
Web: www.innoplexia.com   

Sitz: 69123 Heidelberg, Mannheimer Str. 175 - Steuernummer 32494/62606 -
USt. IdNr.: DE 272 871 728 - Geschäftsführer: Christian Schneider, Walery
Strauch, Prof. Dr. Herbert Schuster 

 

Von: Nitin Pawar [mailto:nitinpawar...@gmail.com] 
Gesendet: Donnerstag, 28. Februar 2013 15:46


An: user@hive.apache.org
Betreff: Re: Get the job id for a hive query

 

With thrift Client I don't think you can get the jobID from hadoop (I may
very well be wrong in this)

 

the other way around this was to have a separate name for each job you fire
through hive and then directly query jobtracker for the same 

 

On Thu, Feb 28, 2013 at 5:18 PM, Tim Bittersohl  wrote:

I use java and there HiveClient of the hive library (Version 0.10.0).

 

Tim Bittersohl 

Software Engineer 


http://www.innoplexia.de/ci/logo/inno_logo_links%20200x80.png

Innoplexia GmbH
Mannheimer Str. 175 

69123 Heidelberg 

Tel.: +49 (0) 6221 7198033   
Fax: +49 (0) 6221 7198034   
Web: www.innoplexia.com   

Sitz: 69123 Heidelberg, Mannheimer Str. 175 - Steuernummer 32494/62606 -
USt. IdNr.: DE 272 871 728 - Geschäftsführer: Christian Schneider, Walery
Strauch, Prof. Dr. Herbert Schuster 

 

Von: Nitin Pawar [mailto:nitinpawar...@gmail.com] 
Gesendet: Donnerstag, 28. Februar 2013 12:43
An: user@hive.apache.org


Betreff: Re: Get the job id for a hive query

 

how are you running your hive queries? using hive cli or hive jdbc client? 

 

if you are using hive cli then whenever you fire a query and assuming it is
syntactically correct and its not select * from table operation 

hive cli shows the job ID and a URL which points to the job tracker page
for the job it submitted  

 

On Thu, Feb 28, 2013 at 5:09 PM, Tim Bittersohl  wrote:

I’m trying to get the job id of the job created with a Hive query.

At the moment I can get the cluster status from the HiveClient, but I don't
find any job id in there... 

 

Tim Bittersohl 

Software Engineer 


http://www.innoplexia.de/ci/logo/inno_logo_links%20200x80.png

Innoplexia GmbH
Mannheimer Str. 175 

69123 Heidelberg 

Tel.: +49 (0) 6221 7198033   
Fax: +49 (0) 6221 7198034   
Web: www.innoplexia.com   

Sitz: 69123 Heidelberg, Mannheimer Str. 175 - Steuernummer 32494/62606 -
USt. IdNr.: DE 272 871 728 - Geschäftsführer: Christian Schneider, Walery
Strauch, Prof. Dr. Herbert Schuster 

 

Von: Harsh J [mailto:ha...@cloudera.com] 
Gesendet: Donnerstag, 28. Februar 2013 09:08
An: hive request
Betreff: Re: Get the job id for a hive query

 

The client logs print the Job ID of the spawned job and a tracking URL. Is
that what you're looking for? Its printed for each stage.

 

On Wed, Feb 27, 2013 at 11:06 PM, Tim Bittersohl  wrote:

Hi,

 

does the Hive client have the ability to give back the job id of the job
created

Re: SemanticException Line 1:17 issue

2013-03-05 Thread Sai Sai
Yes Nitin it exists... but still getting the same issue.





 From: Nitin Pawar 
To: user@hive.apache.org; Sai Sai  
Sent: Tuesday, 5 March 2013 4:14 AM
Subject: Re: SemanticException Line 1:17 issue
 

this file /tmp/o_small.tsv is on your local filesystem or hdfs? 



On Tue, Mar 5, 2013 at 5:39 PM, Sai Sai  wrote:

Hello
>
>
>I have been stuck on this issue for quite some time and was wondering if 
>anyone sees any problem with this that i am not seeing:
>
>
>I have verified the file exists here and have also manually pasted the file 
>into the tmp folder but still running into the same issue.
>
>
>I am also wondering which folder this maps to in my local drive:
>hdfs://ubuntu:9000/
>
>
>***
>
>
>hive> LOAD DATA INPATH '/tmp/o_small.tsv' OVERWRITE INTO TABLE odata ;
>FAILED: SemanticException Line 1:17 Invalid path ''/tmp/o_small.tsv'': No 
>files matching path hdfs://ubuntu:9000/tmp/o_small.tsv
>
>
>***
>I have verified the file exists here and have also manually pasted the file 
>here but still running into the same issue.
>Please let me know if u have any suggestions will be really appreciated.
>ThanksSai
>


-- 
Nitin Pawar

Re: SemanticException Line 1:17 issue

2013-03-05 Thread Nitin Pawar
this file /tmp/o_small.tsv is on your local filesystem or hdfs?


On Tue, Mar 5, 2013 at 5:39 PM, Sai Sai  wrote:

> Hello
>
> I have been stuck on this issue for quite some time and was wondering if
> anyone sees any problem with this that i am not seeing:
>
> I have verified the file exists here and have also manually pasted the
> file into the tmp folder but still running into the same issue.
>
> I am also wondering which folder this maps to in my local drive:
> hdfs://ubuntu:9000/
>
> ***
>
> hive> LOAD DATA INPATH '/tmp/o_small.tsv' OVERWRITE INTO TABLE odata ;
> FAILED: SemanticException Line 1:17 Invalid path ''/tmp/o_small.tsv'': No
> files matching path hdfs://ubuntu:9000/tmp/o_small.tsv
>
> ***
> I have verified the file exists here and have also manually pasted the
> file here but still running into the same issue.
> Please let me know if u have any suggestions will be really appreciated.
> Thanks
> Sai
>



-- 
Nitin Pawar


Re: SemanticException Line 1:17 issue

2013-03-05 Thread Sai Sai
Hello

I have been stuck on this issue for quite some time and was wondering if anyone 
sees any problem with this that I am not seeing:

I have verified the file exists here and have also manually pasted the file 
into the tmp folder but still running into the same issue.

I am also wondering which folder this maps to in my local drive:
hdfs://ubuntu:9000/

***


hive> LOAD DATA INPATH '/tmp/o_small.tsv' OVERWRITE INTO TABLE odata ;
FAILED: SemanticException Line 1:17 Invalid path ''/tmp/o_small.tsv'': No files 
matching path hdfs://ubuntu:9000/tmp/o_small.tsv

***
I have verified the file exists here and have also manually pasted the file 
here but still running into the same issue.
Please let me know if you have any suggestions, it will be really appreciated.
Thanks
Sai
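

A minimal sketch of the two usual fixes for the 'Invalid path' error above, assuming /tmp/o_small.tsv
exists on the local disk of the machine running the Hive CLI and the target table is odata as in the
thread. LOAD DATA INPATH looks on HDFS, so a file that only exists locally will not be found.

If the file is on the local filesystem, tell Hive so explicitly:

hive> LOAD DATA LOCAL INPATH '/tmp/o_small.tsv' OVERWRITE INTO TABLE odata;

If it should be loaded from HDFS instead, copy it there first and then load it (note that
LOAD DATA INPATH moves the file out of its HDFS source location):

hdfs dfs -put /tmp/o_small.tsv /tmp/o_small.tsv
hive> LOAD DATA INPATH '/tmp/o_small.tsv' OVERWRITE INTO TABLE odata;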


Re: show tables in bin does not display the tables

2013-03-05 Thread Sai Sai
Hello

I have noticed that when I execute the following command from the hive shell in different
folders it behaves in different ways, and was wondering if this is right:

show tables;

from the bin folder under my hive install folder it just shows tab_name:


myUser@ubuntu:~/work/hive-0.10.0-bin/bin$ ./hive


hive> show tables;


OK
tab_name
Time taken: 5.268 seconds


But when I execute the same command from my install folder:


myUser@ubuntu:~/work/hive-0.10.0-bin/bin$ cd ..


hive> show tables;


OK
tab_name
employees
sample_pages
Time taken: 13.547 seconds


Please let me know.
Thanks
Sai
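

The behaviour above is what you would expect from the default embedded Derby metastore, which
creates its metastore_db directory relative to whatever directory the CLI is started from, so each
working directory effectively sees its own metastore. A minimal hive-site.xml sketch that pins the
metastore to one fixed location, using a hypothetical absolute path:

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby:;databaseName=/home/myUser/hive_metastore_db;create=true</value>
</property>

With that in place, show tables should return the same list no matter where hive is launched from.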


Re: Get the job id for a hive query

2013-03-05 Thread Nitin Pawar
if the job is submitted to hadoop, it will come up on the jobtracker.
Unless you are slow in checking and your job history retention is set to 0, each
hive query submitted to the jobtracker will be there in the job history


On Tue, Mar 5, 2013 at 4:58 PM, Tim Bittersohl  wrote:

> Hi,
>
> I do have the following problem monitoring my Hive queries in Hadoop.
>
> I create a server using the Hive library which connects to a Hadoop
> cluster (file system, job tracker and Hive metastore are set up on this
> cluster). I've set the needed parameters for the Hive server in the
> configuration.
>
> Commands sent to this server are executed, that works.
>
> The problem now is that the MapReduce jobs created by Hive don't appear in
> the job tracker's job list. The list itself works, showing me other
> commands I executed not using Hive. (I use the JobClient Java class to get
> the list)
>
> Why are the Hive query jobs not in the list? How can I track them in
> Hadoop?
>
> Thanks
>
> From: Nitin Pawar [mailto:nitinpawar...@gmail.com]
> Sent: Thursday, 28 February 2013 16:13
>
> To: user@hive.apache.org
> Subject: Re: Get the job id for a hive query
>
> you can set this property mapred.job.name and this should set the name
> for the job
>
>
> On Thu, Feb 28, 2013 at 8:26 PM, Tim Bittersohl 
> wrote:
>
> Thanks for the response,
>
>  
>
> I also found no way to access the job ID via the Java Thrift client; all I can
> get is a query ID from the query planner.
>
>  
>
> How do I set the name of the job that a Hive query is fired with, so I can
> find it in the job tracker later?
>
>  
>
> Tim Bittersohl 
>
> Software Engineer 
>
>
>
> Innoplexia GmbH
> Mannheimer Str. 175 
>
> 69123 Heidelberg
>
> Tel.: +49 (0) 6221 7198033
> Fax: +49 (0) 6221 7198034
> Web: www.innoplexia.com
>
> Registered office: 69123 Heidelberg, Mannheimer Str. 175 - Tax number 32494/62606 -
> VAT ID: DE 272 871 728 - Managing directors: Christian Schneider, Walery
> Strauch, Prof. Dr. Herbert Schuster
>
>  
>
> From: Nitin Pawar [mailto:nitinpawar...@gmail.com]
> Sent: Thursday, 28 February 2013 15:46
>
>
> To: user@hive.apache.org
> Subject: Re: Get the job id for a hive query
>
>  
>
> With the Thrift client I don't think you can get the job ID from hadoop (I may
> very well be wrong on this)
>
>  
>
> the other way around this is to set a separate name for each job you
> fire through hive and then query the jobtracker directly for that name
>
>  
>
> On Thu, Feb 28, 2013 at 5:18 PM, Tim Bittersohl 
> wrote:
>
> I use Java and the HiveClient of the Hive library (version 0.10.0).
>
>  
>
> Tim Bittersohl 
>
> Software Engineer 
>
>
>
> Innoplexia GmbH
> Mannheimer Str. 175 
>
> 69123 Heidelberg
>
> Tel.: +49 (0) 6221 7198033
> Fax: +49 (0) 6221 7198034
> Web: www.innoplexia.com
>
> Registered office: 69123 Heidelberg, Mannheimer Str. 175 - Tax number 32494/62606 -
> VAT ID: DE 272 871 728 - Managing directors: Christian Schneider, Walery
> Strauch, Prof. Dr. Herbert Schuster
>
>  
>
> From: Nitin Pawar [mailto:nitinpawar...@gmail.com]
> Sent: Thursday, 28 February 2013 12:43
> To: user@hive.apache.org
>
>
> Subject: Re: Get the job id for a hive query
>
>  
>
> how are you running your hive queries? using hive cli or hive jdbc client?
> 
>
>  
>
> if you are using the hive cli, then whenever you fire a query (assuming it
> is syntactically correct and it's not a select * from table operation),
>
> the hive cli shows the job ID and a URL which points to the job tracker page
> for the job it submitted
>
>  
>
> On Thu, Feb 28, 2013 at 5:09 PM, Tim Bittersohl 
> wrote:
>
> I’m trying to get the job id of the job created with a Hive query.
>
> At the moment I can get the cluster status from the HiveClient, but I don't
> find any job id in there...
>
>  
>
> Tim Bittersohl 
>
> Software Engineer 
>
>
>
> Innoplexia GmbH
> Mannheimer Str. 175 
>
> 69123 Heidelberg
>
> Tel.: +49 (0) 6221 7198033
> Fax: +49 (0) 6221 7198034
> Web: www.innoplexia.com
>
> Registered office: 69123 Heidelberg, Mannheimer Str. 175 - Tax number 32494/62606 -
> VAT ID: DE 272 871 728 - Managing directors: Christian Schneider, Walery
> Strauch, Prof. Dr. Herbert Schuster
>
>  
>
> From: Harsh J [mailto:ha...@cloudera.com]
> Sent: Thursday, 28 February 2013 09:08
> To: hive request
> Subject: Re: Get the job id for a hive query
>
>  
>
> The client logs print the Job ID of the spawned job and a tracking URL. Is
> that what you're looking for? It's printed for each stage.
>
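
Putting the suggestions from this thread together: give the query a recognisable job name before
running it, and the resulting MapReduce job is then easy to pick out on the jobtracker page or in
the job history. A minimal sketch, with a made-up job name and a query against the odata table used
elsewhere on this list:

hive> SET mapred.job.name=tim_jobid_test;
hive> SELECT COUNT(*) FROM odata;

The submitted job IDs can also be listed from the shell with hadoop job -list all; the job name
itself is visible on the jobtracker web UI.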

Re: Get the job id for a hive query

2013-03-05 Thread Tim Bittersohl
Hi,

 

I do have the following problem monitoring my Hive queries in Hadoop.

 

I create a server using the Hive library which connects to a Hadoop cluster
(file system, job tracker and Hive metastore are set up on this cluster).
I've set the needed parameters for the Hive server in the configuration.

Commands sent to this server are executed, that works.

The problem now is that the MapReduce jobs created by Hive don't appear in
the job tracker's job list. The list itself works, showing me other commands
I executed not using Hive. (I use the JobClient Java class to get the list)

 

Why are the Hive query jobs not in the list? How can I track them in Hadoop?

 

 

Thanks
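

One thing worth ruling out here, assuming the embedded Hive server picks up a default Hadoop
configuration: if mapred.job.tracker is left at its default value of local, the MapReduce work runs
in the local job runner inside the server process and never reaches the cluster's JobTracker, so its
job list stays empty. A minimal configuration sketch with a hypothetical jobtracker address:

<property>
  <name>mapred.job.tracker</name>
  <value>jobtracker.example.com:8021</value>
</property>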

 

 

From: Nitin Pawar [mailto:nitinpawar...@gmail.com]
Sent: Thursday, 28 February 2013 16:13
To: user@hive.apache.org
Subject: Re: Get the job id for a hive query

 

you can set this property mapred.job.name and this should set the name for
the job 

 

On Thu, Feb 28, 2013 at 8:26 PM, Tim Bittersohl  wrote:

Thanks for the response,

 

I also found no way to access the job ID via the Java Thrift client; all I can
get is a query ID from the query planner.

 

How do I set the name of the job that a Hive query is fired with, so I can find
it in the job tracker later?

 

Tim Bittersohl 

Software Engineer 



Innoplexia GmbH
Mannheimer Str. 175 

69123 Heidelberg 

Tel.: +49 (0) 6221 7198033   
Fax: +49 (0) 6221 7198034   
Web: www.innoplexia.com   

Registered office: 69123 Heidelberg, Mannheimer Str. 175 - Tax number 32494/62606 -
VAT ID: DE 272 871 728 - Managing directors: Christian Schneider, Walery
Strauch, Prof. Dr. Herbert Schuster

 

From: Nitin Pawar [mailto:nitinpawar...@gmail.com]
Sent: Thursday, 28 February 2013 15:46


To: user@hive.apache.org
Subject: Re: Get the job id for a hive query

 

With the Thrift client I don't think you can get the job ID from Hadoop (I may
very well be wrong on this)

 

the other way around this is to set a separate name for each job you fire
through hive and then query the jobtracker directly for that name

 

On Thu, Feb 28, 2013 at 5:18 PM, Tim Bittersohl  wrote:

I use Java and the HiveClient of the Hive library (version 0.10.0).

 

Tim Bittersohl 

Software Engineer 



Innoplexia GmbH
Mannheimer Str. 175 

69123 Heidelberg 

Tel.: +49 (0) 6221 7198033   
Fax: +49 (0) 6221 7198034   
Web: www.innoplexia.com   

Registered office: 69123 Heidelberg, Mannheimer Str. 175 - Tax number 32494/62606 -
VAT ID: DE 272 871 728 - Managing directors: Christian Schneider, Walery
Strauch, Prof. Dr. Herbert Schuster

 

From: Nitin Pawar [mailto:nitinpawar...@gmail.com]
Sent: Thursday, 28 February 2013 12:43
To: user@hive.apache.org


Subject: Re: Get the job id for a hive query

 

how are you running your hive queries? using hive cli or hive jdbc client? 

 

if you are using the hive cli, then whenever you fire a query (assuming it is
syntactically correct and it's not a select * from table operation),

the hive cli shows the job ID and a URL which points to the job tracker page
for the job it submitted

 

On Thu, Feb 28, 2013 at 5:09 PM, Tim Bittersohl  wrote:

I’m trying to get the job id of the job created with a Hive query.

At the moment I can get the cluster status from the HiveClient, but I don't
find any job id in there...

 

Tim Bittersohl 

Software Engineer 



Innoplexia GmbH
Mannheimer Str. 175 

69123 Heidelberg 

Tel.: +49 (0) 6221 7198033   
Fax: +49 (0) 6221 7198034   
Web: www.innoplexia.com   

Registered office: 69123 Heidelberg, Mannheimer Str. 175 - Tax number 32494/62606 -
VAT ID: DE 272 871 728 - Managing directors: Christian Schneider, Walery
Strauch, Prof. Dr. Herbert Schuster

 

From: Harsh J [mailto:ha...@cloudera.com]
Sent: Thursday, 28 February 2013 09:08
To: hive request
Subject: Re: Get the job id for a hive query

 

The client logs print the Job ID of the spawned job and a tracking URL. Is
that what you're looking for? It's printed for each stage.

 

On Wed, Feb 27, 2013 at 11:06 PM, Tim Bittersohl  wrote:

Hi,

 

Does the Hive client have the possibility to give back the job id of the job
created when running a query? I need that for tracking.

 

 

Greetings

Tim Bittersohl 

Software Engineer 



Innoplexia GmbH
Mannheimer Str. 175 

69123 Heidelberg 

Tel.: +49 (0) 6221 7198033   
Fax: +49 (0) 6221 7198034   
Web: www.innoplexia.com   

Registered office: 69123 Heidelberg, Mannheimer Str. 175 - Tax number 32494/62606
  - VAT ID: DE 272 871 728 - Managing directors:
Christian Schneider, Walery Strauch, Prof. Dr. Herbert Schuster

 





 

-- 
Harsh J 





 

-- 
Nitin Pawar





 

-- 
Nitin Pawar





 

-- 
Nitin Pawar
