Hadoop-based product recommendations.

2013-05-29 Thread Sai Sai
Just wondering if anyone would have any suggestions.
We are a group of developers who have been on the bench for a few months, trained on Hadoop but without any projects to work on.
We would like to develop a Hadoop/Hive/Pig based product for our company so we can be of value to the company and not be scared of layoffs. We are wondering if anyone could share ideas for a product we could develop that would be of value to our company, rather than just hoping we get assigned projects. Any help/suggestions/direction will be really appreciated.
Thanks,
Sai

Re: Issue with Json tuple lateral view

2013-05-27 Thread Sai Sai


Here is the JSON data that I load:

{ "blogID": "FJY26J1333", "date": "2012-04-01", "name": "vpxnksu", "comment": "good stuff", "contact": { "email": "vpxn...@gmail.com", "website": "vpxnksu.wordpress.com" } }
{ "blogID": "VSAUMDFGSD", "date": "2012-04-01", "name": "yhftrcx", "comment": "another comment",}

Here are the Hive commands:

CREATE EXTERNAL TABLE json_comments(value STRING) LOCATION '/user/json-comments';

LOAD DATA LOCAL INPATH '/home/satish/data/inputSai/json-comments.txt' OVERWRITE INTO TABLE json_comments;

SELECT b.blogID, c.email
FROM json_comments a
LATERAL VIEW json_tuple(a.value, 'blogID', 'contact') b AS blogID, contact
LATERAL VIEW json_tuple(b.contact, 'email', 'website') c AS email, website;


Here are the results of the MapReduce job:

blogid    email
fjy26j1333    vpxn...@gmail.com
NULL    NULL

My question is: why does the 2nd row come up as NULL values? I was expecting the results to be like this:

blogid    email
fjy26j1333    vpxn...@gmail.com
VSAUMDFGSD    NULL

Any input is appreciated in understanding this.
Thanks
S

Re: Issue with Json tuple lateral view

2013-05-27 Thread Sai Sai
Thanks Navis



 From: Navis류승우 navis@nexr.com
To: user@hive.apache.org; Sai Sai saigr...@yahoo.in 
Sent: Monday, 27 May 2013 12:15 PM
Subject: Re: Issue with Json tuple lateral view
 

Removing the last ',' in the second row would make the result come out as you expected.

I can't tell whether it's a bug or not.
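Navis's point can be illustrated outside Hive: strict JSON parsers reject the trailing comma in the second record, so the whole row fails to parse and json_tuple returns NULL for every requested key. A small Python sketch (an analogy only; Hive's json_tuple uses its own parser, and the record text here is abbreviated from the thread):

```python
import json

good = '{"blogID": "VSAUMDFGSD", "comment": "another comment"}'
bad = '{"blogID": "VSAUMDFGSD", "comment": "another comment",}'  # trailing comma

json.loads(good)  # parses fine

try:
    json.loads(bad)
    parsed = True
except json.JSONDecodeError:
    # A strict JSON parser rejects the trailing comma, so the
    # entire record is unusable -- hence a row of NULLs in Hive.
    parsed = False

print(parsed)  # False
```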

2013/5/27 Sai Sai saigr...@yahoo.in:

 Here is the JSON data that I load:

 { "blogID": "FJY26J1333", "date": "2012-04-01", "name": "vpxnksu", "comment": "good stuff", "contact": { "email": "vpxn...@gmail.com", "website": "vpxnksu.wordpress.com" } }
 { "blogID": "VSAUMDFGSD", "date": "2012-04-01", "name": "yhftrcx", "comment": "another comment",}

 Here are the Hive commands:

 CREATE  EXTERNAL  TABLE json_comments(value STRING) LOCATION
 '/user/json-comments';

 LOAD DATA LOCAL INPATH '/home/satish/data/inputSai/json-comments.txt'
 OVERWRITE INTO TABLE json_comments;

 SELECT b.blogID, c.email FROM json_comments a LATERAL VIEW
 json_tuple(a.value, 'blogID', 'contact') b AS blogID, contact LATERAL VIEW
 json_tuple(b.contact, 'email', 'website') c AS email, website;

 Here are the results of the MapReduce job:

 blogid email
 FJY26J1333 vpxn...@gmail.com
 NULL NULL

 My question is: why does the 2nd row come up as NULL values? I was expecting the results to be like this:

 blogid email
 FJY26J1333 vpxn...@gmail.com
 VSAUMDFGSD NULL

 Any input is appreciated in understanding this.
 Thanks
 S

Re: Partitioning confusion

2013-05-27 Thread Sai Sai


After creating a partition for a country (USA) and state (IL), when we go to the HDFS site to look at the partition in the browser, we see all the records for all the countries and states rather than just the records for the partition created for USA and IL (shown below). Is this the correct behavior?

Here are my commands:



CREATE TABLE employees (name STRING, salary FLOAT, subordinates ARRAY<STRING>, deductions MAP<STRING, FLOAT>, address STRUCT<street:STRING, city:STRING, state:STRING, zip:INT, country:STRING>) PARTITIONED BY (country STRING, state STRING);

LOAD DATA LOCAL INPATH 
'/home/satish/data/employees/input/employees-country.txt' INTO TABLE employees 
PARTITION (country='USA',state='IL');



Here is my original data file, where I have data for a few countries such as USA, INDIA, UK, AUS:



(The fields below are separated by Hive's default ^A (field), ^B (collection item), and ^C (map key) delimiters, written out visibly here.)

John Doe^A10.0^AMary Smith^BTodd Jones^AFederal Taxes^C.2^BState Taxes^C.05^BInsurance^C.1^A1 Michigan Ave.^BChicago^BIL^B60600^BUSA
Mary Smith^A8.0^ABill King^AFederal Taxes^C.2^BState Taxes^C.05^BInsurance^C.1^A100 Ontario St.^BChicago^BIL^B60601^BUSA
Todd Jones^A7.0^A^AFederal Taxes^C.15^BState Taxes^C.03^BInsurance^C.1^A200 Chicago Ave.^BOak Park^BIL^B60700^BUSA
Bill King^A6.0^A^AFederal Taxes^C.15^BState Taxes^C.03^BInsurance^C.1^A300 Obscure Dr.^BObscuria^BIL^B60100^BUSA
Boss Man^A20.0^AJohn Doe^BFred Finance^AFederal Taxes^C.3^BState Taxes^C.07^BInsurance^C.05^A1 Pretentious Drive.^BChicago^BIL^B60500^BUSA
Fred Finance^A15.0^AStacy Accountant^AFederal Taxes^C.3^BState Taxes^C.07^BInsurance^C.05^A2 Pretentious Drive.^BChicago^BIL^B60500^BUSA
Stacy Accountant^A6.0^A^AFederal Taxes^C.15^BState Taxes^C.03^BInsurance^C.1^A300 Main St.^BNaperville^BIL^B60563^BUSA
John Doe 2^A10.0^AMary Smith^BTodd Jones^AFederal Taxes^C.2^BState Taxes^C.05^BInsurance^C.1^A1 Michigan Ave.^BChicago^BIL^B60600^BINDIA
Mary Smith 2^A8.0^ABill King^AFederal Taxes^C.2^BState Taxes^C.05^BInsurance^C.1^A100 Ontario St.^BChicago^BIL^B60601^BINDIA
Todd Jones 2^A7.0^A^AFederal Taxes^C.15^BState Taxes^C.03^BInsurance^C.1^A200 Chicago Ave.^BOak Park^BIL^B60700^BAUSTRALIA
Bill King 2^A6.0^A^AFederal Taxes^C.15^BState Taxes^C.03^BInsurance^C.1^A300 Obscure Dr.^BObscuria^BIL^B60100^BAUSTRALIA
Boss Man2^A20.0^AJohn Doe^BFred Finance^AFederal Taxes^C.3^BState Taxes^C.07^BInsurance^C.05^A1 Pretentious Drive.^BChicago^BIL^B60500^BUK
Fred Finance 2^A15.0^AStacy Accountant^AFederal Taxes^C.3^BState Taxes^C.07^BInsurance^C.05^A2 Pretentious Drive.^BChicago^BIL^B60500^BUK
Stacy Accountant 2^A6.0^A^AFederal Taxes^C.15^BState Taxes^C.03^BInsurance^C.1^A300 Main St.^BNaperville^BIL^B60563^BUK


Now when I navigate to:
Contents of directory /user/hive/warehouse/db1.db/employees/country=USA/state=IL

I see all the records, and was wondering if it should have only the USA and IL records.
Please help.

Re: Partitioning confusion

2013-05-27 Thread Sai Sai
Nitin,
I am still confused. From the data I have given below, should the file that sits in the folder country=USA/state=IL have only the rows where country=USA and state=IL, or will it have the rows of the other countries as well?
The reason I ask is that if we have a 250 GB file and would like to create 10 partitions this way, we would end up with 2.5 TB of data, and 2.5 TB * 3 (HDFS replication) = 7.5 TB on disk. Is this expected?
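The storage arithmetic in the question can be written out explicitly (a back-of-the-envelope sketch; the 250 GB size, 10 partitions, and replication factor of 3 are the figures from the message above):

```python
file_gb = 250      # size of the source file, from the question
partitions = 10    # partitions each loaded with the full, unfiltered file
replication = 3    # HDFS replication factor, from the question

# If LOAD DATA copies the whole file into every partition, the logical
# data grows 10x, and HDFS replication triples the on-disk footprint.
raw_tb = file_gb * partitions / 1000   # 2.5 TB of logical data
stored_tb = raw_tb * replication       # 7.5 TB on disk

print(raw_tb, stored_tb)  # 2.5 7.5
```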
Thanks
S



 From: Nitin Pawar nitinpawar...@gmail.com
To: user@hive.apache.org; Sai Sai saigr...@yahoo.in 
Sent: Monday, 27 May 2013 2:08 PM
Subject: Re: Partitioning confusion
 


When you specify the LOAD DATA query with a specific partition, it will put the entire data into that partition.
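To make Nitin's point concrete, here is a small sketch (Python; the rows are invented stand-ins for the employee file): LOAD DATA ... PARTITION just moves the file as-is into the one named partition directory, while an INSERT ... SELECT with dynamic partitioning is what would actually route each row by its own country/state values.

```python
# Hypothetical rows: (name, country, state)
rows = [
    ("John Doe", "USA", "IL"),
    ("Mary Smith", "USA", "IL"),
    ("Todd Jones", "AUSTRALIA", "IL"),
    ("Boss Man2", "UK", "IL"),
]

# LOAD DATA ... PARTITION (country='USA', state='IL'):
# the whole file lands in one directory, with no filtering at all.
load_data_result = {"country=USA/state=IL": list(rows)}

# INSERT ... SELECT with dynamic partitioning: each row is routed to
# the directory matching its own partition-column values.
dynamic_result = {}
for name, country, state in rows:
    dynamic_result.setdefault(f"country={country}/state={state}", []).append(name)

print(len(load_data_result["country=USA/state=IL"]), sorted(dynamic_result))
```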




On Mon, May 27, 2013 at 1:08 PM, Sai Sai saigr...@yahoo.in wrote:



After creating a partition for a country (USA) and state (IL), when we go to the HDFS site to look at the partition in the browser, we see all the records for all the countries and states rather than just the records for the partition created for USA and IL (shown below). Is this the correct behavior?

Here are my commands:




CREATE TABLE employees (name STRING, salary FLOAT, subordinates ARRAY<STRING>, deductions MAP<STRING, FLOAT>, address STRUCT<street:STRING, city:STRING, state:STRING, zip:INT, country:STRING>) PARTITIONED BY (country STRING, state STRING);


LOAD DATA LOCAL INPATH 
'/home/satish/data/employees/input/employees-country.txt' INTO TABLE employees 
PARTITION (country='USA',state='IL');




Here is my original data file, where I have data for a few countries such as USA, INDIA, UK, AUS:




(The fields below are separated by Hive's default ^A (field), ^B (collection item), and ^C (map key) delimiters, written out visibly here.)

John Doe^A10.0^AMary Smith^BTodd Jones^AFederal Taxes^C.2^BState Taxes^C.05^BInsurance^C.1^A1 Michigan Ave.^BChicago^BIL^B60600^BUSA
Mary Smith^A8.0^ABill King^AFederal Taxes^C.2^BState Taxes^C.05^BInsurance^C.1^A100 Ontario St.^BChicago^BIL^B60601^BUSA
Todd Jones^A7.0^A^AFederal Taxes^C.15^BState Taxes^C.03^BInsurance^C.1^A200 Chicago Ave.^BOak Park^BIL^B60700^BUSA
Bill King^A6.0^A^AFederal Taxes^C.15^BState Taxes^C.03^BInsurance^C.1^A300 Obscure Dr.^BObscuria^BIL^B60100^BUSA
Boss Man^A20.0^AJohn Doe^BFred Finance^AFederal Taxes^C.3^BState Taxes^C.07^BInsurance^C.05^A1 Pretentious Drive.^BChicago^BIL^B60500^BUSA
Fred Finance^A15.0^AStacy Accountant^AFederal Taxes^C.3^BState Taxes^C.07^BInsurance^C.05^A2 Pretentious Drive.^BChicago^BIL^B60500^BUSA
Stacy Accountant^A6.0^A^AFederal Taxes^C.15^BState Taxes^C.03^BInsurance^C.1^A300 Main St.^BNaperville^BIL^B60563^BUSA
John Doe 2^A10.0^AMary Smith^BTodd Jones^AFederal Taxes^C.2^BState Taxes^C.05^BInsurance^C.1^A1 Michigan Ave.^BChicago^BIL^B60600^BINDIA
Mary Smith 2^A8.0^ABill King^AFederal Taxes^C.2^BState Taxes^C.05^BInsurance^C.1^A100 Ontario St.^BChicago^BIL^B60601^BINDIA
Todd Jones 2^A7.0^A^AFederal Taxes^C.15^BState Taxes^C.03^BInsurance^C.1^A200 Chicago Ave.^BOak Park^BIL^B60700^BAUSTRALIA
Bill King 2^A6.0^A^AFederal Taxes^C.15^BState Taxes^C.03^BInsurance^C.1^A300 Obscure Dr.^BObscuria^BIL^B60100^BAUSTRALIA
Boss Man2^A20.0^AJohn Doe^BFred Finance^AFederal Taxes^C.3^BState Taxes^C.07^BInsurance^C.05^A1 Pretentious Drive.^BChicago^BIL^B60500^BUK
Fred Finance 2^A15.0^AStacy Accountant^AFederal Taxes^C.3^BState Taxes^C.07^BInsurance^C.05^A2 Pretentious Drive.^BChicago^BIL^B60500^BUK
Stacy Accountant 2^A6.0^A^AFederal Taxes^C.15^BState Taxes^C.03^BInsurance^C.1^A300 Main St.^BNaperville^BIL^B60563^BUK


Now when I navigate to:
Contents of directory /user/hive/warehouse/db1.db/employees/country=USA/state=IL

I see all the records, and was wondering if it should have only the USA and IL records.
Please help.


-- 
Nitin Pawar

Re: Difference between like %A% and %a%

2013-05-24 Thread Sai Sai


Just wondering about this; please let me know if you have any suggestions as to why we are getting these results.

This query does not return any data:

Query 1: hive (test)> select full_name from states where abbreviation like '%a%';

But this query returns data successfully:

Query 2: hive (test)> select full_name from states where abbreviation like '%A%';

Result of Query 1:

Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201305240156_0012, Tracking URL = http://ubuntu:50030/jobdetails.jsp?jobid=job_201305240156_0012
Kill Command = /home/satish/work/hadoop-1.0.4/libexec/../bin/hadoop job -kill job_201305240156_0012
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2013-05-24 03:51:04,939 Stage-1 map = 0%,  reduce = 0%
2013-05-24 03:51:10,970 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 0.46 sec
2013-05-24 03:51:11,983 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 0.46 sec
2013-05-24 03:51:12,988 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 0.46 sec
2013-05-24 03:51:13,995 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 0.46 sec
2013-05-24 03:51:15,004 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 0.46 sec
2013-05-24 03:51:16,013 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 0.46 sec
2013-05-24 03:51:17,020 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 0.46 sec
MapReduce Total cumulative CPU time: 460 msec
Ended Job = job_201305240156_0012
MapReduce Jobs Launched:
Job 0: Map: 1   Cumulative CPU: 0.46 sec   HDFS Read: 848 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 460 msec
OK
full_name
Time taken: 19.558 seconds

But this query returns data successfully:

hive (test)> select full_name from states where abbreviation like '%A%';

Result of Query2:


Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201305240156_0011, Tracking URL = http://ubuntu:50030/jobdetails.jsp?jobid=job_201305240156_0011
Kill Command = /home/satish/work/hadoop-1.0.4/libexec/../bin/hadoop job -kill job_201305240156_0011
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2013-05-24 03:50:32,163 Stage-1 map = 0%,  reduce = 0%
2013-05-24 03:50:38,193 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 0.47 sec
2013-05-24 03:50:39,196 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 0.47 sec
2013-05-24 03:50:40,199 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 0.47 sec
2013-05-24 03:50:41,206 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 0.47 sec
2013-05-24 03:50:42,210 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 0.47 sec
2013-05-24 03:50:43,221 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 0.47 sec
2013-05-24 03:50:44,227 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 0.47 sec
MapReduce Total cumulative CPU time: 470 msec
Ended Job = job_201305240156_0011
MapReduce Jobs Launched:
Job 0: Map: 1   Cumulative CPU: 0.47 sec   HDFS Read: 848 HDFS Write: 115 SUCCESS
Total MapReduce CPU Time Spent: 470 msec
OK
full_name
Alabama
Alaska
Arizona
Arkansas
California
Georgia
Iowa
Louisiana
Massachusetts  
Pennsylvania
Virginia
Washington
Time taken: 20.551 seconds

Thanks
Sai

Re: Where can we see the results of Select * from states

2013-05-24 Thread Sai Sai
I have created an external table called states under a database called test, then loaded the table successfully. Then I tried:

Select * from states;

It successfully executes MapReduce and displays the results in the console, but I am wondering where to look in HDFS to see these results.

I have looked under all the directories in the filesystem via the URL below but cannot see a results part file.

http://localhost.localdomain:50070/dfshealth.jsp

Also, if I would like to save the results of a query to a specific file, how do I do it? For example, something like:

    Select * from states > myStates.txt;

Is there something like this?
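Hive has no shell-style `>` redirection inside the CLI, but it can write query results out directly; a hedged sketch using Hive's INSERT OVERWRITE DIRECTORY form (the output path is hypothetical):

```sql
-- Write the result set to a local directory as one or more part files:
INSERT OVERWRITE LOCAL DIRECTORY '/home/satish/output/myStates'
SELECT * FROM states;
```

Alternatively, from the OS shell, `hive -e 'select * from states' > myStates.txt` captures the console output into a file.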
Thanks
Sai

Re: Where to find the external table file in HDFS

2013-05-24 Thread Sai Sai
I have created an external table states and loaded it from a file under /tmp/states.txt.

Then at the URL

http://localhost.localdomain:50070/dfshealth.jsp

I have looked to see if the file for this states table exists, and I do not see it. Just wondering if it is saved in HDFS or not.

Also, how many days will files exist under the /tmp folder?
Thanks
Sai

Re: Difference between like %A% and %a%

2013-05-24 Thread Sai Sai
But it should return more results for

%a%

than for

%A%

Please let me know if I am missing something.
Thanks
Sai




 From: Jov am...@amutu.com
To: user@hive.apache.org; Sai Sai saigr...@yahoo.in 
Sent: Friday, 24 May 2013 4:39 PM
Subject: Re: Difference between like %A% and %a%
 




2013/5/24 Sai Sai saigr...@yahoo.in

Unlike MySQL, strings in Hive are case sensitive, so '%A%' is not equal to '%a%'.
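Jov's point can be illustrated outside Hive (a Python sketch; the abbreviation values are made up): matching is case sensitive, and lowercasing first is the usual way to get a case-insensitive LIKE.

```python
# Hypothetical state abbreviations; the stored data is upper case,
# which is why '%a%' matched nothing in the thread.
abbreviations = ["AL", "AK", "AZ", "CA", "IL", "WA"]

# Case-sensitive containment, like Hive's: abbreviation LIKE '%a%'
matches_lower_a = [s for s in abbreviations if "a" in s]   # [] -- no lowercase 'a'

# Case-sensitive containment, like: abbreviation LIKE '%A%'
matches_upper_A = [s for s in abbreviations if "A" in s]

# Case-insensitive equivalent of: lower(abbreviation) LIKE '%a%'
matches_any_a = [s for s in abbreviations if "a" in s.lower()]

print(matches_lower_a, matches_upper_A, matches_any_a)
```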


-- 
Jov

blog: http://amutu.com/blog

Re: How to look at the metadata of the tables we have created.

2013-05-24 Thread Sai Sai
Is it possible to look at the metadata of the databases/tables/views we have created in Hive?
Is there something like sysobjects in Hive?
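There is no sysobjects table to query from HiveQL, but the standard metadata commands cover the same ground; a minimal sketch (the test database and states table are the ones from the earlier threads):

```sql
SHOW DATABASES;
SHOW TABLES IN test;
DESCRIBE FORMATTED test.states;  -- columns, HDFS location, SerDe, owner, table type
```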
Thanks
Sai

Re: java.lang.NoClassDefFoundError: com/jayway/jsonpath/PathUtil

2013-03-10 Thread Sai Sai
: org.apache.hadoop.hive.contrib.serde2.JsonSerde
    at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:264)
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:820)
    at org.apache.hadoop.hive.ql.exec.MapOperator.initObjectInspector(MapOperator.java:243)
    at org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:380)
    ... 23 more


FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched: 
Job 0: Map: 1   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec

Thanks
Sai



 From: Dean Wampler dean.wamp...@thinkbiganalytics.com
To: user@hive.apache.org; Sai Sai saigr...@yahoo.in 
Sent: Friday, 8 March 2013 5:22 AM
Subject: Re: java.lang.NoClassDefFoundError: com/jayway/jsonpath/PathUtil
 

Unfortunately, you also have to add the JSON jars to Hive's class path before it starts, e.g.,

env HADOOP_CLASSPATH=/path/to/lib/*.jar hive

Use the appropriate path to your lib directory.


On Fri, Mar 8, 2013 at 4:53 AM, Sai Sai saigr...@yahoo.in wrote:

I have added the jar files successfully like this:




hive (testdb)> ADD JAR lib/hive-json-serde-0.3.jar;
   Added lib/hive-json-serde-0.3.jar to class path
   Added resource: lib/hive-json-serde-0.3.jar

hive (testdb)> ADD JAR lib/json-path-0.5.4.jar;
   Added lib/json-path-0.5.4.jar to class path
   Added resource: lib/json-path-0.5.4.jar

hive (testdb)> ADD JAR lib/json-smart-1.0.6.3.jar;
   Added lib/json-smart-1.0.6.3.jar to class path
   Added resource: lib/json-smart-1.0.6.3.jar




After this I am getting this error:




CREATE EXTERNAL TABLE IF NOT EXISTS twitter (tweet_id BIGINT, created_at STRING, text STRING, user_id BIGINT, user_screen_name STRING, user_lang STRING) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.JsonSerde' WITH SERDEPROPERTIES ("tweet_id"="$.id", "created_at"="$.created_at", "text"="$.text", "user_id"="$.user.id", "user_screen_name"="$.user.screen_name", "user_lang"="$.user.lang") LOCATION '/home/satish/data/twitter/input';
java.lang.NoClassDefFoundError: com/jayway/jsonpath/PathUtil
    at org.apache.hadoop.hive.contrib.serde2.JsonSerde.initialize(Unknown Source)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:207)
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:266)
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:259)
    at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:585)
    at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:550)
    at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:3698)
    at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:253)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1336)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1122)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:935)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:755)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:613)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: java.lang.ClassNotFoundException: com.jayway.jsonpath.PathUtil
    at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
    ... 23 more
FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.DDLTask





Any help would be really appreciated.
Thanks
Sai



-- 
Dean Wampler, Ph.D.
thinkbiganalytics.com
+1-312-339-1330

Re: java.lang.NoClassDefFoundError: com/jayway/jsonpath/PathUtil

2013-03-10 Thread Sai Sai
Ramki/John,
Many thanks, that really helped. I have run the ADD JARs in the new session and it appears to be running. However, I was wondering about bypassing MR: why would we do it, and what is the use of it? Will appreciate any input.
Thanks
Sai





 From: Ramki Palle ramki.pa...@gmail.com
To: user@hive.apache.org; Sai Sai saigr...@yahoo.in 
Sent: Sunday, 10 March 2013 4:22 AM
Subject: Re: java.lang.NoClassDefFoundError: com/jayway/jsonpath/PathUtil
 

When you execute the following query,

hive> select * from twitter limit 5;

Hive runs it in local mode and does not use MapReduce.

For the query,

hive> select tweet_id from twitter limit 5;

I think you need to add the JSON jars to overcome this error. You might have added these in a previous session. If you want these jars available for all sessions, insert the ADD JAR statements into your $HOME/.hiverc file.



To bypass MapReduce, use

set hive.exec.mode.local.auto = true;

to suggest that Hive use local mode to execute the query. If it still uses MR, try

set hive.fetch.task.conversion = more;


-Ramki.





On Sun, Mar 10, 2013 at 12:19 AM, Sai Sai saigr...@yahoo.in wrote:

Just wondering if anyone has any suggestions:

This executes successfully:

hive> select * from twitter limit 5;

This does not work (exception info given below):

hive> select tweet_id from twitter limit 5;

Here is the output of the first query:

hive> select * from twitter limit 5;
OK



tweet_id    created_at    text    user_id    user_screen_name    user_lang
122106088022745088    Fri Oct 07 00:28:54 +0000 2011    wkwkw -_- ayo saja mba RT @yullyunet: Sepupuuu, kita lanjalan yok.. Kita karokoe-an.. Ajak mas galih jg kalo dia mau.. @Dindnf: doremifas    124735434    Dindnf    en
122106088018558976    Fri Oct 07 00:28:54 +0000 2011    @egg486 특별히 준비했습니다!    252828803    CocaCola_Korea    ko
122106088026939392    Fri Oct 07 00:28:54 +0000 2011    My offer of free gobbies for all if @amityaffliction play Blair snitch project still stands.    168590073    SarahYoungBlood    en
122106088035328001    Fri Oct 07 00:28:54 +0000 2011    the girl nxt to me in the lib got her headphones in dancing and singing loud af like she the only one here haha    267296295    MONEYyDREAMS_    en
122106088005971968    Fri Oct 07 00:28:54 +0000 2011    @KUnYoong_B2UTY Bị lsao đấy    269182160    b2st_b2utyhp    en
Time taken: 0.154 seconds



This does not work:

hive> select tweet_id from twitter limit 5;





Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201303050432_0094, Tracking URL = http://ubuntu:50030/jobdetails.jsp?jobid=job_201303050432_0094
Kill Command = /home/satish/work/hadoop-1.0.4/libexec/../bin/hadoop job -kill job_201303050432_0094
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2013-03-10 00:14:44,509 Stage-1 map = 0%,  reduce = 0%
2013-03-10 00:15:14,613 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201303050432_0094 with errors
Error during job, obtaining debugging information...
Job Tracking URL: http://ubuntu:50030/jobdetails.jsp?jobid=job_201303050432_0094
Examining task ID: task_201303050432_0094_m_02 (and more) from job job_201303050432_0094

Task with the most failures(4):
-
Task ID:
  task_201303050432_0094_m_00

URL:
  http://ubuntu:50030/taskdetails.jsp?jobid=job_201303050432_0094&tipid=task_201303050432_0094_m_00
-
Diagnostic Messages for this Task:
java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:432)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:416)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64

Re: java.lang.NoClassDefFoundError: com/jayway/jsonpath/PathUtil

2013-03-10 Thread Sai Sai
Many thanks guys, you are really helpful. Really appreciate it.
Thanks
Sai





 From: bejoy...@yahoo.com bejoy...@yahoo.com
To: user@hive.apache.org; Sai Sai saigr...@yahoo.in 
Sent: Sunday, 10 March 2013 12:06 PM
Subject: Re: java.lang.NoClassDefFoundError: com/jayway/jsonpath/PathUtil
 

Hi Sai

Local mode is just for trials; for any pre-prod/production environment you need MR jobs.

Hive under the hood stores data in HDFS (mostly), and we definitely use Hadoop/Hive for larger data volumes, so MR should be in there to process them.

Regards 
Bejoy KS

Sent from remote device, Please excuse typos


From:  Ramki Palle ramki.pa...@gmail.com 
Date: Sun, 10 Mar 2013 06:58:57 -0700
To: user@hive.apache.org; Sai Saisaigr...@yahoo.in
ReplyTo:  user@hive.apache.org 
Subject: Re: java.lang.NoClassDefFoundError: com/jayway/jsonpath/PathUtil

Well, you get the results faster.


Please check this:

https://cwiki.apache.org/Hive/gettingstarted.html#GettingStarted-Runtimeconfiguration
 

Under the section "Hive, Map-Reduce and Local-Mode", it says:

"This can be very useful to run queries over small data sets - in such cases local mode execution is usually significantly faster than submitting jobs to a large cluster."


-Ramki.









Re: java.lang.NoClassDefFoundError: com/jayway/jsonpath/PathUtil

2013-03-08 Thread Sai Sai
I have added the jar files successfully like this:


hive (testdb)> ADD JAR lib/hive-json-serde-0.3.jar;
   Added lib/hive-json-serde-0.3.jar to class path
   Added resource: lib/hive-json-serde-0.3.jar

hive (testdb)> ADD JAR lib/json-path-0.5.4.jar;
   Added lib/json-path-0.5.4.jar to class path
   Added resource: lib/json-path-0.5.4.jar

hive (testdb)> ADD JAR lib/json-smart-1.0.6.3.jar;
   Added lib/json-smart-1.0.6.3.jar to class path
   Added resource: lib/json-smart-1.0.6.3.jar


After this I am getting this error:



CREATE EXTERNAL TABLE IF NOT EXISTS twitter (tweet_id BIGINT, created_at STRING, text STRING, user_id BIGINT, user_screen_name STRING, user_lang STRING) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.JsonSerde' WITH SERDEPROPERTIES ("tweet_id"="$.id", "created_at"="$.created_at", "text"="$.text", "user_id"="$.user.id", "user_screen_name"="$.user.screen_name", "user_lang"="$.user.lang") LOCATION '/home/satish/data/twitter/input';
java.lang.NoClassDefFoundError: com/jayway/jsonpath/PathUtil
    at org.apache.hadoop.hive.contrib.serde2.JsonSerde.initialize(Unknown Source)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:207)
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:266)
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:259)
    at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:585)
    at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:550)
    at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:3698)
    at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:253)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1336)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1122)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:935)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:755)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:613)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: java.lang.ClassNotFoundException: com.jayway.jsonpath.PathUtil
    at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
    ... 23 more
FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.DDLTask



Any help would be really appreciated.
Thanks
Sai
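
For readers hitting the same trace: the root cause is the last line, ClassNotFoundException: com.jayway.jsonpath.PathUtil, i.e. the json-path library that this third-party JsonSerde depends on is not on Hive's classpath. A minimal sketch of the usual fix, with hypothetical jar paths and versions (use whatever jars your SerDe build actually requires):

```sql
-- Jar names and paths below are illustrative assumptions, not the exact fix.
ADD JAR /home/satish/jars/hive-json-serde-0.2.jar;
ADD JAR /home/satish/jars/json-path-0.5.4.jar;
-- Re-run the CREATE EXTERNAL TABLE statement after adding the jars.
```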


Re: Accessing sub column in hive

2013-03-07 Thread Sai Sai
I have a table created like this successfully:

CREATE TABLE IF NOT EXISTS employees (name STRING, salary FLOAT, subordinates 
ARRAY&lt;STRING&gt;, deductions MAP&lt;STRING, FLOAT&gt;, address STRUCT&lt;street:STRING, 
city:STRING, state:STRING, zip:INT, country:STRING&gt;)

I would like to access/display country column from my address struct.
I have tried this:

select address[country] from employees;

I get an error.

Please help.

Thanks
Sai
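
A note on syntax, since the answer never made it into this thread: Hive uses bracket indexing only for ARRAY and MAP columns; STRUCT fields are addressed with dot notation. Assuming the employees table above, something along these lines should work:

```sql
-- Dot notation selects a field out of the address STRUCT.
SELECT address.country FROM employees;
```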


Re: Partition not displaying in the browser

2013-03-07 Thread Sai Sai
I get this output for:

hive> show partitions dividends;
OK
partition
exchange=NASDAQ/symbol=AAPL
exchange=NASDAQ/symbol=INTC
Time taken: 0.133 seconds

But when I navigate to the folder in my browser, the INTC partition is not displayed 
even after refreshing it a bunch of times. Any suggestions will be appreciated:


*
Contents of directory /home/satish/data/dividends/input/plain-text/NASDAQ:

Name  Type  Modification Time  Permission  Owner   Group
AAPL  dir   2013-03-07 08:46   rwxr-xr-x   satish  supergroup
Any suggestions will be appreciated.
Thanks
Sai


Re: Partition not displaying in the browser

2013-03-07 Thread Sai Sai
Many thanks for your help Venkatesh.

I have verified the partition exists, and the data displays successfully when I 
execute the select in the console.

But it does not appear in the web browser.

I have verified the path multiple times; it is given below.

Here is the first partition I created, which I can view successfully in both the 
console and the web browser:

ALTER TABLE dividends ADD PARTITION(exchange = 'NASDAQ', symbol = 'AAPL') 
LOCATION '/home/satish/data/dividends/input/plain-text/NASDAQ/AAPL';

LOAD DATA LOCAL INPATH 
'/home/satish/data/dividends/input/plain-text/NASDAQ/AAPL/dividends.csv' INTO 
TABLE dividends Partition(exchange='NASDAQ',symbol='AAPL');



Here is the one i can view only in the console but not in the browser:

ALTER TABLE dividends ADD PARTITION(exchange = 'NASDAQ', symbol = 'INTC') 
LOCATION '/home/satish/data/dividends/input/plain-text/NASDAQ/INTC';

LOAD DATA LOCAL INPATH 
'/home/satish/data/dividends/input/plain-text/NASDAQ/INTC/dividends.csv' INTO 
TABLE dividends Partition(exchange='NASDAQ',symbol='INTC');


When I run the command:
select * from dividends where exchange='NASDAQ' and symbol='INTC';
I successfully see the data.
I am wondering if it is possible to bounce/restart the server with a command, 
or whether it is possible to look into the Hive metadata directly using a command.
Any help is appreciated.
Thanks
Sai




 From: Venkatesh Kavuluri vkavul...@outlook.com
To: user@hive.apache.org user@hive.apache.org 
Sent: Thursday, 7 March 2013 1:44 PM
Subject: RE: Partition not displaying in the browser
 

 
The partitions info you see on 'show partitions' is fetched from Hive metadata 
tables. The reason you are not seeing the path you are expecting might be 
either 
1) the path got deleted after the data load (do a simple select and verify you 
see some data) or
2) you have loaded the data from some other path to this partition  

-
Venkatesh
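
On Sai's follow-up question above about looking into the Hive metadata directly: one way, sketched here from the standard DDL commands of that era, is to describe the partition itself, which prints its recorded Location among other details:

```sql
-- Shows the partition's storage location as recorded in the metastore.
DESCRIBE FORMATTED dividends PARTITION (exchange='NASDAQ', symbol='INTC');
```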





Re: Find current db we r using in Hive

2013-03-07 Thread Sai Sai
Just wondering if there is any command in Hive which will show us the current 
db we are using, similar to pwd in Unix.
Thanks
Sai
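
There is a CLI setting for exactly this; a minimal sketch (the property has been in Hive since roughly the 0.8 line, so check your version):

```sql
-- Makes the CLI prompt include the current database, e.g. hive (default)>
SET hive.cli.print.current.db=true;

-- Databases can also be listed and switched explicitly:
SHOW DATABASES;
USE default;
```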

Re: Read map value from a table

2013-03-06 Thread Sai Sai
Here is my data in a file which I have successfully loaded into a table named test, 
and the following returns data successfully:

Select * from test;

Name     ph     category

Name1    ph1    {"type":1000,"color":200,"shape":610}
Name2    ph2    {"type":2000,"color":200,"shape":150}
Name3    ph3    {"type":3000,"color":700,"shape":167}

But when i execute this query:

select category[type] from test;

I get null values;

Please help.
Thanks
Sai
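
One likely explanation for the NULLs: in category[type] the unquoted word type is not treated as the string key of the map, so the lookup never matches. Map keys in Hive are ordinary expressions, and string keys must be quoted. Assuming category really is a MAP with string keys, a sketch:

```sql
-- Quote the map key so it is a string literal, not a column reference.
SELECT category['type'] FROM test;
```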


Re: Where is the location of hive queries

2013-03-06 Thread Sai Sai
After we run a query in the Hive shell, such as:
Select * from myTable;

are these results saved to any file apart from the console/terminal display?
If so, where is the location of the results?
Thanks
Sai
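
To the question itself: the CLI only streams result rows to the terminal (intermediate output lives in a scratch directory and is cleaned up afterwards), so persisting results has to be requested explicitly. A sketch with assumed output paths:

```sql
-- Write the query results into a local directory as delimited files.
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/mytable_out'
SELECT * FROM myTable;
```

Alternatively, from the shell, hive -e 'SELECT * FROM myTable;' > results.txt captures the rows printed to stdout.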


Re: SemanticException Line 1:17 issue

2013-03-05 Thread Sai Sai
Hello

I have been stuck on this issue for quite some time and was wondering if anyone 
sees any problem with it that I am not seeing.

I have verified the file exists there and have also manually pasted the file 
into the tmp folder, but I am still running into the same issue.

I am also wondering which folder this maps to in my local drive:
hdfs://ubuntu:9000/

***


hive> LOAD DATA INPATH '/tmp/o_small.tsv' OVERWRITE INTO TABLE odata;
FAILED: SemanticException Line 1:17 Invalid path ''/tmp/o_small.tsv'': No files 
matching path hdfs://ubuntu:9000/tmp/o_small.tsv

***
I have verified the file exists here and have also manually pasted the file 
here, but I am still running into the same issue.
Please let me know if you have any suggestions; it will be really appreciated.
Thanks
Sai


Re: SemanticException Line 1:17 issue

2013-03-05 Thread Sai Sai
Yes Nitin it exists... but still getting the same issue.





 From: Nitin Pawar nitinpawar...@gmail.com
To: user@hive.apache.org; Sai Sai saigr...@yahoo.in 
Sent: Tuesday, 5 March 2013 4:14 AM
Subject: Re: SemanticException Line 1:17 issue
 

this file /tmp/o_small.tsv is on your local filesystem or hdfs? 






-- 
Nitin Pawar

Re: SemanticException Line 1:17 issue

2013-03-05 Thread Sai Sai
Thanks for your help Nitin, here is what it displays:

satish@ubuntu:~/work/hadoop-1.0.4/bin$ $HADOOP_HOME/bin/hadoop dfs -ls /tmp/


Warning: $HADOOP_HOME is deprecated.
Found 3 items

drwxr-xr-x   - satish supergroup  0 2013-03-05 04:12 /tmp/hive-satish
-rw-r--r--   1 satish supergroup    654 2013-03-04 02:41 /tmp/states.txt
drwxr-xr-x   - satish supergroup  0 2013-02-16 00:46 
/tmp/temp-1850940621

**
I have done a search for the file states.txt and it is referenced in 3 places, 
2 of them referring to
/proc/2693/cwd,

but none of them refer to the tmp folder.

Please let me know if you have any other suggestions.
In the meantime i will try with the [LOCAL] file and let you know.

Thanks
Sai




 From: Nitin Pawar nitinpawar...@gmail.com
To: user@hive.apache.org; Sai Sai saigr...@yahoo.in 
Sent: Tuesday, 5 March 2013 4:24 AM
Subject: Re: SemanticException Line 1:17 issue
 

It exists, but where? On your HDFS or your local Linux filesystem? If you are 
checking the file with ls -l /tmp/ (i.e. the local filesystem), then add the keyword LOCAL.

Also, can you provide the output of $HADOOP_HOME/bin/hadoop dfs -ls /tmp/ 


LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename
If the keyword LOCAL is specified, then:
* the load command will look for filepath in the local file system. If 
a relative path is specified, it will be interpreted relative to the current 
directory of the user.



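
The distinction Nitin is drawing can be sketched as two separate flows, reusing the path from this thread:

```sql
-- File on the local Linux filesystem: Hive COPIES it into the table's location.
LOAD DATA LOCAL INPATH '/tmp/o_small.tsv' OVERWRITE INTO TABLE odata;

-- File already on HDFS (e.g. uploaded with `hadoop fs -put`): no LOCAL keyword,
-- and Hive MOVES the file from /tmp on HDFS into the table's location.
LOAD DATA INPATH '/tmp/o_small.tsv' OVERWRITE INTO TABLE odata;
```

The error in this thread means the second form was used while the file only existed on the local filesystem, not at hdfs://ubuntu:9000/tmp/.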

Re: Done SemanticException Line 1:17 issue

2013-03-05 Thread Sai Sai


Thanks for your help Nitin.
I have restarted my VM and tried again and it appears to work.

Thanks again.
Sai




Re: Location of external table in hdfs

2013-03-05 Thread Sai Sai
I have created an external table as below and am wondering in which folder in 
HDFS I can find it:

CREATE EXTERNAL TABLE states(abbreviation string, full_name string) ROW FORMAT 
DELIMITED FIELDS TERMINATED BY '\t' LOCATION '/tmp/states' ;

Any help is really appreciated.

Thanks
Sai
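
Since the DDL gives an explicit LOCATION, the table's data lives at exactly that HDFS path and can be listed with the same dfs command used elsewhere in these threads:

```shell
# List the external table's files at the LOCATION named in the DDL.
$HADOOP_HOME/bin/hadoop dfs -ls /tmp/states
```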


Re: Location of external table in hdfs

2013-03-05 Thread Sai Sai
Thanks, I figured this is in /tmp/states.
Thanks for your attention.






hive newb questions

2013-03-04 Thread Sai Sai
Hi
I was wondering if it is right to assume the following:

1. The first time we create a table in Hive, load it, and then run a first query like 

Select * from Table1

an MR job will run and get the data to us.

If we run the same query a second time, the MR job will not run; the data will 
just be fetched.

2. If the above assumption is not right, is it possible to cache the data in Hive 
so the MR job will not run again for subsequent queries and the data is fetched right away?

3. Once we load the data into a Hive table, how many days should we keep it?

4. Is it good practice to remove the data after a certain period of time, as it 
may take up a large amount of space?

5. Should this really be a concern or not, given that storage today is not that 
expensive?

Any inputs will be appreciated.
Thanks
Sai

Re: hive columns display

2013-03-04 Thread Sai Sai
When we run a query in Hive like:

Select * from myTable limit 10;

we get the results successfully, but the column names are not displayed.
Is it possible to display the column names as well, so the data and the columns can 
be related right away without running a describe on the table?

Thanks,
Sai
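
There is a CLI setting for this as well; a minimal sketch:

```sql
-- Print column headers above query results in the Hive CLI.
SET hive.cli.print.header=true;
SELECT * FROM myTable LIMIT 10;
```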


Re: hive to excel

2013-03-04 Thread Sai Sai
Just wondering how to save the data of a query to an Excel file.
For example, after running the query:

Select * from myTable;
we would like to save the query results to an .xls file.
Any help is appreciated.
Thanks
Sai
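
One low-tech route, with assumed filenames: the CLI prints tab-separated rows, so redirecting them to a file gives something Excel can open as tab-delimited text (it is not a native .xls workbook):

```shell
# Capture tab-separated query output; open the .tsv in Excel as delimited text.
hive -e 'SELECT * FROM myTable;' > myTable.tsv
```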

Re: hive light weight reporting tool

2013-03-04 Thread Sai Sai
Just wondering if there is any lightweight reporting tool for Hive/Hadoop 
which we can use for quick POCs.
Thanks
Sai


Re: hive light weight reporting tool

2013-03-04 Thread Sai Sai
Thanks again Jagat. Just wanted to get a second opinion about my Excel question.
Thanks again for the input.
Sai.





 From: Jagat Singh jagatsi...@gmail.com
To: user@hive.apache.org; Sai Sai saigr...@yahoo.in 
Sent: Monday, 4 March 2013 1:01 AM
Subject: Re: hive light weight reporting tool
 

Hi,


There are many reporting tools which can read from the Hive server.

All you need is to start the Hive server and then point the tool at it.

Pentaho, Talend, and iReport are a few.

Just search over here.

Thanks.

Jagat Singh


On Mon, Mar 4, 2013 at 7:58 PM, Sai Sai saigr...@yahoo.in wrote:

Just wondering if there is any light weight reporting tool with hive/hadoop 
which we can use for quick POCs.
ThanksSai


Re: hive light weight reporting tool

2013-03-04 Thread Sai Sai
Thanks Jagat.





 From: Jagat Singh jagatsi...@gmail.com
To: user@hive.apache.org; Sai Sai saigr...@yahoo.in 
Sent: Monday, 4 March 2013 1:23 AM
Subject: Re: hive light weight reporting tool
 

Yes just wait for sometime.

We have awesome people here; they will suggest wonderful solutions to you.







Re: hive commands from a file

2013-03-04 Thread Sai Sai
Just wondering if it is possible to run a bunch of Hive commands from a file 
rather than one at a time.
For ex:
1. Create external...
2. Load ...
3. Select * from ...
4

Thanks
Sai


Re: hive commands from a file

2013-03-04 Thread Sai Sai
Thanks Krishna/Nitin.






 From: Nitin Pawar nitinpawar...@gmail.com
To: user@hive.apache.org 
Sent: Monday, 4 March 2013 2:28 AM
Subject: Re: hive commands from a file
 

Try hive -f filename
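
Nitin's hive -f suggestion in the thread above, fleshed out as a sketch (the script name is an assumption):

```shell
# script.hql would hold the statements from the question, e.g.:
#   CREATE EXTERNAL TABLE ...;
#   LOAD DATA ...;
#   SELECT * FROM ...;
hive -f script.hql

# One-off statements can also be run inline without entering the shell:
hive -e 'SELECT * FROM myTable LIMIT 10;'
```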