why using mapreduce python scripts print more NULLs

2013-03-26 Thread 周梦想
Hive version: 0.10.0. Query: from testpoker select transform(ldate,ltime,threadid,gameid,userid,pid,roundbet,fold,allin,cardtype,cards,chipwon) using 'calcpoker.py' as ldate,gameid,userid,pid,win,fold,allin,cardtype,cards; Output: 03/13/13 1009 185690475 8639 0 1 0 -1 NULL NULLNULLNULL NULL

S3/EMR Hive: Load contents of a single file

2013-03-26 Thread Tony Burton
Hi list, I've been using Hive to perform queries on data hosted on AWS S3, and my tables point at data by specifying the directory in which the data is stored, e.g. $ create external table myData (str1 string, str2 string, count1 int) partitioned by snip row format snip stored as textfile

Re: S3/EMR Hive: Load contents of a single file

2013-03-26 Thread Sanjay Subramanian
Hi Tony, can you create the table without any location? After that you could do an ALTER TABLE to add the location and partition: ALTER TABLE myData ADD PARTITION (partitionColumn1='$value1' , partitionColumn2='$value2') LOCATION '/path/to/your/directory/in/hdfs'; An example Without Partitions

RE: S3/EMR Hive: Load contents of a single file

2013-03-26 Thread Tony Burton
Thanks for the quick reply Sanjay. ALTER TABLE is the key, but slightly different to your suggestion. I create the table as before, but don't specify location: $ create external table myData (str1 string, str2 string, count1 int) partitioned by snip row format snip stored as textfile; Then

Re: S3/EMR Hive: Load contents of a single file

2013-03-26 Thread Ramki Palle
First of all, you cannot point a table to a file. Each table will have a corresponding directory. If you want all the data in the table to be contained in only one file, simply copy that one file into the directory. The table does not need to know the name of the file. It only matters whether the

Re: HDFS directory in /user/hive/warehouse getting hive as Owner ?

2013-03-26 Thread Sanjay Subramanian
Hi, I added the following to hive-site.xml:
<property>
  <name>hive.security.authorization.enabled</name>
  <value>true</value>
  <description>enable or disable the hive client authorization</description>
</property>
I did not add hive.security.authorization.manager because I am currently using

Re: why using mapreduce python scripts print more NULLs

2013-03-26 Thread Abdelrhman Shettia
Hi Andy, can you view the data from the table with hadoop fs -text $tabledir/$filename? The data may be corrupted, or the field delimiter may be mixed in with the data used in the transform script. Thanks. On Mar 26, 2013, at 2:54 AM, 周梦想 abloz...@gmail.com wrote: testpoker
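One way to act on this suggestion, once the file contents have been dumped to text, is to count how many fields each line splits into. The sketch below is only an illustration: the function name `field_count_histogram`, the tab delimiter, and the sample rows are assumptions, not something taken from the thread. If most lines split into the expected number of fields and a few do not, the delimiter is probably appearing inside the data.

```python
from collections import Counter


def field_count_histogram(lines, delimiter="\t"):
    """Return a Counter mapping 'number of fields' to 'number of lines'."""
    counts = Counter()
    for line in lines:
        # Strip only the trailing newline so empty trailing fields are kept.
        counts[len(line.rstrip("\n").split(delimiter))] += 1
    return counts


# Hypothetical sample rows (not from the thread's table): the third row
# contains a space where a tab was expected, so it splits into fewer fields.
sample = ["a\tb\tc\n", "a\tb\tc\n", "a b\tc\n"]
print(field_count_histogram(sample))
```

In practice the lines would come from the text dump of the table file rather than from a hard-coded list.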

Re: S3/EMR Hive: Load contents of a single file

2013-03-26 Thread Keith Wiley
Are you sure this is doing what you think it's doing? Since Hive associates tables with directories (well external tables at least, I'm not very familiar with internal tables), my suspicion is that even if your approach described below works, what Hive actually did was use

Hive CLI works fine for ALTER TABLE but get HiveServerException using ThriftHive.Client

2013-03-26 Thread Sanjay Subramanian
Hive-site.xml setting - hive.security.authorization.enabled = true Script -- ALTER TABLE myTable ADD PARTITION (partition1='some_value1' , partition2='some_value2') LOCATION '/path/to/directory/on/hdfs/containing/data' I can execute this script using Hive CLI but

Re: HDFS directory in /user/hive/warehouse getting hive as Owner ?

2013-03-26 Thread Sanjay Subramanian
OK, I solved this. The default setting of hive.metastore.execute.setugi in Hive is FALSE. Adding this to hive-site.xml solved it:
<property>
  <name>hive.metastore.execute.setugi</name>
  <value>true</value>
  <description>In unsecure mode, setting this property to true will cause the metastore to execute DFS

Re: why using mapreduce python scripts print more NULLs

2013-03-26 Thread 周梦想
Thank you, Abdelrhman! I noticed that if the Python script's output delimiter is ' ', then extra NULLs are printed to fill the (a,b,c...) fields. If I change the ' '.join() to '\t'.join(), it works, so the select clause expects output fields delimited by '\t'. Best Regards, Andy Zhou 2013/3/27
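The effect described here can be sketched in plain Python. This is a simulation under stated assumptions, not Hive itself: `split_like_reader` is a hypothetical helper that splits a script output line on tabs and pads missing columns with None (standing in for NULL), which matches the behaviour reported in this thread. The sample values are the first three fields of the output row quoted in the original question.

```python
def split_like_reader(output_line, expected_columns):
    """Split one script output line on tabs and pad missing columns.

    Hypothetical stand-in for how a tab-delimited reader sees the line;
    None plays the role of NULL.
    """
    fields = output_line.rstrip("\n").split("\t")
    fields += [None] * (expected_columns - len(fields))
    return fields[:expected_columns]


values = ["03/13/13", "1009", "185690475"]

# Joined with spaces: the whole row lands in the first column and the
# remaining columns are padded.
space_joined = " ".join(values)
print(split_like_reader(space_joined, 3))

# Joined with tabs: each value lands in its own column.
tab_joined = "\t".join(values)
print(split_like_reader(tab_joined, 3))
```

The same idea applies inside the transform script: build each output row with '\t'.join(...) so that every value ends up in its own column.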

how to make data statistics efficiency in hive?

2013-03-26 Thread 周梦想
Hello, about hsql statistics. Table mytable:
date,uid,a,b,c
03/13/13 185690475 0 1 1
03/13/13 187270278 0 1 0
03/13/13 185690475 1 1 0
03/13/13 186012530 1 0 1
03/13/13 180286243