RE: NULL values and != operations
Look here: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-RelationalOperators

If one of the sides of != is NULL, the result is NULL (not true, but not false either).

From: Blaine Elliott [mailto:bla...@chegg.com]
Sent: Wednesday, February 05, 2014 8:50 PM
To: user@hive.apache.org
Subject: NULL values and != operations

I have come across a strange situation in Hive and I want to know if there is an explanation. The CASE expression below does not work when the operator is != but does work when the operator is =. Maybe it is true that an = operation is valid if a value is NULL, but an != operation is invalid if a value is NULL. That seems bizarre. Is this a bug, or can this be explained? I am using Amazon EMR with Hadoop v1.0.3 and Hive v0.11.0.

-- the following SQL results are expected such that the last column is 1 or 0
SELECT user_name
     , val0
     , val1
     , CASE WHEN val0 = val1 THEN 1 ELSE 0 END
FROM (
    SELECT user_name
         , MIN(STR_TO_MAP(kvp, ',', '=')['val0']) AS val0
         , MIN(STR_TO_MAP(kvp, ',', '=')['val1']) AS val1
    FROM stgdb.fact_webrequest
    GROUP BY user_name
) x;

user0  42.01    42.01    1
user1  NULL     14.1301  0
user2  NULL     15.03    0
user3  NULL     43.01    0
user4  NULL     40.05    0
user5  NULL     13.1305  0
user6  51.0913  51.0913  1
user7  NULL     11.0701  0
user8  NULL     52.02    0

-- the following SQL results are strange such that the last column is always 0
SELECT user_name
     , val0
     , val1
     , CASE WHEN val0 != val1 THEN 1 ELSE 0 END
FROM (
    SELECT user_name
         , MIN(STR_TO_MAP(kvp, ',', '=')['val0']) AS val0
         , MIN(STR_TO_MAP(kvp, ',', '=')['val1']) AS val1
    FROM stgdb.fact_webrequest
    GROUP BY user_name
) x;

user0  42.01    42.01    0
user1  NULL     14.1301  0
user2  NULL     15.03    0
user3  NULL     43.01    0
user4  NULL     40.05    0
user5  NULL     13.1305  0
user6  51.0913  51.0913  0
user7  NULL     11.0701  0
user8  NULL     52.02    0

Blaine Elliott
Chegg | Senior Data Engineer
805 637 4556 | bla...@chegg.com
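The behavior above follows from SQL three-valued logic: any comparison where one side is NULL (including !=) evaluates to NULL, and CASE treats a NULL condition as not satisfied, so those rows always fall through to the ELSE branch. One way to force a definite true/false is Hive's NULL-safe equality operator <=> (available as of Hive 0.9.0, so usable on the poster's 0.11.0); a sketch against the thread's query:

```sql
-- val0 != val1 is NULL whenever either side is NULL, and CASE treats a
-- NULL condition as "not true", so those rows take the ELSE 0 branch.
-- NOT (val0 <=> val1) is NULL-safe: it is true when the values differ
-- (counting NULL vs non-NULL as different) and false when they match.
SELECT user_name
     , val0
     , val1
     , CASE WHEN NOT (val0 <=> val1) THEN 1 ELSE 0 END
FROM (
    SELECT user_name
         , MIN(STR_TO_MAP(kvp, ',', '=')['val0']) AS val0
         , MIN(STR_TO_MAP(kvp, ',', '=')['val1']) AS val1
    FROM stgdb.fact_webrequest
    GROUP BY user_name
) x;
```

An equivalent spelling without <=> is to handle the NULL case explicitly, e.g. CASE WHEN val0 IS NULL OR val0 != val1 THEN 1 ELSE 0 END for this data.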
Finding Hive and Hadoop version from command line
All, Is there any way from the command prompt I can find which Hive version I am using, and the Hadoop version too? Thanks in advance. Regards, Raj
Re: Finding Hive and Hadoop version from command line
I like:

hive version:   $ schematool -info -dbType <your metastore dbtype>
hadoop version: $ hadoop version

Lefty plugged schematool not long ago, so props to her on that one. Under the covers you'll see that it's a shortcut for hive --service schemaTool, which is the official way to run it. Either way will do it.

On Sun, Feb 9, 2014 at 8:32 AM, Raj Hadoop hadoop...@yahoo.com wrote: All, Is there any way from the command prompt I can find which Hive version I am using and the Hadoop version too? Thanks in advance. Regards, Raj
How to use hive api monitor hive job
Hi all, How can I use the Hive API to monitor running Hive jobs? yankunhad...@gmail.com
Add few record(s) to a Hive table or a HDFS file on a daily basis
Hi, My requirement is a typical data warehouse and ETL requirement. I need to accomplish: 1) Daily insert transaction records to a Hive table or an HDFS file. This table or file is not a big table (approximately 10 records per day). I don't want to partition the table / file. I am reading a few articles on this. It was mentioned that we need to load to a staging table in Hive, and then insert like the below:

insert overwrite table finaltable select * from staging;

I am not getting this logic. How should I populate the staging table daily? Thanks, Raj
Re: Add few record(s) to a Hive table or a HDFS file on a daily basis
Why not INSERT INTO for appending new records?

a) load the new records into a staging table
b) INSERT INTO the final table from the staging table

On 10-Feb-2014 8:16 am, Raj Hadoop hadoop...@yahoo.com wrote: Hi, My requirement is a typical data warehouse and ETL requirement. I need to accomplish 1) Daily insert transaction records to a Hive table or an HDFS file. This table or file is not a big table (approximately 10 records per day). I don't want to partition the table / file. I am reading a few articles on this. It was mentioned that we need to load to a staging table in Hive, and then insert like the below: insert overwrite table finaltable select * from staging; I am not getting this logic. How should I populate the staging table daily. Thanks, Raj
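The two steps above can be sketched in HiveQL. The table names staging and finaltable come from the thread; the input file path is a hypothetical example:

```sql
-- a) (re)load today's records into the staging table, replacing
--    yesterday's staging contents
LOAD DATA LOCAL INPATH '/tmp/daily_records.txt' OVERWRITE INTO TABLE staging;

-- b) append the staged rows to the final table; INSERT INTO appends,
--    whereas INSERT OVERWRITE would replace everything already there
INSERT INTO TABLE finaltable
SELECT * FROM staging;
```

INSERT INTO is available as of Hive 0.8.0, so a daily append does not need the read-everything-and-overwrite pattern the articles describe.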