Hello Sunita,

If you are using JSON, try adding the jar 'hive-json-serde.jar' before you load your data into the final table. Also try declaring your date attributes as String first, to check whether they are the cause.
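
To rule out the data itself, here is a minimal Python sketch. It assumes the generating script is Python (the strip() call in the quoted message suggests so) and that the table is read as text with one JSON object per line. The function names and the load_* field names are hypothetical, chosen only for illustration. It adds the date attributes as plain strings, writes each record as a single compact line, and reports any line that is not a complete JSON object on its own.

```python
import json
from datetime import datetime


def add_load_fields(record, now=None):
    """Return a copy of record with load-time attributes added as strings.

    The field names (load_year, ...) are hypothetical placeholders.
    """
    now = now or datetime.now()
    enriched = dict(record)
    enriched["load_year"] = now.strftime("%Y")
    enriched["load_month"] = now.strftime("%m")
    enriched["load_day"] = now.strftime("%d")
    enriched["load_time"] = now.strftime("%H:%M:%S")
    return enriched


def to_line(record):
    """Serialize a record as one compact line.

    json.dumps escapes a newline inside a string value as the two
    characters backslash and n, so the output never spans several lines.
    """
    return json.dumps(record)


def bad_lines(text):
    """Return the 1-based numbers of non-empty lines that are not a
    complete JSON object on their own."""
    bad = []
    for number, line in enumerate(text.split("\n"), start=1):
        if not line.strip():
            continue
        try:
            parsed = json.loads(line)
        except ValueError:
            bad.append(number)
            continue
        if not isinstance(parsed, dict):
            bad.append(number)
    return bad
```

If bad_lines() reports anything for a file you uploaded, for example because the file was pretty-printed with indentation or a raw newline was written inside a value, the records are being split across lines before the SerDe sees them. Under the one-object-per-line assumption above, that alone could explain rows that come back as null without any parsing error.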
I don't know whether you are using an external table with regular expressions (regexp) to parse your data. If so, can you send us the table definition and the structure of one row of your data?

The last thing I can suggest is to run a MapReduce operation over the table (select count(1) from your_table) and then check the JobTracker log to debug the issue.

Hope this helps ;)

2013/7/30 Sunita Arvind <sunitarv...@gmail.com>

> Hi,
>
> I have written a script which generates JSON files, loads it into a
> dictionary, adds a few attributes and uploads the modified files to HDFS.
> After the files are generated, if I perform a select * from..; on the table
> which points to this location, I get "null, null...." as the result. I also
> tried without the added attributes and it did not make a difference. I
> strongly suspect the data.
> Currently I am using strip() to eliminate trailing and leading whitespaces
> and newlines. Wondering if embedded "\n" that is, json string objects
> containing "\n" in the value, causes such issues.
> There are no parsing errors, so I am not able to debug this issue. Are
> there any flags that I can set to figure out what is happening within the
> parser code?
>
> I set this:
> hive -hiveconf hive.root.logger=DEBUG,console
>
> But the output is not really useful:
>
> blocks=[LocatedBlock{BP-330966259-192.168.1.61-1351349834344:blk_-6076570611719758877_116734;
> getBlockSize()=20635; corrupt=false; offset=0; locs=[192.168.1.61:50010,
> 192.168.1.66:50010, 192.168.1.63:50010]}]
>
> lastLocatedBlock=LocatedBlock{BP-330966259-192.168.1.61-1351349834344:blk_-6076570611719758877_116734;
> getBlockSize()=20635; corrupt=false; offset=0; locs=[192.168.1.61:50010,
> 192.168.1.66:50010, 192.168.1.63:50010]}
> isLastBlockComplete=true}
> 13/07/30 11:49:41 DEBUG hdfs.DFSClient: Connecting to datanode
> 192.168.1.61:50010
> null
> null
> null
> null
> null
> null
> null
> null
> null
> null
> null
> null
> null
> null
> null
> null
> 13/07/30 11:49:41 INFO exec.
>
> Also, the attributes I am adding are current year, month day and time. So
> they are not null for any record. I even moved existing files which did not
> have these fields set so that there are no records with these fields as
> null. However, I dont think this is an issue as the advantage of JSON/Hive
> JSON serde is that it allows object struct to be dynamic. Right?
>
> Any suggestion regarding debugging would be very helpful.
>
> thanks
> Sunita
>