Hi,

I'm trying to create a custom input format that works with Hive 0.7.0 and
Hadoop 0.20.205 (the current Amazon EMR setup).
Attached are the dummy input format, record reader, and input split I created.

Below are the steps I'm performing to try to make it work (without success),
so if I'm missing something, please let me know:

1. Start an EMR job with Hive in interactive mode (only one node, the master)
and ssh to it after it starts

2. Copy the jar with the code from the attachments into the /home/hadoop/lib directory

3. Run Hive in interactive mode (the hive command)

4. Create the Hive table:

create table dummy (idx string)
stored as inputformat 'test.DummyInputFormat'
outputformat 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat';

5. select * from dummy;

Nothing happens (zero results).
I'm not getting any errors, so it's hard to say what is going on.

I know that Hive can see my jar, because if I use an input format name that
doesn't exist in my jar, I get an error message.
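
One way to double-check this outside of Hive might be a small reflection test run with the same jars on the classpath. The following is only a sketch: InputFormatCheck is a made-up helper name (not part of Hive or Hadoop), and the idea that the class should implement the old-API interface org.apache.hadoop.mapred.InputFormat is an assumption, not something confirmed in this thread.

```java
// Hypothetical helper, not part of Hive or Hadoop.
final class InputFormatCheck {

    /**
     * Reports whether className can be loaded and whether it implements
     * (or extends) the type named by ifaceName.
     */
    static String describe(String className, String ifaceName) {
        try {
            Class<?> cls = Class.forName(className);
            Class<?> iface = Class.forName(ifaceName);
            if (iface.isAssignableFrom(cls)) {
                return "OK: " + className + " implements " + ifaceName;
            }
            return "LOADED: " + className + " does not implement " + ifaceName;
        } catch (ClassNotFoundException e) {
            // Either the class itself or the interface is not on the classpath.
            return "NOT FOUND: " + e.getMessage();
        } catch (LinkageError e) {
            // The class was found, but something it depends on is missing or incompatible.
            return "LINK ERROR: " + e;
        }
    }
}
```

On the master node it could be called as InputFormatCheck.describe("test.DummyInputFormat", "org.apache.hadoop.mapred.InputFormat"), with the custom jar and the Hadoop jars on the classpath.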

I even tried to create some files on S3 from inside the record reader and
input format (to see whether any method is actually called by Hive), but no
files were created.
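
A simpler variant of the same tracing idea is to append to a local file on the node instead of writing to S3, which removes S3 access as a possible reason for the missing files. This is a minimal sketch: CallTrace and the log file name are hypothetical, and the calls would have to be added by hand to the attached classes (for example at the top of getSplits and getRecordReader, assuming those are the methods being implemented).

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

// Hypothetical tracing helper; uses only java.io so it should compile on older JDKs too.
final class CallTrace {

    // Trace file in the JVM's temp directory (the file name is made up for this sketch).
    static final File LOG =
            new File(System.getProperty("java.io.tmpdir"), "dummy-inputformat-trace.log");

    /** Appends one line with a timestamp, the thread name, and the given label. */
    static synchronized void log(String where) {
        FileWriter out = null;
        try {
            out = new FileWriter(LOG, true); // true = append
            out.write(System.currentTimeMillis() + " "
                    + Thread.currentThread().getName() + " " + where + "\n");
        } catch (IOException e) {
            // Tracing must never break the query, so only report the failure.
            e.printStackTrace();
        } finally {
            if (out != null) {
                try {
                    out.close();
                } catch (IOException ignored) {
                    // Nothing useful can be done here.
                }
            }
        }
    }
}
```

A call such as CallTrace.log("DummyInputFormat.getSplits") as the first statement of each method would then show, after running the query, which methods (if any) were reached.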

Any help would be much appreciated.
Thanks,

Attachment: DummyInputSplit.java
Attachment: DummyRecordReader.java
Attachment: DummyInputFormat.java
