Hi.

I started experimenting with Hive today since it seems to suit us quite well.
We currently process our weblog stats with Hadoop and end up writing what is
essentially SQL in MapReduce form, so it seems fair to try out a system which
does that in one step :)

I've created the table and loaded data into Hive with the following statements:
hive> drop table DailyUniqueSiteVisitorSample;
OK
Time taken: 4.064 seconds
hive> CREATE TABLE DailyUniqueSiteVisitorSample (sampleDate date,uid
bigint,site int,concreteStatistics int,network smallint,category
smallint,country smallint,countryCode String,sessions
smallint,pageImpressions smallint) COMMENT 'This is our weblog stats table'
PARTITIONED BY(dt STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n' STORED AS TEXTFILE;
OK
Time taken: 0.248 seconds
hive> LOAD DATA LOCAL INPATH
'/tmp/data-DenormalizedSiteVisitor.VisitsPi.2009-03-02.csv' INTO TABLE
DailyUniqueSiteVisitorSample PARTITION(dt='2009-03-02');
Copying data from file:/tmp/data-2009-03-02.csv
Loading data to table dailyuniquesitevisitorsample partition {dt=2009-03-02}
OK
Time taken: 2.258 seconds

I'm a little confused about the TEXTFILE part. The tutorial only uses
SequenceFile(s), but since the CSV I need to load is a plain text file I
declared the table as STORED AS TEXTFILE, and it seems to work.
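To rule out the file itself as the problem, here is a minimal sketch of a check that could be run on the CSV before loading. It assumes Hive's delimited text format simply splits each line on the ',' declared in FIELDS TERMINATED BY, with no quote handling. The column list is copied from the CREATE TABLE statement above; the function name check_rows and the sample rows are made up for illustration and are not from our real data.

```python
# Column names copied from the CREATE TABLE statement. The partition column
# (dt) is not stored in the file, so it is not listed here.
COLUMNS = [
    "sampleDate", "uid", "site", "concreteStatistics", "network",
    "category", "country", "countryCode", "sessions", "pageImpressions",
]


def check_rows(lines, delimiter=","):
    """Return (line_number, field_count) for every non-empty line whose
    field count differs from the number of declared columns.

    Assumption: a plain split on the delimiter, with no quoting or escaping.
    """
    bad = []
    for line_number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        field_count = len(line.split(delimiter))
        if field_count != len(COLUMNS):
            bad.append((line_number, field_count))
    return bad


if __name__ == "__main__":
    # Hypothetical sample rows, for illustration only.
    sample = [
        "2009-03-02,1001,1,5,2,3,46,SE,1,4\n",
        "2009-03-02,1002,1,5,2,3,46\n",
    ]
    print(check_rows(sample))  # -> [(2, 7)]
```

To check the real file, the same function can be passed an open file handle, for example check_rows(open(path)), where path is the CSV that was loaded.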

Anyway, the load goes well, but when I issue a simple query like the one below
it throws an exception:
hive> select d.* from dailyuniquesitevisitorsample d where d.site=1;
Total MapReduce jobs = 1
Number of reduce tasks is set to 0 since there's no reduce operator
java.lang.AbstractMethodError: org.apache.hadoop.hive.ql.io.HiveInputFormat.validateInput(Lorg/apache/hadoop/mapred/JobConf;)V
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:735)
        at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:391)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:239)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:174)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:207)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:306)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
        at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)

I run Hadoop 0.18.2.

I'm not sure that I am doing this correctly. Please guide me if I am doing
something stupid.

Kindly

//Marcus


-- 
Marcus Herou CTO and co-founder Tailsweep AB
+46702561312
marcus.he...@tailsweep.com
http://www.tailsweep.com/
http://blogg.tailsweep.com/
