Again, this is now working.

Thanks,
Ryan

On Sat, Oct 10, 2009 at 9:30 PM, Ryan LeCompte <[email protected]> wrote:

> Ah, this time I'm running into a different issue.
>
> So I've created my Hive table and I'm now at the point where I want to load
> data into it from HDFS. However, I get the following error on the load data
> command:
>
> Loading data to table actions
> Failed with exception Wrong file format. Please check the file's format.
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
>
> Any ideas how to get more info on what's wrong? The file is a SequenceFile.
>
>
> On Sat, Oct 10, 2009 at 9:10 PM, Ryan LeCompte <[email protected]> wrote:
>
>> I was able to get this working -- just needed to adjust classpaths.
>> Thanks!
>>
>> Ryan
>>
>>
>> On Sat, Oct 10, 2009 at 8:50 PM, Ryan LeCompte <[email protected]> wrote:
>>
>>> I printed out the classpath environment variables that I saw in the file,
>>> and the paths were valid... hmmm... is there something else I could try?
>>>
>>>
>>> On Sat, Oct 10, 2009 at 8:41 PM, Zheng Shao <[email protected]> wrote:
>>>
>>>> Try modifying bin/hive and print out the last line in that file.
>>>> It should display some classpaths stuff, make sure those classpaths are
>>>> valid.
>>>>
>>>> Zheng
>>>>
>>>>
>>>> On Sat, Oct 10, 2009 at 5:14 PM, Ryan LeCompte <[email protected]> wrote:
>>>>
>>>>> Thank you!
>>>>>
>>>>> Very helpful.
>>>>>
>>>>> Another problem:
>>>>>
>>>>> I am trying to install Hive 0.4, and I'm coming across the following
>>>>> error when I try to start bin/hive after building:
>>>>>
>>>>> java.lang.NoClassDefFoundError: org/apache/hadoop/hive/conf/HiveConf
>>>>>     at java.lang.Class.forName0(Native Method)
>>>>>     at java.lang.Class.forName(Class.java:247)
>>>>>     at org.apache.hadoop.util.RunJar.main(RunJar.java:158)
>>>>>     at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
>>>>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>>>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>>>>     at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
>>>>> Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
>>>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
>>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
>>>>>     at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)
>>>>>     ... 7 more
>>>>>
>>>>> Any ideas?
>>>>>
>>>>> Thanks,
>>>>> Ryan
>>>>>
>>>>>
>>>>> On Sat, Oct 10, 2009 at 2:47 PM, Zheng Shao <[email protected]> wrote:
>>>>>
>>>>>> Yes, we can do this:
>>>>>>
>>>>>> SELECT ip,
>>>>>>        SUM(IF(action = 'action1', 1, 0)),
>>>>>>        SUM(IF(action = 'action2', 1, 0)),
>>>>>>        SUM(IF(action = 'action3', 1, 0))
>>>>>> FROM mytable
>>>>>> GROUP BY ip;
>>>>>>
>>>>>> For more details on IF, please refer to:
>>>>>> http://dev.mysql.com/doc/refman/5.0/en/control-flow-functions.html#function_if
>>>>>>
>>>>>> Zheng
>>>>>>
>>>>>>
>>>>>> On Sat, Oct 10, 2009 at 11:42 AM, Ryan LeCompte <[email protected]> wrote:
>>>>>>
>>>>>>> Hello all,
>>>>>>>
>>>>>>> Very new to Hive (haven't even installed it yet!), but I had a use
>>>>>>> case that I didn't see demonstrated in any of the tutorials or
>>>>>>> documentation that I've read thus far.
>>>>>>>
>>>>>>> Let's say that I have apache logs that I want to process with
>>>>>>> Hadoop/Hive. Of course there may be different types of log records all
>>>>>>> tying back to the same user or IP address or other log attribute. Is
>>>>>>> there a way to submit a SINGLE Hive query to get back results that may
>>>>>>> look like:
>>>>>>>
>>>>>>> IP    Action1Count    Action2Count    Action3Count
>>>>>>>
>>>>>>> .. where the different actions correspond to different log events for
>>>>>>> that IP address.
>>>>>>>
>>>>>>> Do I have to submit 3 different Hive queries here or can I submit a
>>>>>>> single Hive query? In a regular Java-based map/reduce job, I would have
>>>>>>> written a custom Writable that would record counts for each of the
>>>>>>> different actions, and submit it to the reducer using
>>>>>>> output.collect(IP, customWritable). Here I wouldn't have to submit
>>>>>>> multiple map/reduce jobs, just 1.
>>>>>>>
>>>>>>> Thanks
>>>>>>> Ryan
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Yours,
>>>>>> Zheng
>>>>>>
>>>>
>>>> --
>>>> Yours,
>>>> Zheng
>>>>
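
For anyone who wants to try the single-query approach from Zheng's reply without a Hive install, here is a minimal sketch. It is an illustration only and not part of the original exchange: it uses Python's built-in sqlite3 module as a stand-in for Hive, the sample rows are made up, and the helper name count_actions_per_ip is hypothetical. It also uses the standard SQL CASE expression where the reply uses IF, because IF is not available in every SQLite version; the counting logic is otherwise the same.

```python
import sqlite3


def count_actions_per_ip(rows):
    """Return one (ip, action1_count, action2_count, action3_count) tuple per IP.

    `rows` is an iterable of (ip, action) pairs. The table and column names
    (mytable, ip, action) follow the query quoted in the thread.
    """
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        conn.execute("CREATE TABLE mytable (ip TEXT, action TEXT)")
        conn.executemany("INSERT INTO mytable (ip, action) VALUES (?, ?)", rows)
        # Each SUM adds 1 for rows matching its action and 0 otherwise, so one
        # pass over the table yields all three counts per IP.
        query = """
            SELECT ip,
                   SUM(CASE WHEN action = 'action1' THEN 1 ELSE 0 END),
                   SUM(CASE WHEN action = 'action2' THEN 1 ELSE 0 END),
                   SUM(CASE WHEN action = 'action3' THEN 1 ELSE 0 END)
            FROM mytable
            GROUP BY ip
            ORDER BY ip
        """
        return conn.execute(query).fetchall()
    finally:
        conn.close()


# Made-up sample data, for illustration only.
SAMPLE_ROWS = [
    ("10.0.0.1", "action1"),
    ("10.0.0.1", "action1"),
    ("10.0.0.1", "action3"),
    ("10.0.0.2", "action2"),
]

if __name__ == "__main__":
    for row in count_actions_per_ip(SAMPLE_ROWS):
        print(row)
```

The ORDER BY is only there to make the output order predictable; the query in the thread does not need it.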
