Re: java.io.FileNotFoundException

2010-05-02 Thread Ted Yu
It looks like localFs.mkdirs(tmpDir) failed. Can you check whether you can manually create E:/tmp/hadoop-SYSTEM/mapred/local/taskTracker/jobcache/job_201005020105_0001/attempt_201005020105_0001_m_02_0/work/tmp? Also, what is mapred.local.dir set to? Try a location other than /tmp. On Sat, May 1, 2010
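The manual check suggested above can also be scripted. What follows is a minimal sketch in Python, assuming Python is available on the machine. `can_create_dir` is a hypothetical helper, not part of Hadoop, and the path used in the demo is a temporary stand-in rather than the real mapred.local.dir. It only tests whether the current user can create a nested directory, so the result is indicative only when run as the same user that runs the TaskTracker.

```python
import os
import tempfile


def can_create_dir(path):
    """Return True if the nested directory `path` exists or can be created."""
    try:
        os.makedirs(path, exist_ok=True)
    except OSError as exc:
        # Covers permission errors, a file blocking the path, bad drive, etc.
        print(f"cannot create {path}: {exc}")
        return False
    return os.path.isdir(path)


if __name__ == "__main__":
    # Hypothetical stand-in for a task work directory under mapred.local.dir.
    base = tempfile.mkdtemp()
    candidate = os.path.join(base, "taskTracker", "jobcache", "work", "tmp")
    print(can_create_dir(candidate))
```

To test a real location, pass the directory in question to `can_create_dir` instead of the temporary path.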

problem w/ data load

2010-05-02 Thread Susanne Lehmann
Hi, I want to load data from HDFS into Hive; the data is in compressed files. The data is stored in flat files with ^A (ctrl-A) as the delimiter. As long as I use decompressed files, everything works fine. Since ctrl-A is the default delimiter, I don't even need to specify it. I do the fol
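Before loading, it can help to confirm that a compressed file really contains the expected ^A-delimited rows. The following is a minimal sketch in Python using only the standard library. `read_ctrl_a_rows` is a hypothetical helper, and the sample file is generated for illustration rather than taken from the thread. It says nothing about how Hive itself reads the file.

```python
import gzip
import os
import tempfile

CTRL_A = "\x01"  # Hive's default field delimiter (^A)


def read_ctrl_a_rows(path, limit=5):
    """Read up to `limit` lines from a gzip file and split each on ^A."""
    rows = []
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            rows.append(line.rstrip("\n").split(CTRL_A))
            if len(rows) >= limit:
                break
    return rows


if __name__ == "__main__":
    # Write a small illustrative sample file, then read it back.
    sample = os.path.join(tempfile.mkdtemp(), "sample.txt.gz")
    with gzip.open(sample, "wt", encoding="utf-8") as out:
        out.write("1\x01alice\x0142\n2\x01bob\x0117\n")
    for row in read_ctrl_a_rows(sample):
        print(row)
```

If the fields split cleanly here, the delimiter and the compression of the local copy are probably not the problem.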

Re: java.io.FileNotFoundException

2010-05-02 Thread Carlos Eduardo Moreira dos Santos
Yes, I can create it: $ ls E:/tmp/hadoop-SYSTEM/mapred/local/taskTracker/jobcache/job_201005020105_0001/ ls: cannot access E:/tmp/hadoop-SYSTEM/mapred/local/taskTracker/jobcache/job_201005020105_0001/: No such file or directory $ mkdir -p E:/tmp/hadoop-SYSTEM/mapred/local/taskTracker/jobcache/j

Re: problem w/ data load

2010-05-02 Thread Ted Yu
Did you add the codec for the compressed files to io.compression.codecs in the Hadoop configuration file (core-site.xml)? On Sun, May 2, 2010 at 11:22 AM, Susanne Lehmann < susanne.lehm...@metamarketsgroup.com> wrote: > Hi, > > I want to load data from HDFS to Hive, the data is in compressed files. >

Re: problem w/ data load

2010-05-02 Thread Susanne Lehmann
No, I didn't. Can you specify what exactly I have to do? Thank you so much for your help! On Sun, May 2, 2010 at 1:11 PM, Ted Yu wrote: > Did you add codec for the compressed files into io.compression.codecs in > hadoop > configuration files (core-site.xml) ? > > On Sun, May 2, 2010 at 11:22 AM

Re: problem w/ data load

2010-05-02 Thread Ted Yu
You can find a sample config at http://code.google.com/p/hadoop-gpl-compression/wiki/FAQ (look for io.compression.codecs). On Sun, May 2, 2010 at 1:28 PM, Susanne Lehmann < susanne.lehm...@metamarketsgroup.com> wrote: > No, I didn't. Can you specify what exactly I have to do? > Thank you so much for

Re: conf.get("map.input.file") returns null when using MultipleInputs in Hadoop 0.20

2010-05-02 Thread Yuanyuan Tian
Hi Farhan, I believe I have to use the old JobConf MapReduce interface in order to use MultipleInputs. As a result, I cannot do as you suggested. Yuanyuan

Re: problem w/ data load

2010-05-02 Thread Susanne Lehmann
I am using Hadoop on EC2 with pre-configured scripts, and I found that the properties are already set correctly (I am using gzip): io.compression.codecs org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec Do you have another idea? On Sun, May 2, 201

Re: problem w/ data load

2010-05-02 Thread Ted Yu
Can you show us what you find in /tmp//hive.log ? On Sun, May 2, 2010 at 7:06 PM, Susanne Lehmann < susanne.lehm...@metamarketsgroup.com> wrote: > I am using Hadoop on EC2 with pre-configured scripts. So I figured > out, that the properties are already set correctly (I am using gzip): > > > io.

Re: problem w/ data load

2010-05-02 Thread Susanne Lehmann
There is nothing in the log file. (I copied it away before the load, and no new entry was created.) The only problem is that the values in the table are not OK. This is what the output looks like: hive> select * from test_new limit 5; OK NULLNULLNULLNULLNULLNULLNULLNULL
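One thing worth ruling out when every column comes back NULL: Hadoop's CompressionCodecFactory selects a codec from the file name suffix, so a gzip file whose name does not end in .gz would be read as raw bytes. Whether that is the cause here is an assumption; the thread does not confirm it. The sketch below, in Python with a hypothetical helper `gzip_status`, compares a local copy's name against its first two bytes (the gzip magic number 0x1f 0x8b). The demo file is generated for illustration.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def gzip_status(path):
    """Return (has_gz_suffix, has_gzip_magic) for a local file."""
    with open(path, "rb") as handle:
        magic = handle.read(2)
    return path.endswith(".gz"), magic == GZIP_MAGIC


if __name__ == "__main__":
    # Illustrative file: gzip content stored under a name without .gz.
    path = os.path.join(tempfile.mkdtemp(), "data.txt")
    with gzip.open(path, "wb") as out:
        out.write(b"1\x01alice\n")
    suffix_ok, magic_ok = gzip_status(path)
    if magic_ok and not suffix_ok:
        print("gzip content without a .gz suffix")
```

A mismatch in either direction (gzip bytes without the suffix, or the suffix on uncompressed bytes) would be a reason to rename or recompress the file before loading it again.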

Re: problem w/ data load

2010-05-02 Thread Ted Yu
Susanne: I assume you have the native libraries for GZIP installed, meaning you have no trouble reading GZIP files in normal map/reduce jobs. On Sun, May 2, 2010 at 9:44 PM, Susanne Lehmann < susanne.lehm...@metamarketsgroup.com> wrote: > There is nothing in the logfile. (I copied it away before the