Re: map() does not get called

2010-09-25 Thread newpant
Hi, could you post your code?
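
One common cause of that symptom when moving to the 0.20 "new" API is a map()
whose signature no longer matches org.apache.hadoop.mapreduce.Mapper, so the
framework silently falls back to the base class's identity map (which would also
fit the "Map input records=1, Map output records=1" counters below). As a hedged
sketch only (the class name and key/value types are made up for illustration;
this is not your code), a new-API mapper is expected to look like:

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MyMapper extends Mapper<LongWritable, Text, Text, LongWritable> {

    @Override
    protected void setup(Context context) {
        // Runs once per task; this is the call you do see.
    }

    // If this signature drifts from Mapper.map(KEYIN, VALUEIN, Context), the
    // method no longer overrides anything and the default identity map runs
    // instead. @Override makes the compiler reject a mismatched signature.
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        context.write(new Text(value.toString()), new LongWritable(1));
    }
}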

2010/9/20 Mark Kerzner 

> Hi,
>
> I am upgrading to 0.20 from 0.18, and right now the setup() gets called,
> but
> the map() does not.
>
> The log indicates that an input record was found - but it is not processed.
>
> 10/09/19 23:56:21 INFO mapred.TaskRunner:
> Task:attempt_local_0001_r_00_0
> is done. And is in the process of commiting
> 10/09/19 23:56:21 INFO mapred.LocalJobRunner:
> 10/09/19 23:56:21 INFO mapred.TaskRunner: Task
> attempt_local_0001_r_00_0
> is allowed to commit now
> 10/09/19 23:56:21 INFO mapred.JobClient:  map 100% reduce 0%
> 10/09/19 23:56:21 INFO output.FileOutputCommitter: Saved output of task
> 'attempt_local_0001_r_00_0' to hdfs://localhost:9000/scala/p1/output
> 10/09/19 23:56:21 INFO mapred.LocalJobRunner: reduce > reduce
> 10/09/19 23:56:21 INFO mapred.TaskRunner: Task
> 'attempt_local_0001_r_00_0' done.
> 10/09/19 23:56:22 INFO mapred.JobClient:  map 100% reduce 100%
> 10/09/19 23:56:22 INFO mapred.JobClient: Job complete: job_local_0001
> 10/09/19 23:56:22 INFO mapred.JobClient: Counters: 14
> 10/09/19 23:56:22 INFO mapred.JobClient:   FileSystemCounters
> 10/09/19 23:56:22 INFO mapred.JobClient: FILE_BYTES_READ=35600
> 10/09/19 23:56:22 INFO mapred.JobClient: HDFS_BYTES_READ=10820
> 10/09/19 23:56:22 INFO mapred.JobClient: FILE_BYTES_WRITTEN=71798
> 10/09/19 23:56:22 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=421
> 10/09/19 23:56:22 INFO mapred.JobClient:   Map-Reduce Framework
> 10/09/19 23:56:22 INFO mapred.JobClient: Reduce input groups=1
> 10/09/19 23:56:22 INFO mapred.JobClient: Combine output records=0
> 10/09/19 23:56:22 INFO mapred.JobClient: Map input records=1
> 10/09/19 23:56:22 INFO mapred.JobClient: Reduce shuffle bytes=0
> 10/09/19 23:56:22 INFO mapred.JobClient: Reduce output records=1
> 10/09/19 23:56:22 INFO mapred.JobClient: Spilled Records=2
> 10/09/19 23:56:22 INFO mapred.JobClient: Map output bytes=14
> 10/09/19 23:56:22 INFO mapred.JobClient: Combine input records=0
> 10/09/19 23:56:22 INFO mapred.JobClient: Map output records=1
> 10/09/19 23:56:22 INFO mapred.JobClient: Reduce input records=1
>
> Thank you for advice.
>
> Sincerely,
> Mark
>


Re: map() does not get called

2010-09-25 Thread Biju .B
Hi
I think this will be solved once you restart the system; restarting solved this
kind of problem for me.

On Sat, Sep 25, 2010 at 1:23 PM, newpant  wrote:

> hi, could you post your code
>
> 2010/9/20 Mark Kerzner 
>
> > Hi,
> >
> > I am upgrading to 0.20 from 0.18, and right now the setup() gets called,
> > but
> > the map() does not.
> >
> > The log indicates that an input record was found - but it is not
> processed.
> >
> > 10/09/19 23:56:21 INFO mapred.TaskRunner:
> > Task:attempt_local_0001_r_00_0
> > is done. And is in the process of commiting
> > 10/09/19 23:56:21 INFO mapred.LocalJobRunner:
> > 10/09/19 23:56:21 INFO mapred.TaskRunner: Task
> > attempt_local_0001_r_00_0
> > is allowed to commit now
> > 10/09/19 23:56:21 INFO mapred.JobClient:  map 100% reduce 0%
> > 10/09/19 23:56:21 INFO output.FileOutputCommitter: Saved output of task
> > 'attempt_local_0001_r_00_0' to hdfs://localhost:9000/scala/p1/output
> > 10/09/19 23:56:21 INFO mapred.LocalJobRunner: reduce > reduce
> > 10/09/19 23:56:21 INFO mapred.TaskRunner: Task
> > 'attempt_local_0001_r_00_0' done.
> > 10/09/19 23:56:22 INFO mapred.JobClient:  map 100% reduce 100%
> > 10/09/19 23:56:22 INFO mapred.JobClient: Job complete: job_local_0001
> > 10/09/19 23:56:22 INFO mapred.JobClient: Counters: 14
> > 10/09/19 23:56:22 INFO mapred.JobClient:   FileSystemCounters
> > 10/09/19 23:56:22 INFO mapred.JobClient: FILE_BYTES_READ=35600
> > 10/09/19 23:56:22 INFO mapred.JobClient: HDFS_BYTES_READ=10820
> > 10/09/19 23:56:22 INFO mapred.JobClient: FILE_BYTES_WRITTEN=71798
> > 10/09/19 23:56:22 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=421
> > 10/09/19 23:56:22 INFO mapred.JobClient:   Map-Reduce Framework
> > 10/09/19 23:56:22 INFO mapred.JobClient: Reduce input groups=1
> > 10/09/19 23:56:22 INFO mapred.JobClient: Combine output records=0
> > 10/09/19 23:56:22 INFO mapred.JobClient: Map input records=1
> > 10/09/19 23:56:22 INFO mapred.JobClient: Reduce shuffle bytes=0
> > 10/09/19 23:56:22 INFO mapred.JobClient: Reduce output records=1
> > 10/09/19 23:56:22 INFO mapred.JobClient: Spilled Records=2
> > 10/09/19 23:56:22 INFO mapred.JobClient: Map output bytes=14
> > 10/09/19 23:56:22 INFO mapred.JobClient: Combine input records=0
> > 10/09/19 23:56:22 INFO mapred.JobClient: Map output records=1
> > 10/09/19 23:56:22 INFO mapred.JobClient: Reduce input records=1
> >
> > Thank you for advice.
> >
> > Sincerely,
> > Mark
> >
>


Re: A new way to merge up those small files!

2010-09-25 Thread Ted Yu
Edward:
Thanks for the tool.

I think the last parameter can be omitted if you follow what hadoop fs -text
does.
It looks at a file's magic number so that it can attempt to *detect* the
type of the file.
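
For reference, a rough sketch of that kind of check (an assumed illustration, not
the filecrush or "hadoop fs -text" source): SequenceFiles begin with the three
bytes 'S', 'E', 'Q', so peeking at the header is enough to choose between
SEQUENCE and TEXT handling.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FileTypeSniffer {
    // Returns true if the file starts with the SequenceFile magic bytes "SEQ".
    // Files shorter than three bytes will throw EOFException; real code would
    // treat that as "not a sequence file".
    public static boolean isSequenceFile(Configuration conf, Path p) throws IOException {
        FileSystem fs = p.getFileSystem(conf);
        FSDataInputStream in = fs.open(p);
        try {
            byte[] header = new byte[3];
            in.readFully(header);
            return header[0] == 'S' && header[1] == 'E' && header[2] == 'Q';
        } finally {
            in.close();
        }
    }
}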

Cheers

On Fri, Sep 24, 2010 at 11:41 PM, Edward Capriolo wrote:

> Many times a hadoop job produces a file per reducer and the job has
> many reducers. Or a map-only job produces one output file per input file
> and you have many input files. Or you just have many small files from some
> external process. Hadoop has suboptimal handling of small files.
> There are some ways to handle this inside a map reduce program,
> IdentityMapper + IdentityReducer for example, or multiple outputs. However,
> we wanted a tool that could be used by people using hive, or pig, or
> map reduce. We wanted to allow people to combine a directory with
> multiple files or a hierarchy of directories, like the root of a hive
> partitioned table. We also wanted to be able to combine text or
> sequence files.
>
> What we came up with is the filecrusher.
>
> Usage:
> /usr/bin/hadoop jar filecrush.jar crush.Crush /directory/to/compact
> /user/edward/backup 50 SEQUENCE
> (50 is the number of mappers here)
>
> Code is Apache V2 and you can get it here:
> http://www.jointhegrid.com/hadoop_filecrush/index.jsp
>
> Enjoy,
> Edward
>
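
For comparison, the in-job alternative Edward mentions above (IdentityMapper plus
IdentityReducer) looks roughly like the sketch below. This is an assumed old-API
example, not part of filecrush; note that with the default TextInputFormat and
TextOutputFormat the byte-offset keys are carried into the merged output, which
is one reason a dedicated tool is nicer.

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.IdentityMapper;
import org.apache.hadoop.mapred.lib.IdentityReducer;

public class MergeSmallFiles {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(MergeSmallFiles.class);
        conf.setJobName("merge-small-files");
        // Pass every record through untouched; a single reducer collapses all
        // the small inputs into one output file.
        conf.setMapperClass(IdentityMapper.class);
        conf.setReducerClass(IdentityReducer.class);
        conf.setNumReduceTasks(1);
        conf.setOutputKeyClass(LongWritable.class);  // TextInputFormat keys are byte offsets
        conf.setOutputValueClass(Text.class);
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        JobClient.runJob(conf);
    }
}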


Current status of zip file input

2010-09-25 Thread Shi Yu

Hi,

What is the current status of support for zip file input in Hadoop?

Regards,

Shi

--
Postdoctoral Scholar
Institute for Genomics and Systems Biology
Department of Medicine, the University of Chicago
Knapp Center for Biomedical Discovery
900 E. 57th St. Room 10148
Chicago, IL 60637, US
Tel: 773-702-6799



Re: Help for Sqlserver querying with hadoop

2010-09-25 Thread Sonal Goyal
Biju,

Have you tried using DataDrivenDBInputFormat?
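
A rough sketch of how that could be wired up is below. Assumptions: your Hadoop
build ships the new-API org.apache.hadoop.mapreduce.lib.db classes including
DataDrivenDBInputFormat, and the table, columns and MyRecord class are the ones
from your mail (with MyRecord implementing the mapreduce DBWritable). The point
is that DataDrivenDBInputFormat derives split conditions, ranges on the split
column taken from a bounding query, instead of paging with LIMIT/OFFSET, which
is what trips up SQL Server.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.db.DBConfiguration;
import org.apache.hadoop.mapreduce.lib.db.DataDrivenDBInputFormat;

public class SqlServerJobSetup {
    public static Job configure(Configuration conf) throws Exception {
        Job job = new Job(conf, "sqlserver-import");
        job.setInputFormatClass(DataDrivenDBInputFormat.class);
        DBConfiguration.configureDB(job.getConfiguration(),
            "com.microsoft.sqlserver.jdbc.SQLServerDriver",
            "jdbc:sqlserver://xxx.xxx.xxx.xxx;user=abc;password=abc;DatabaseName=dbname");
        // Splits become ranges on the "id" column rather than LIMIT/OFFSET pages.
        DataDrivenDBInputFormat.setInput(job, MyRecord.class,
            "urls", null /* conditions */, "id" /* split column */, "id", "url");
        return job;
    }
}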

Thanks and Regards,
Sonal

Sonal Goyal | Founder and CEO | Nube Technologies LLP
Ph: +91-8800541717 | so...@nubetech.co | Skype: sonal.goyal
http://www.nubetech.co | http://in.linkedin.com/in/sonalgoyal





On Fri, Sep 24, 2010 at 2:06 PM, Biju .B  wrote:

> Hi
>
> I need urgent help on using SQL Server with Hadoop.
>
> I am using the following code to connect to the database:
>
>
> DBConfiguration.configureDB(conf,"com.microsoft.sqlserver.jdbc.SQLServerDriver","jdbc:sqlserver://xxx.xxx.xxx.xxx;user=abc;password=abc;DatabaseName=dbname");
> String [] fields = { "id", "url" };
> DBInputFormat.setInput(conf,MyRecord.class,"urls",null,"id", fields);
>
> I am getting the following error:
>
> 10/09/24 13:26:42 INFO mapred.JobClient: Task Id :
> attempt_201009231924_0008_m_01_2, Status : FAILED
> java.io.IOException: Incorrect syntax near 'LIMIT'.
>at
>
> org.apache.hadoop.mapreduce.lib.db.DBRecordReader.nextKeyValue(DBRecordReader.java:235)
>at
>
> org.apache.hadoop.mapreduce.lib.db.DBRecordReader.next(DBRecordReader.java:204)
>at
>
> org.apache.hadoop.mapred.lib.db.DBInputFormat$DBRecordReaderWrapper.next(DBInputFormat.java:118)
>at
>
> org.apache.hadoop.mapred.lib.db.DBInputFormat$DBRecordReaderWrapper.next(DBInputFormat.java:87)
>at
>
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:192)
>at
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:176)
>at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
>at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
>at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>at org.apache.hadoop.mapred.Child.main(Child.java:170)
>
>
>
> I found that the error is due to the query that each task tries to execute:
>
> SELECT id, url FROM urls AS urls ORDER BY id LIMIT 13228 OFFSET 13228
>
>
> the "LIMIT" and "OFFSET" are not valid in Sqlserver and it returns error
>
> Please tell me how to solve this problem.
>
> Regards
> Biju
>


Can not upload local file to HDFS

2010-09-25 Thread He Chen
Hello everyone

I cannot upload a local file to HDFS. It gives the following errors.

WARN hdfs.DFSClient: DFSOutputStream ResponseProcessor exception  for block
blk_-236192853234282209_419415java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:197)
at java.io.DataInputStream.readLong(DataInputStream.java:416)
at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:2397)
10/09/25 15:38:25 WARN hdfs.DFSClient: Error Recovery for block
blk_-236192853234282209_419415 bad datanode[0] 192.168.0.23:50010
10/09/25 15:38:25 WARN hdfs.DFSClient: Error Recovery for block
blk_-236192853234282209_419415 in pipeline 192.168.0.23:50010,
192.168.0.39:50010: bad datanode 192.168.0.23:50010
Any response will be appreciated!


-- 
Best Wishes!
顺送商祺!

--
Chen He


Re: Can not upload local file to HDFS

2010-09-25 Thread Neil Ghosh
How big is the file? Did you try formatting the NameNode and DataNode?

On Sun, Sep 26, 2010 at 2:12 AM, He Chen  wrote:

> Hello everyone
>
> I cannot upload a local file to HDFS. It gives the following errors.
>
> WARN hdfs.DFSClient: DFSOutputStream ResponseProcessor exception  for block
> blk_-236192853234282209_419415java.io.EOFException
>at java.io.DataInputStream.readFully(DataInputStream.java:197)
>at java.io.DataInputStream.readLong(DataInputStream.java:416)
>at
>
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:2397)
> 10/09/25 15:38:25 WARN hdfs.DFSClient: Error Recovery for block
> blk_-236192853234282209_419415 bad datanode[0] 192.168.0.23:50010
> 10/09/25 15:38:25 WARN hdfs.DFSClient: Error Recovery for block
> blk_-236192853234282209_419415 in pipeline 192.168.0.23:50010,
> 192.168.0.39:50010: bad datanode 192.168.0.23:50010
> Any response will be appreciated!
>
>
> --
> Best Wishes!
> 顺送商祺!
>
> --
> Chen He
>



-- 
Thanks and Regards
Neil
http://neilghosh.com


Re: Can not upload local file to HDFS

2010-09-25 Thread He Chen
Hello Neil

It does not matter how big the file is; it always reports this error. The file
sizes range from 10 KB to 100 MB.

On Sat, Sep 25, 2010 at 6:08 PM, Neil Ghosh  wrote:

> How big is the file? Did you try formatting the NameNode and DataNode?
>
> On Sun, Sep 26, 2010 at 2:12 AM, He Chen  wrote:
>
> > Hello everyone
> >
> > I can not load local file to HDFS. It gave the following errors.
> >
> > WARN hdfs.DFSClient: DFSOutputStream ResponseProcessor exception  for
> block
> > blk_-236192853234282209_419415java.io.EOFException
> >at java.io.DataInputStream.readFully(DataInputStream.java:197)
> >at java.io.DataInputStream.readLong(DataInputStream.java:416)
> >at
> >
> >
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:2397)
> > 10/09/25 15:38:25 WARN hdfs.DFSClient: Error Recovery for block
> > blk_-236192853234282209_419415 bad datanode[0] 192.168.0.23:50010
> > 10/09/25 15:38:25 WARN hdfs.DFSClient: Error Recovery for block
> > blk_-236192853234282209_419415 in pipeline 192.168.0.23:50010,
> > 192.168.0.39:50010: bad datanode 192.168.0.23:50010
> > Any response will be appreciated!
> >
> >
> > --
> > Best Wishes!
> > 顺送商祺!
> >
> > --
> > Chen He
> >
>
>
>
> --
> Thanks and Regards
> Neil
> http://neilghosh.com
>



-- 
Best Wishes!
顺送商祺!

--
Chen He
(402)613-9298
PhD. student of CSE Dept.
Research Assistant of Holland Computing Center
University of Nebraska-Lincoln
Lincoln NE 68588


datanode and tasktracker shutdown error

2010-09-25 Thread shangan
Previously I had a cluster containing 8 nodes and it worked well. I added 24 new
datanodes to the cluster; the tasktracker and datanode daemons can start, but when I
shut down the cluster I find these errors on the newly added datanodes. Can
anyone explain it?

log from tasktracker

2010-09-26 09:52:21,672 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG: 
/
STARTUP_MSG: Starting TaskTracker
STARTUP_MSG:   host = localhost/127.0.0.1
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build = 
https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; 
compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
/
2010-09-26 09:52:21,876 INFO org.mortbay.log: Logging to 
org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2010-09-26 09:52:22,006 INFO org.apache.hadoop.http.HttpServer: Port returned 
by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the 
listener on 50060
2010-09-26 09:52:22,014 INFO org.apache.hadoop.http.HttpServer: 
listener.getLocalPort() returned 50060 
webServer.getConnectors()[0].getLocalPort() returned 50060
2010-09-26 09:52:22,014 INFO org.apache.hadoop.http.HttpServer: Jetty bound to 
port 50060
2010-09-26 09:52:22,014 INFO org.mortbay.log: jetty-6.1.14
2010-09-26 09:52:42,715 INFO org.mortbay.log: Started 
selectchannelconnec...@0.0.0.0:50060
2010-09-26 09:52:42,722 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: 
Initializing JVM Metrics with processName=TaskTracker, sessionId=
2010-09-26 09:52:42,737 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: 
Initializing RPC Metrics with hostName=TaskTracker, port=28404
2010-09-26 09:52:42,793 INFO org.apache.hadoop.ipc.Server: IPC Server 
Responder: starting
2010-09-26 09:52:42,793 INFO org.apache.hadoop.ipc.Server: IPC Server listener 
on 28404: starting
2010-09-26 09:52:42,794 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 
on 28404: starting
2010-09-26 09:52:42,795 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 
on 28404: starting
2010-09-26 09:52:42,795 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 
on 28404: starting
2010-09-26 09:52:42,795 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 
on 28404: starting
2010-09-26 09:52:42,795 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 
on 28404: starting
2010-09-26 09:52:42,795 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 
on 28404: starting
2010-09-26 09:52:42,796 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 
on 28404: starting
2010-09-26 09:52:42,796 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 
on 28404: starting
2010-09-26 09:52:42,796 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 
on 28404: starting
2010-09-26 09:52:42,796 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 
on 28404: starting
2010-09-26 09:52:42,796 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
10 on 28404: starting
2010-09-26 09:52:42,796 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
11 on 28404: starting
2010-09-26 09:52:42,796 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
12 on 28404: starting
2010-09-26 09:52:42,797 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
13 on 28404: starting
2010-09-26 09:52:42,797 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
14 on 28404: starting
2010-09-26 09:52:42,797 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
15 on 28404: starting
2010-09-26 09:52:42,797 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker 
up at: localhost/127.0.0.1:28404
2010-09-26 09:52:42,797 INFO org.apache.hadoop.mapred.TaskTracker: Starting 
tracker tracker_localhost:localhost/127.0.0.1:28404
2010-09-26 09:52:55,025 INFO org.apache.hadoop.mapred.TaskTracker: Starting 
thread: Map-events fetcher for all reduce tasks on 
tracker_localhost:localhost/127.0.0.1:28404
2010-09-26 09:52:55,027 INFO org.apache.hadoop.mapred.TaskTracker:  Using 
MemoryCalculatorPlugin : 
org.apache.hadoop.util.linuxmemorycalculatorplu...@2de12f6d
2010-09-26 09:52:55,031 WARN org.apache.hadoop.mapred.TaskTracker: 
TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
2010-09-26 09:52:55,032 INFO org.apache.hadoop.mapred.IndexCache: IndexCache 
created with max memory = 10485760
2010-09-26 09:54:13,298 ERROR org.apache.hadoop.mapred.TaskTracker: Caught 
exception: java.io.IOException: Call to vm221/10.11.2.221:9001 failed on local 
exception: java.io.EOFException
at org.apache.hadoop.ipc.Client.wrapException(Client.java:775)
at org.apache.hadoop.ipc.Client.call(Client.java:743)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
at org.apache.hadoop.mapred.$Proxy4.heartbeat(Unknown Source)
at 
org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1215)
at 
org.apache.hadoop.mapred.TaskTracker.o

Re: Can not upload local file to HDFS

2010-09-25 Thread Nan Zhu
Hi Chen,

It seems that you have a bad datanode; maybe you should reformat it?

Nan

On Sun, Sep 26, 2010 at 10:42 AM, He Chen  wrote:

> Hello Neil
>
> It does not matter how big the file is; it always reports this error. The
> file sizes range from 10 KB to 100 MB.
>
> On Sat, Sep 25, 2010 at 6:08 PM, Neil Ghosh  wrote:
>
> > How Big is the file? Did you try Formatting Name node and Datanode?
> >
> > On Sun, Sep 26, 2010 at 2:12 AM, He Chen  wrote:
> >
> > > Hello everyone
> > >
> > > I can not load local file to HDFS. It gave the following errors.
> > >
> > > WARN hdfs.DFSClient: DFSOutputStream ResponseProcessor exception  for
> > block
> > > blk_-236192853234282209_419415java.io.EOFException
> > >at java.io.DataInputStream.readFully(DataInputStream.java:197)
> > >at java.io.DataInputStream.readLong(DataInputStream.java:416)
> > >at
> > >
> > >
> >
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:2397)
> > > 10/09/25 15:38:25 WARN hdfs.DFSClient: Error Recovery for block
> > > blk_-236192853234282209_419415 bad datanode[0] 192.168.0.23:50010
> > > 10/09/25 15:38:25 WARN hdfs.DFSClient: Error Recovery for block
> > > blk_-236192853234282209_419415 in pipeline 192.168.0.23:50010,
> > > 192.168.0.39:50010: bad datanode 192.168.0.23:50010
> > > Any response will be appreciated!
> > >
> > >
> > > --
> > > Best Wishes!
> > > 顺送商祺!
> > >
> > > --
> > > Chen He
> > >
> >
> >
> >
> > --
> > Thanks and Regards
> > Neil
> > http://neilghosh.com
> >
>
>
>
> --
> Best Wishes!
> 顺送商祺!
>
> --
> Chen He
> (402)613-9298
> PhD. student of CSE Dept.
> Research Assistant of Holland Computing Center
> University of Nebraska-Lincoln
> Lincoln NE 68588
>


Re: Help for Sqlserver querying with hadoop

2010-09-25 Thread Biju .B
Sonal

Thanks, will try it and let you know.

Regards
Biju

On Sun, Sep 26, 2010 at 12:00 AM, Sonal Goyal  wrote:

> Biju,
>
> Have you tried using DataDrivenDBInputFormat?
>
> Thanks and Regards,
> Sonal
>
> Sonal Goyal | Founder and CEO | Nube Technologies LLP
> Ph: +91-8800541717 | so...@nubetech.co | Skype: sonal.goyal
> http://www.nubetech.co | http://in.linkedin.com/in/sonalgoyal
>
>
>
>
>
> On Fri, Sep 24, 2010 at 2:06 PM, Biju .B  wrote:
>
> > Hi
> >
> > Need urgent help on using sql server with hadoop
> >
> > am using following code to connect to database
> >
> >
> >
> DBConfiguration.configureDB(conf,"com.microsoft.sqlserver.jdbc.SQLServerDriver","jdbc:sqlserver://xxx.xxx.xxx.xxx;user=abc;password=abc;DatabaseName=dbname");
> > String [] fields = { "id", "url" };
> > DBInputFormat.setInput(conf,MyRecord.class,"urls",null,"id", fields);
> >
> > Am getting following error
> >
> > 10/09/24 13:26:42 INFO mapred.JobClient: Task Id :
> > attempt_201009231924_0008_m_01_2, Status : FAILED
> > java.io.IOException: Incorrect syntax near 'LIMIT'.
> >at
> >
> >
> org.apache.hadoop.mapreduce.lib.db.DBRecordReader.nextKeyValue(DBRecordReader.java:235)
> >at
> >
> >
> org.apache.hadoop.mapreduce.lib.db.DBRecordReader.next(DBRecordReader.java:204)
> >at
> >
> >
> org.apache.hadoop.mapred.lib.db.DBInputFormat$DBRecordReaderWrapper.next(DBInputFormat.java:118)
> >at
> >
> >
> org.apache.hadoop.mapred.lib.db.DBInputFormat$DBRecordReaderWrapper.next(DBInputFormat.java:87)
> >at
> >
> >
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:192)
> >at
> >
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:176)
> >at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
> >at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
> >at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> >at org.apache.hadoop.mapred.Child.main(Child.java:170)
> >
> >
> >
> > Found that the error is due to query that each task tries to execute
> >
> > SELECT id, url FROM urls AS urls ORDER BY id LIMIT 13228 OFFSET 13228
> >
> >
> > the "LIMIT" and "OFFSET" are not valid in Sqlserver and it returns error
> >
> > Pls tell me how to solve this problem
> >
> > Regards
> > Biju
> >
>


Re: common-user Digest 26 Sep 2010 04:20:50 -0000 Issue 1548

2010-09-25 Thread Sudhir Vallamkondu
The exceptions below, seen before shutdown, are caused by an inability to reach
the jobtracker and namenode. I am guessing the jobtracker and namenode are being
shut down before these tasktrackers and datanodes, so their last calls fail. The
errors below involve connections to ports 9000 and 9001, which are the ports
commonly used by default for the namenode and jobtracker.

http://www.cloudera.com/blog/2009/08/hadoop-default-ports-quick-reference/
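
As a small, assumed illustration of where those addresses come from (the standard
0.20 configuration keys; not taken from shangan's actual configs):

import org.apache.hadoop.conf.Configuration;

public class ShowMasterAddresses {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Namenode address, e.g. hdfs://master:9000 (port 9000 by convention).
        System.out.println("fs.default.name    = " + conf.get("fs.default.name"));
        // Jobtracker address, e.g. master:9001 (port 9001 by convention).
        System.out.println("mapred.job.tracker = " + conf.get("mapred.job.tracker"));
    }
}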

- Sudhir


On Sep/25/ 9:20 PM, "common-user-digest-h...@hadoop.apache.org"
 wrote:

> From: shangan 
> Date: Sun, 26 Sep 2010 11:08:22 +0800
> To: hadoop-user 
> Subject: datanode and tasktracker shutdown error
> 
> Previously I had a cluster containing 8 nodes and it worked well. I added 24
> new datanodes to the cluster; the tasktracker and datanode daemons can start,
> but when I shut down the cluster I find these errors on the newly added
> datanodes. Can anyone explain it?
> 
> log from tasktracker
> 
> 2010-09-26 09:52:21,672 INFO org.apache.hadoop.mapred.TaskTracker:
> STARTUP_MSG: 
> /
> STARTUP_MSG: Starting TaskTracker
> STARTUP_MSG:   host = localhost/127.0.0.1
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 0.20.2
> STARTUP_MSG:   build =
> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707;
> compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
> /
> 2010-09-26 09:52:21,876 INFO org.mortbay.log: Logging to
> org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
> org.mortbay.log.Slf4jLog
> 2010-09-26 09:52:22,006 INFO org.apache.hadoop.http.HttpServer: Port returned
> by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening
> the listener on 50060
> 2010-09-26 09:52:22,014 INFO org.apache.hadoop.http.HttpServer:
> listener.getLocalPort() returned 50060
> webServer.getConnectors()[0].getLocalPort() returned 50060
> 2010-09-26 09:52:22,014 INFO org.apache.hadoop.http.HttpServer: Jetty bound to
> port 50060
> 2010-09-26 09:52:22,014 INFO org.mortbay.log: jetty-6.1.14
> 2010-09-26 09:52:42,715 INFO org.mortbay.log: Started
> selectchannelconnec...@0.0.0.0:50060
> 2010-09-26 09:52:42,722 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
> Initializing JVM Metrics with processName=TaskTracker, sessionId=
> 2010-09-26 09:52:42,737 INFO org.apache.hadoop.ipc.metrics.RpcMetrics:
> Initializing RPC Metrics with hostName=TaskTracker, port=28404
> 2010-09-26 09:52:42,793 INFO org.apache.hadoop.ipc.Server: IPC Server
> Responder: starting
> 2010-09-26 09:52:42,793 INFO org.apache.hadoop.ipc.Server: IPC Server listener
> on 28404: starting
> 2010-09-26 09:52:42,794 INFO org.apache.hadoop.ipc.Server: IPC Server handler
> 0 on 28404: starting
> 2010-09-26 09:52:42,795 INFO org.apache.hadoop.ipc.Server: IPC Server handler
> 1 on 28404: starting
> 2010-09-26 09:52:42,795 INFO org.apache.hadoop.ipc.Server: IPC Server handler
> 3 on 28404: starting
> 2010-09-26 09:52:42,795 INFO org.apache.hadoop.ipc.Server: IPC Server handler
> 2 on 28404: starting
> 2010-09-26 09:52:42,795 INFO org.apache.hadoop.ipc.Server: IPC Server handler
> 4 on 28404: starting
> 2010-09-26 09:52:42,795 INFO org.apache.hadoop.ipc.Server: IPC Server handler
> 5 on 28404: starting
> 2010-09-26 09:52:42,796 INFO org.apache.hadoop.ipc.Server: IPC Server handler
> 6 on 28404: starting
> 2010-09-26 09:52:42,796 INFO org.apache.hadoop.ipc.Server: IPC Server handler
> 7 on 28404: starting
> 2010-09-26 09:52:42,796 INFO org.apache.hadoop.ipc.Server: IPC Server handler
> 8 on 28404: starting
> 2010-09-26 09:52:42,796 INFO org.apache.hadoop.ipc.Server: IPC Server handler
> 9 on 28404: starting
> 2010-09-26 09:52:42,796 INFO org.apache.hadoop.ipc.Server: IPC Server handler
> 10 on 28404: starting
> 2010-09-26 09:52:42,796 INFO org.apache.hadoop.ipc.Server: IPC Server handler
> 11 on 28404: starting
> 2010-09-26 09:52:42,796 INFO org.apache.hadoop.ipc.Server: IPC Server handler
> 12 on 28404: starting
> 2010-09-26 09:52:42,797 INFO org.apache.hadoop.ipc.Server: IPC Server handler
> 13 on 28404: starting
> 2010-09-26 09:52:42,797 INFO org.apache.hadoop.ipc.Server: IPC Server handler
> 14 on 28404: starting
> 2010-09-26 09:52:42,797 INFO org.apache.hadoop.ipc.Server: IPC Server handler
> 15 on 28404: starting
> 2010-09-26 09:52:42,797 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker
> up at: localhost/127.0.0.1:28404
> 2010-09-26 09:52:42,797 INFO org.apache.hadoop.mapred.TaskTracker: Starting
> tracker tracker_localhost:localhost/127.0.0.1:28404
> 2010-09-26 09:52:55,025 INFO org.apache.hadoop.mapred.TaskTracker: Starting
> thread: Map-events fetcher for all reduce tasks on
> tracker_localhost:localhost/127.0.0.1:28404
> 2010-09-26 09:52:55,027 INFO org.apache.hadoop.mapred.TaskTracker:  Using
> MemoryCalculatorPlugin :
> org.apache.hadoop.util.linuxmemorycalculatorplu...@2de12f6d
> 2010-09-26 09:52:55,031 WARN org.apache.hadoop.mapred.TaskTracker:
> TaskTracker's tot