Re: Possibility to specify some type of files in a directory as input

2008-06-05 Thread

Set the input path to a glob pattern, for example: dir1/type1*.txt
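
To illustrate which files such a pattern would select, here is a minimal sketch. It uses the JDK's own glob matcher (java.nio.file.PathMatcher), not Hadoop's path expansion, so treat it only as an approximation of how the job would resolve the pattern. The class name GlobDemo and the method matching are hypothetical names made up for this example, and the actual job configuration call is not shown.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: applies a glob to a list of file names using the
// JDK matcher. This is NOT Hadoop code; it only illustrates the pattern.
final class GlobDemo {

    // Returns the names that match the glob, in their original order.
    static List<String> matching(String glob, List<String> fileNames) {
        PathMatcher matcher =
                FileSystems.getDefault().getPathMatcher("glob:" + glob);
        List<String> selected = new ArrayList<>();
        for (String name : fileNames) {
            if (matcher.matches(Paths.get(name))) {
                selected.add(name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // File names taken from the question in this thread.
        List<String> files = Arrays.asList(
                "type1_1.txt", "type1_2.txt", "type1_3.txt",
                "type2_1.txt", "type2_2.txt");
        // Prints only the names that start with "type1" and end in ".txt".
        System.out.println(matching("type1*.txt", files));
    }
}
```

With the five file names from the question, only the three type1 files match, so the type2 files would be left out of the input.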


Hi,
  I need help setting up my map-reduce job to consider only certain types
of files in a specific directory as input.
For example, suppose there is a directory dir1 containing these files:
type1_1.txt
type1_2.txt
type1_3.txt
type2_1.txt
type2_2.txt
and I want only the files whose names start with type1 to be used as
input to my mapper. Can someone please let me know how to specify
this when configuring the job?

Thanks


-- 
View this message in context:
http://www.nabble.com/Possibility-to-specify-some-type-of-files-in-a-directory-as-input-tp17665598p17665598.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.




question about hadoop 0.17 upgrade

2008-05-25 Thread
 

I upgraded from 0.16.3 to 0.17, and errors appear when I start DFS and the
JobTracker. How can I fix this? Thanks!

 

I used the "start-dfs.sh -upgrade" command to upgrade the filesystem.

 

Below is the error log:

 

2008-05-26 09:14:33,463 INFO org.apache.hadoop.mapred.JobTracker:
STARTUP_MSG: 

/

STARTUP_MSG: Starting JobTracker

STARTUP_MSG:   host = test180.sqa/192.168.207.180

STARTUP_MSG:   args = []

STARTUP_MSG:   version = 0.17.0

STARTUP_MSG:   build =
http://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.17 -r 656523;
compiled by 'hadoopqa' on Thu May 15 07:22:55 UTC 2008

/

2008-05-26 09:14:33,567 INFO org.apache.hadoop.ipc.metrics.RpcMetrics:
Initializing RPC Metrics with hostName=JobTracker, port=9001

2008-05-26 09:14:33,610 INFO org.apache.hadoop.ipc.Server: IPC Server
Responder: starting

2008-05-26 09:14:33,611 INFO org.apache.hadoop.ipc.Server: IPC Server
listener on 9001: starting

2008-05-26 09:14:33,611 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 0 on 9001: starting

2008-05-26 09:14:33,611 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 1 on 9001: starting

2008-05-26 09:14:33,612 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 2 on 9001: starting

2008-05-26 09:14:33,612 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 3 on 9001: starting

2008-05-26 09:14:33,612 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 4 on 9001: starting

2008-05-26 09:14:33,612 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 5 on 9001: starting

2008-05-26 09:14:33,612 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 6 on 9001: starting

2008-05-26 09:14:33,612 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 7 on 9001: starting

2008-05-26 09:14:33,613 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 8 on 9001: starting

2008-05-26 09:14:33,613 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 9 on 9001: starting

2008-05-26 09:14:33,664 INFO org.mortbay.util.Credential: Checking Resource
aliases

2008-05-26 09:14:33,733 INFO org.mortbay.http.HttpServer: Version Jetty/5.1.4

2008-05-26 09:14:33,734 INFO org.mortbay.util.Container: Started
HttpContext[/static,/static]

2008-05-26 09:14:33,734 INFO org.mortbay.util.Container: Started
HttpContext[/logs,/logs]

2008-05-26 09:14:33,962 INFO org.mortbay.util.Container: Started
[EMAIL PROTECTED]

2008-05-26 09:14:33,998 INFO org.mortbay.util.Container: Started
WebApplicationContext[/,/]

2008-05-26 09:14:34,000 INFO org.mortbay.http.SocketListener: Started
SocketListener on 0.0.0.0:50030

2008-05-26 09:14:34,000 INFO org.mortbay.util.Container: Started
[EMAIL PROTECTED]

2008-05-26 09:14:34,002 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
Initializing JVM Metrics with processName=JobTracker, sessionId=

2008-05-26 09:14:34,003 INFO org.apache.hadoop.mapred.JobTracker: JobTracker
up at: 9001

2008-05-26 09:14:34,003 INFO org.apache.hadoop.mapred.JobTracker: JobTracker
webserver: 50030

2008-05-26 09:14:34,096 INFO org.apache.hadoop.mapred.JobTracker: problem
cleaning system directory: /home/hadoop/HadoopInstall/tmp/mapred/system

org.apache.hadoop.ipc.RemoteException:
org.apache.hadoop.dfs.SafeModeException: Cannot delete
/home/hadoop/HadoopInstall/tmp/mapred/system. Name node is in safe mode.

The ratio of reported blocks 0. has not reached the threshold 0.9990.
Safe mode will be turned off automatically.

at org.apache.hadoop.dfs.FSNamesystem.deleteInternal(FSNamesystem.java:1519)

at org.apache.hadoop.dfs.FSNamesystem.delete(FSNamesystem.java:1498)

at org.apache.hadoop.dfs.NameNode.delete(NameNode.java:383)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

at java.lang.reflect.Method.invoke(Method.java:597)

at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:446)

at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)

 

at org.apache.hadoop.ipc.Client.call(Client.java:557)

at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:212)

at org.apache.hadoop.dfs.$Proxy4.delete(Unknown Source)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

at java.lang.reflect.Method.invoke(Method.java:597)

at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)

at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)

at org.apache.hadoop.dfs.$Proxy4.delete(Unknown Source)

at o