Integration of Pig and Hadoop fails with "Failed to create DataStorage" error.

2008-08-11 Thread us latha
Hi

Followed the link http://wiki.apache.org/pig/PigTutorial to set up Pig with
an existing Hadoop single-node cluster.
I am trying to execute the example mentioned in the link under the section
"Pig Scripts: Hadoop Cluster".

Steps followed:
-
1) Set up Hadoop on a single node. Storing and retrieving data on DFS was
successful.
2) On the same node, compiled the Pig code and tried the following command:
java -cp $PIGDIR/pig.jar:$HADOOPSITEPATH org.apache.pig.Main script1-hadoop.pig


I am getting the following stack trace:


]$ java -cp $PIGDIR/pig.jar:$HADOOPSITEPATH org.apache.pig.Main script1-hadoop.pig

2008-08-11 12:09:23,829 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://localhost.localdomain:54310
java.lang.RuntimeException: Failed to create DataStorage
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:56)
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.<init>(HDataStorage.java:39)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:160)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:108)
    at org.apache.pig.impl.PigContext.connect(PigContext.java:177)
    at org.apache.pig.PigServer.<init>(PigServer.java:149)
    at org.apache.pig.tools.grunt.Grunt.<init>(Grunt.java:43)
    at org.apache.pig.Main.main(Main.java:293)
Caused by: java.net.SocketTimeoutException: timed out waiting for rpc response
    at org.apache.hadoop.ipc.Client.call(Client.java:559)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:212)
    at org.apache.hadoop.dfs.$Proxy0.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:313)
    at org.apache.hadoop.dfs.DFSClient.createRPCNamenode(DFSClient.java:102)
    at org.apache.hadoop.dfs.DFSClient.<init>(DFSClient.java:178)
    at org.apache.hadoop.dfs.DistributedFileSystem.initialize(DistributedFileSystem.java:68)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1280)
    at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:56)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1291)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:203)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:108)
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:53)
    ... 7 more


Could you please help me resolve the above issue? My Hadoop setup is working
fine, and the NameNode web UI responds at http://localhost:50070/. The
hadoop-site.xml I am using is as follows.
---

<configuration>

  <property>
    <name>hadoop.tmp.dir</name>
    <value>/tmp/hadoop-suravako</value>
    <description>A base for other temporary directories.</description>
  </property>

  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost.localdomain:54310</value>
  </property>

  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:54311</value>
  </property>

  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>

</configuration>

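The "Caused by: java.net.SocketTimeoutException" above means the Pig client never got an RPC reply from the NameNode, so a first sanity check is whether the RPC port from fs.default.name (54310) is reachable at all. A minimal sketch in Python (the helper function is mine, not part of Pig or Hadoop):

```python
import socket

def rpc_port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    # Host and port taken from fs.default.name in hadoop-site.xml.
    print(rpc_port_open("localhost.localdomain", 54310))
```

If the port is open but Pig still times out, a frequently reported cause on the lists in this era was a mismatch between the Hadoop client classes bundled into pig.jar and the version the cluster runs, so it is also worth checking that the Pig build matches the cluster's Hadoop release.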

Please let me know if I am missing something.

Thank you
Srilatha


Re: Where can I download hadoop-0.17.1-examples.jar

2008-07-31 Thread us latha
Thank you, Amareshwari. I was looking in the source tree of the 0.17.1
release, which I had checked out through svn, hence could not find it
previously. Now I am able to run the wordcount example successfully.

Thank you
Srilatha

On Thu, Jul 31, 2008 at 10:17 AM, Amareshwari Sriramadasu <
[EMAIL PROTECTED]> wrote:

> Hi Srilatha,
>
> You can download hadoop release tar ball from
> http://hadoop.apache.org/core/releases.html
> You will find hadoop-*-examples.jar when you untar it.
>
> Thanks,
> Amareshwari
>
>
> us latha wrote:
>
>> Hi All,
>>
>> I am trying to run the wordcount example on a single-node Hadoop setup.
>> Could anyone please point me to the location from where I could download
>> hadoop-0.17.1-examples.jar?
>>
>> Thank you
>> Srilatha
>>
>>
>>
>
>


Where can I download hadoop-0.17.1-examples.jar

2008-07-30 Thread us latha
Hi All,

I am trying to run the wordcount example on a single-node Hadoop setup.
Could anyone please point me to the location from where I could download
hadoop-0.17.1-examples.jar?

Thank you
Srilatha


Unable to run wordcount example on a two-node cluster

2008-07-06 Thread us latha
Hi All,

I have the following setup
(Nodes 1 and 2 run Red Hat Linux 4; Nodes 3 and 4 run Red Hat Linux 3):

Node1 -> namenode
Node2 -> job tracker
Node3 -> slave (data node)
Node4 -> slave (data node)

I was able to load some data into DFS and could verify that
the data is properly stored on the data nodes.

[NODE1]$ bin/hadoop dfs -ls

Found 3 items
drwxr-xr-x   - user1 supergroup      0 2008-07-04 07:10 /user/user1/input
drwxr-xr-x   - user1 supergroup      0 2008-07-04 09:17 /user/user1/test3
-rw-r--r--   3 user1 supergroup   3951 2008-07-04 07:10 /user/user1/wordcount.jar


Now, I am trying to run the wordcount example mentioned in the link
http://hadoop.apache.org/core/docs/r0.17.0/mapred_tutorial.html

Followed steps:

1) [NODE1]$ javac -classpath ${HADOOP_HOME}/hadoop-0.17.1-core.jar -d wordcount_classes WordCount.java

2) [NODE1]$ jar -cvf wordcount.jar -C wordcount_classes/ .

3) [NODE1]$ bin/hadoop dfs -copyFromLocal wordcount.jar wordcount.jar

4) [NODE1]$ bin/hadoop jar wordcount.jar org.myorg.WordCount input output

The output is as follows:

[NODE1]$ bin/hadoop jar wordcount.jar org.myorg.WordCount input output2
08/07/06 03:10:23 INFO mapred.FileInputFormat: Total input paths to process : 3
08/07/06 03:10:23 INFO mapred.FileInputFormat: Total input paths to process : 3
08/07/06 03:10:24 INFO mapred.JobClient: Running job: job_200806290715_0027
08/07/06 03:10:25 INFO mapred.JobClient:  map 0% reduce 0%

*** It hangs here forever

The log file on Node1 (namenode) says ...

java.io.IOException: Inconsistent checkpoint fileds. LV = -16 namespaceID =
315235321 cTime = 0. Expecting respectively: -16; 902613609; 0
    at org.apache.hadoop.dfs.CheckpointSignature.validateStorageInfo(CheckpointSignature.java:65)
    at org.apache.hadoop.dfs.SecondaryNameNode$CheckpointStorage.doMerge(SecondaryNameNode.java:568)
    at org.apache.hadoop.dfs.SecondaryNameNode$CheckpointStorage.access$000(SecondaryNameNode.java:464)
    at org.apache.hadoop.dfs.SecondaryNameNode.doMerge(SecondaryNameNode.java:341)
    at org.apache.hadoop.dfs.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:305)
    at org.apache.hadoop.dfs.SecondaryNameNode.run(SecondaryNameNode.java:216)

(from hadoop-suravako-secondarynamenode-stapj13.out)

-
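The important part of that trace is the pair of namespaceIDs: the checkpoint on disk carries 315235321 while the running namenode expects 902613609. A small Python sketch (a hypothetical helper, not part of Hadoop) that pulls the two IDs out of such a message:

```python
import re

def checkpoint_namespace_ids(message):
    """Extract (found, expected) namespaceIDs from a CheckpointSignature
    mismatch message such as the one in the log above; None if absent."""
    found = re.search(r"namespaceID = (\d+)", message)
    expected = re.search(r"Expecting respectively: -?\d+; (\d+);", message)
    if not (found and expected):
        return None
    return int(found.group(1)), int(expected.group(1))

msg = ("Inconsistent checkpoint fileds. LV = -16 namespaceID = 315235321 "
       "cTime = 0. Expecting respectively: -16; 902613609; 0")
print(checkpoint_namespace_ids(msg))  # → (315235321, 902613609)
```

Seeing two different IDs like this is the classic symptom of a namenode that was re-formatted while older storage directories survived elsewhere in the cluster.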

Please let me know if I am missing something, and please help me resolve the
above issue.

I shall provide any specific log info if required.
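For what it is worth, a namespaceID mismatch like the one above typically appears after the namenode is re-formatted while the secondary namenode or a datanode keeps its old storage directory; the remedies commonly suggested on the lists were either clearing the stale storage directory (losing its blocks) or aligning the namespaceID in its current/VERSION file with the namenode's. A sketch of that text edit in Python (the VERSION-file fields shown are an assumption based on 0.17-era storage directories):

```python
import re

def set_namespace_id(version_text, new_id):
    """Return VERSION-file text with its namespaceID line set to new_id."""
    return re.sub(r"(?m)^namespaceID=\d+$", f"namespaceID={new_id}", version_text)

old = "namespaceID=315235321\ncTime=0\nstorageType=NAME_NODE\nlayoutVersion=-16\n"
print(set_namespace_id(old, 902613609))
```

Stop all daemons before touching anything under dfs.name.dir or fs.checkpoint.dir, and on a datanode the wipe-and-resync route is the safer of the two.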

Thank you

Srilatha