Fabio,

It looks like you need to set your environment variables to connect to cassandra. Check out the readme. Quoting here:

Finally, set the following as environment variables (uppercase, underscored), or as Hadoop configuration variables (lowercase, dotted):
* PIG_RPC_PORT or cassandra.thrift.port : the port thrift is listening on
* PIG_INITIAL_ADDRESS or cassandra.thrift.address : initial address to connect to
* PIG_PARTITIONER or cassandra.partitioner.class : cluster partitioner
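
For what it's worth, the trace you posted fits that diagnosis: ConfigHelper.getRpcPort passes the port setting to Integer.parseInt, and a NumberFormatException on null is what Java throws when there is no value to parse. Below is a minimal sketch of a check you could run in the shell before starting pig. The name check_pig_cassandra_env is hypothetical (my own, not something that ships with Pig or Cassandra), and it only looks at the three environment variables from the readme, not at the Hadoop configuration equivalents.

```shell
# Hypothetical helper (my own name, not part of Pig or Cassandra).
# Reports which of the three readme variables are unset or empty, and
# whether PIG_RPC_PORT contains anything other than digits.
check_pig_cassandra_env() {
    status=0
    for name in PIG_INITIAL_ADDRESS PIG_RPC_PORT PIG_PARTITIONER; do
        # Indirect lookup of the variable called $name (POSIX sh has no ${!name}).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "not set: $name" >&2
            status=1
        fi
    done
    case "${PIG_RPC_PORT:-}" in
        '') ;;  # empty or unset: already reported by the loop above
        *[!0-9]*)
            echo "not a number: PIG_RPC_PORT=$PIG_RPC_PORT" >&2
            status=1
            ;;
    esac
    return "$status"
}
```

If the function prints nothing and returns 0, all three variables are set and the port is made of digits only; something like `check_pig_cassandra_env && pig` would then start grunt only when the check passes.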

So you'll probably want to do:

export PIG_INITIAL_ADDRESS=localhost
export PIG_RPC_PORT=9160
export PIG_PARTITIONER=org.apache.cassandra.dht.RandomPartitioner

All the best, and let me know if this doesn't work,

Jeremy

On Apr 5, 2011, at 9:38 AM, Fabio Souto wrote:

> Hi Jeremy,
>
> Of course, here it is:
>
> Backend error message
> ---------------------
> java.lang.NumberFormatException: null
>     at java.lang.Integer.parseInt(Integer.java:417)
>     at java.lang.Integer.parseInt(Integer.java:499)
>     at org.apache.cassandra.hadoop.ConfigHelper.getRpcPort(ConfigHelper.java:233)
>     at org.apache.cassandra.hadoop.pig.CassandraStorage.setConnectionInformation(Unknown Source)
>     at org.apache.cassandra.hadoop.pig.CassandraStorage.setLocation(Unknown Source)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.mergeSplitSpecificConf(PigInputFormat.java:133)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.createRecordReader(PigInputFormat.java:111)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:613)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:322)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
>     at org.apache.hadoop.mapred.Child.main(Child.java:234)
>
> Pig Stack Trace
> ---------------
> ERROR 2997: Unable to recreate exception from backed error: java.lang.NumberFormatException: null
>
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias A. Backend error : Unable to recreate exception from backed error: java.lang.NumberFormatException: null
>     at org.apache.pig.PigServer.openIterator(PigServer.java:742)
>     at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:612)
>     at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141)
>     at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:76)
>     at org.apache.pig.Main.run(Main.java:465)
>     at org.apache.pig.Main.main(Main.java:107)
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2997: Unable to recreate exception from backed error: java.lang.NumberFormatException: null
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getErrorMessages(Launcher.java:221)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:151)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:337)
>     at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:378)
>     at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1198)
>     at org.apache.pig.PigServer.storeEx(PigServer.java:874)
>     at org.apache.pig.PigServer.store(PigServer.java:816)
>     at org.apache.pig.PigServer.openIterator(PigServer.java:728)
>     ... 7 more
> ================================================================================
>
> Thanks for all,
> Fabio
>
> On 05/04/2011, at 16:19, Jeremy Hanna wrote:
>
>> Fabio,
>>
>> Could you post the full stack trace that's found in the pig_<long number>.log that's in the directory that you ran pig?
>>
>> Thanks,
>>
>> Jeremy
>>
>> On Apr 5, 2011, at 8:42 AM, Fabio Souto wrote:
>>
>>> Hello,
>>>
>>> I have installed Pig 0.8.0 and Cassandra 0.7.4 and I'm not able to read data from cassandra. I write a simple query just to test:
>>>
>>> grunt> A = LOAD 'cassandra://msg_keyspace/messages' USING org.apache.cassandra.hadoop.pig.CassandraStorage();
>>>
>>> grunt> dump A;
>>>
>>> And i'm getting the following error:
>>> ==========================================================================
>>> 2011-04-05 15:33:57,669 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
>>> 2011-04-05 15:33:57,669 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - pig.usenewlogicalplan is set to true. New logical plan will be used.
>>> 2011-04-05 15:33:57,819 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name: A: Store(hdfs://localhost/tmp/temp2037710644/tmp-29784200:org.apache.pig.impl.io.InterStorage) - scope-1 Operator Key: scope-1)
>>> 2011-04-05 15:33:57,850 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
>>> 2011-04-05 15:33:57,877 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
>>> 2011-04-05 15:33:57,877 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
>>> 2011-04-05 15:33:57,969 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
>>> 2011-04-05 15:33:57,990 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
>>> 2011-04-05 15:34:03,376 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
>>> 2011-04-05 15:34:03,416 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
>>> 2011-04-05 15:34:03,929 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
>>> 2011-04-05 15:34:04,597 [Thread-5] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
>>> 2011-04-05 15:34:05,942 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201104051459_0008
>>> 2011-04-05 15:34:05,943 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://localhost:50030/jobdetails.jsp?jobid=job_201104051459_0008
>>> 2011-04-05 15:34:35,912 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_201104051459_0008 has failed! Stop running all dependent jobs
>>> 2011-04-05 15:34:35,918 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
>>> 2011-04-05 15:34:35,931 [main] ERROR org.apache.pig.tools.pigstats.PigStats - ERROR 2997: Unable to recreate exception from backed error: java.lang.NumberFormatException: null
>>> 2011-04-05 15:34:35,931 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
>>> 2011-04-05 15:34:35,933 [main] INFO org.apache.pig.tools.pigstats.PigStats - Script Statistics:
>>>
>>> HadoopVersion PigVersion UserId StartedAt FinishedAt Features
>>> 0.20.2-CDH3B4 0.8.0-SNAPSHOT root 2011-04-05 15:33:57 2011-04-05 15:34:35 UNKNOWN
>>>
>>> Failed!
>>>
>>> Failed Jobs:
>>> JobId Alias Feature Message Outputs
>>> job_201104051459_0008 A MAP_ONLY Message: Job failed! Error - NA hdfs://localhost/tmp/temp2037710644/tmp-29784200,
>>>
>>> Input(s):
>>> Failed to read data from "cassandra://msg_keyspace/messages"
>>>
>>> Output(s):
>>> Failed to produce result in "hdfs://localhost/tmp/temp2037710644/tmp-29784200"
>>> ==========================================================================
>>>
>>> Any idea how to fix this?
>>> Cheers
>>
>