help : error run pig

2010-09-27 Thread Ngô Văn Vĩ
I am running Pig in Hadoop mode (Pig 0.7.0 and Hadoop 0.20.2) and get the
following error:
ng...@master:~/pig-0.7.0$ bin/pig
10/09/27 08:39:40 INFO pig.Main: Logging error messages to: /home/ngovi/pig-0.7.0/pig_1285601980268.log
2010-09-27 08:39:40,538 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://master:54310/
2010-09-27 08:39:41,760 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: master/192.168.230.130:54310. Already tried 0 time(s).
2010-09-27 08:39:42,762 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: master/192.168.230.130:54310. Already tried 1 time(s).
2010-09-27 08:39:43,763 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: master/192.168.230.130:54310. Already tried 2 time(s).
2010-09-27 08:39:44,765 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: master/192.168.230.130:54310. Already tried 3 time(s).
2010-09-27 08:39:45,766 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: master/192.168.230.130:54310. Already tried 4 time(s).
2010-09-27 08:39:46,767 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: master/192.168.230.130:54310. Already tried 5 time(s).
2010-09-27 08:39:47,768 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: master/192.168.230.130:54310. Already tried 6 time(s).
2010-09-27 08:39:48,769 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: master/192.168.230.130:54310. Already tried 7 time(s).
2010-09-27 08:39:49,770 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: master/192.168.230.130:54310. Already tried 8 time(s).
2010-09-27 08:39:50,771 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: master/192.168.230.130:54310. Already tried 9 time(s).
2010-09-27 08:39:50,780 [main] ERROR org.apache.pig.Main - ERROR 2999: Unexpected internal error. Failed to create DataStorage

Can anyone help?
Thanks
-- 
Ngô Văn Vĩ
Software Engineering
Phone: 01695893851


[jira] Updated: (PIG-1642) Order by doesn't use estimation to determine the parallelism

2010-09-27 Thread Richard Ding (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-1642:
--

  Status: Resolved  (was: Patch Available)
Hadoop Flags: [Reviewed]
  Resolution: Fixed

Patch committed to both trunk and 0.8 branch.

 Order by doesn't use estimation to determine the parallelism
 

 Key: PIG-1642
 URL: https://issues.apache.org/jira/browse/PIG-1642
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Richard Ding
Assignee: Richard Ding
 Fix For: 0.8.0

 Attachments: PIG-1642.patch, PIG-1642_1.patch, PIG-1642_1.patch


 With PIG-1249, a simple heuristic is used to determine the number of reducers 
 if it isn't specified (via PARALLEL or default_parallel). For an order by 
 statement, however, it still defaults to 1.
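
To make the kind of heuristic discussed here concrete, the following is a minimal sketch of a size-based reducer estimate. It is not the code from PIG-1249 or from the attached patches. The class name, the method name, and both constants (about 1 GB of input per reducer, capped at 999 reducers) are assumptions chosen for illustration, so check the documentation of your Pig version for the real property names and defaults.

```java
public class ReducerEstimate {
    // Assumed values, for illustration only; not taken from the issue text.
    static final long BYTES_PER_REDUCER = 1_000_000_000L;
    static final int MAX_REDUCERS = 999;

    /**
     * Estimates a reducer count from the total input size: one reducer per
     * BYTES_PER_REDUCER bytes (rounded up), clamped to [1, MAX_REDUCERS].
     */
    static int estimateReducers(long totalInputBytes) {
        // Ceiling division without floating point.
        long byInput = (totalInputBytes + BYTES_PER_REDUCER - 1) / BYTES_PER_REDUCER;
        return (int) Math.max(1, Math.min(MAX_REDUCERS, byInput));
    }

    public static void main(String[] args) {
        // A 20 GB input yields 20 reducers under the assumed constants.
        System.out.println(estimateReducers(20_000_000_000L));
    }
}
```

The point of the issue is that an order by statement with no explicit PARALLEL clause should receive an estimate of this kind instead of always running with a single reducer.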

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1647) Logical simplifier throws a NPE

2010-09-27 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915365#action_12915365
 ] 

Daniel Dai commented on PIG-1647:
-

+1. Please commit.

 Logical simplifier throws a NPE
 ---

 Key: PIG-1647
 URL: https://issues.apache.org/jira/browse/PIG-1647
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Yan Zhou
Assignee: Yan Zhou
 Fix For: 0.8.0

 Attachments: PIG-1647.patch, PIG-1647.patch


 A query like:
 A = load 'd.txt' as (a:chararray, b:long, c:map[], d:chararray, e:chararray);
 B = filter A by a == 'v' and b == 117L and c#'p1' == 'h' and c#'p2' == 'to' 
 and ((d is not null and d != '') or (e is not null and e != ''));
 will cause the logical expression simplifier to throw an NPE.




[jira] Updated: (PIG-1647) Logical simplifier throws a NPE

2010-09-27 Thread Yan Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Zhou updated PIG-1647:
--

Status: Resolved  (was: Patch Available)
Resolution: Fixed

Patch committed to both trunk and the 0.8 branch.




[jira] Commented: (PIG-1641) Incorrect counters in local mode

2010-09-27 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915408#action_12915408
 ] 

Ashutosh Chauhan commented on PIG-1641:
---

Tested manually in local mode. The messages were the same as proposed above. 
+1 for the commit. One minor suggestion is to add a line at the start saying 
something like: "Detected Local mode. Stats reported below may be incomplete." 
This will reinforce the message to users that stats reporting is not 
transparent across the different modes (local vs. map-reduce).

 Incorrect counters in local mode
 

 Key: PIG-1641
 URL: https://issues.apache.org/jira/browse/PIG-1641
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Ashutosh Chauhan
Assignee: Richard Ding
 Fix For: 0.8.0

 Attachments: PIG-1641.patch


 User report, not verified.
 <email>
 HadoopVersion  PigVersion      UserId  StartedAt            FinishedAt           Features
 0.20.2         0.8.0-SNAPSHOT  user    2010-09-21 19:25:58  2010-09-21 21:58:42  ORDER_BY
 Success!
 Job Stats (time in seconds):
 JobId           Maps  Reduces  MaxMapTime  MinMapTIme  AvgMapTime  MaxReduceTime  MinReduceTime  AvgReduceTime  Alias      Feature   Outputs
 job_local_0001  0     0        0           0           0           0              0              0              raw        MAP_ONLY
 job_local_0002  0     0        0           0           0           0              0              0              rank_sort  SAMPLER
 job_local_0003  0     0        0           0           0           0              0              0              rank_sort  ORDER_BY  Processed/user_visits_table,
 Input(s):
 Successfully read 0 records from: Data/Raw/UserVisits.dat
 Output(s):
 Successfully stored 0 records in: Processed/user_visits_table
 However, when I look in the output:
 $ ls -lh Processed/user_visits_table/CG0/
 total 15250760
 -rwxrwxrwx  1 user  _lpoperator   7.3G Sep 21 21:58 part-0*
 It read a 20G input file and generated some output...
 </email>
 Is it that counters are not available in local mode? If so, instead of 
 printing zeros we should print "Information Unavailable" or some such.




[jira] Updated: (PIG-1641) Incorrect counters in local mode

2010-09-27 Thread Richard Ding (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-1641:
--

  Status: Resolved  (was: Patch Available)
Hadoop Flags: [Reviewed]
  Resolution: Fixed

Patch committed to both trunk and 0.8 branch.




Re: help : error run pig

2010-09-27 Thread Alan Gates
Pig is failing to connect to your namenode.  Is the address Pig is  
trying to use (hdfs://master:54310/) correct?  Can you connect using  
that string from the same machine using bin/hadoop?


Alan.
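
As a first check of the connectivity question raised above, the sketch below tests whether a TCP connection to the namenode address can be opened at all. It is a rough diagnostic under stated assumptions, not a Pig or Hadoop API: the class and method names are made up for illustration, the default host and port are taken from the log in the original message (hdfs://master:54310/), and a successful connect only shows that something is listening on that port, not that it is a healthy namenode.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class NameNodePortCheck {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers connection refused, timeouts, and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults come from the log in the original message; override on the command line.
        String host = args.length > 0 ? args[0] : "master";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 54310;
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 2000));
    }
}
```

If this prints false on the machine where Pig runs, the namenode is probably not running or is listening on a different address than the one Pig was configured with.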





[jira] Created: (PIG-1649) FRJoin fails to compute number of input files for replicated input

2010-09-27 Thread Thejas M Nair (JIRA)
FRJoin fails to compute number of input files for replicated input
--

 Key: PIG-1649
 URL: https://issues.apache.org/jira/browse/PIG-1649
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.8.0


In FRJoin, if the input path contains curly braces, Pig fails to compute the 
number of input files and logs the following exception:

10/09/27 14:31:13 WARN mapReduceLayer.MRCompiler: failed to get number of input files
java.net.URISyntaxException: Illegal character in path at index 12: /user/tejas/{std*txt}
    at java.net.URI$Parser.fail(URI.java:2809)
    at java.net.URI$Parser.checkChars(URI.java:2982)
    at java.net.URI$Parser.parseHierarchical(URI.java:3066)
    at java.net.URI$Parser.parse(URI.java:3024)
    at java.net.URI.<init>(URI.java:578)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.hasTooManyInputFiles(MRCompiler.java:1283)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.visitFRJoin(MRCompiler.java:1203)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFRJoin.visit(POFRJoin.java:188)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.compile(MRCompiler.java:475)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.compile(MRCompiler.java:454)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.compile(MRCompiler.java:336)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.compile(MapReduceLauncher.java:468)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:116)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:301)
    at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1197)
    at org.apache.pig.PigServer.storeEx(PigServer.java:873)
    at org.apache.pig.PigServer.store(PigServer.java:815)
    at org.apache.pig.PigServer.openIterator(PigServer.java:727)
    at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:612)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:301)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:76)
    at org.apache.pig.Main.run(Main.java:453)
    at org.apache.pig.Main.main(Main.java:107)

This does not cause the query to fail. But since the number of input files is 
not calculated, the optimizations added in PIG-1458 to reduce load on the name 
node will not be used.
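
The exception above can be reproduced outside Pig with the single-argument java.net.URI constructor, which rejects a curly brace in the path. The sketch below only demonstrates that behaviour. The class and method names are made up for illustration, and it does not show how MRCompiler builds its URI or how the issue was fixed.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class GlobPathUri {
    /** Returns the index of the first character java.net.URI rejects, or -1 if the string parses. */
    static int firstIllegalIndex(String path) {
        try {
            new URI(path);
            return -1;
        } catch (URISyntaxException e) {
            return e.getIndex();
        }
    }

    public static void main(String[] args) {
        // The glob path from the log: '{' at index 12 is not a legal URI character.
        System.out.println(firstIllegalIndex("/user/tejas/{std*txt}")); // prints 12
        // A plain path parses without error.
        System.out.println(firstIllegalIndex("/user/tejas/std.txt"));   // prints -1
    }
}
```

Note that the asterisk in the glob is accepted by the parser; only the braces trigger the failure.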





[jira] Updated: (PIG-1637) Combiner not use because optimizor inserts a foreach between group and algebric function

2010-09-27 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-1637:


Attachment: PIG-1637-1.patch

 Combiner not use because optimizor inserts a foreach between group and 
 algebric function
 

 Key: PIG-1637
 URL: https://issues.apache.org/jira/browse/PIG-1637
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.8.0

 Attachments: PIG-1637-1.patch


 The following script no longer uses the combiner after the new optimization change.
 {code}
 A = load ':INPATH:/pigmix/page_views' using 
 org.apache.pig.test.udf.storefunc.PigPerformanceLoader()
 as (user, action, timespent, query_term, ip_addr, timestamp, 
 estimated_revenue, page_info, page_links);
 B = foreach A generate user, (int)timespent as timespent, 
 (double)estimated_revenue as estimated_revenue;
 C = group B all; 
 D = foreach C generate SUM(B.timespent), AVG(B.estimated_revenue);
 store D into ':OUTPATH:';
 {code}
 This is because, after the group, the optimizer detects that the group key is 
 not used afterward and adds a foreach statement after C. This is how the 
 script looks after optimization:
 {code}
 A = load ':INPATH:/pigmix/page_views' using 
 org.apache.pig.test.udf.storefunc.PigPerformanceLoader()
 as (user, action, timespent, query_term, ip_addr, timestamp, 
 estimated_revenue, page_info, page_links);
 B = foreach A generate user, (int)timespent as timespent, 
 (double)estimated_revenue as estimated_revenue;
 C = group B all; 
 C1 = foreach C generate B;
 D = foreach C1 generate SUM(B.timespent), AVG(B.estimated_revenue);
 store D into ':OUTPATH:';
 {code}
 That cancels the combiner optimization for D. 
 The way to solve the issue is to merge the inserted C1 with D. Currently, we 
 do not merge these two foreach statements, because one output of the first 
 foreach (B) is referenced twice in D, and the current rule assumes that after 
 the merge B would need to be calculated twice in D. In fact, C1 only does a 
 projection, with no calculation of B, so merging C1 and D will not result in 
 calculating B twice. C1 and D should therefore be merged.




[jira] Created: (PIG-1650) pig grunt shell breaks for many commands like perl , awk , pipe , 'ls -l' etc

2010-09-27 Thread niraj rai (JIRA)
pig grunt shell breaks for many commands like perl , awk , pipe , 'ls -l' etc
-

 Key: PIG-1650
 URL: https://issues.apache.org/jira/browse/PIG-1650
 Project: Pig
  Issue Type: Bug
Reporter: niraj rai
Assignee: niraj rai


The grunt shell breaks for many Unix commands.




[jira] Updated: (PIG-1650) pig grunt shell breaks for many commands like perl , awk , pipe , 'ls -l' etc

2010-09-27 Thread niraj rai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

niraj rai updated PIG-1650:
---

Attachment: PIG-1650_0.patch

This patch will fix many broken commands inside the grunt shell.




[jira] Updated: (PIG-1650) pig grunt shell breaks for many commands like perl , awk , pipe , 'ls -l' etc

2010-09-27 Thread niraj rai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

niraj rai updated PIG-1650:
---

Status: Patch Available  (was: Open)




[jira] Created: (PIG-1651) PIG class loading mishandled

2010-09-27 Thread Yan Zhou (JIRA)
PIG class loading mishandled


 Key: PIG-1651
 URL: https://issues.apache.org/jira/browse/PIG-1651
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Yan Zhou
Assignee: Richard Ding
 Fix For: 0.8.0


If zebra.jar is only registered in a Pig script and is not on the CLASSPATH, a 
query using Zebra fails. It appears that the same class is loaded into the JVM 
more than once, so a static variable set earlier is not visible after an 
instance of the class is created through reflection. (Once zebra.jar is added 
to the CLASSPATH, the query works fine.) The exception stack is as follows:

Backend error message during job submission
---
org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input splits for: hdfs://hostname/pathto/zebra_dir :: null
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:284)
    at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:907)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:801)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:752)
    at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
    at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
    at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
    at java.lang.Thread.run(Thread.java:619)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.zebra.io.ColumnGroup.getNonDataFilePrefix(ColumnGroup.java:123)
    at org.apache.hadoop.zebra.io.ColumnGroup$CGPathFilter.accept(ColumnGroup.java:2413)
    at org.apache.hadoop.zebra.mapreduce.TableInputFormat$DummyFileInputFormat$MultiPathFilter.accept(TableInputFormat.java:718)
    at org.apache.hadoop.fs.FileSystem$GlobFilter.accept(FileSystem.java:1084)
    at org.apache.hadoop.fs.FileSystem.globStatusInternal(FileSystem.java:919)
    at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:866)
    at org.apache.hadoop.zebra.mapreduce.TableInputFormat$DummyFileInputFormat.listStatus(TableInputFormat.java:780)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:246)
    at org.apache.hadoop.zebra.mapreduce.TableInputFormat.getRowSplits(TableInputFormat.java:863)
    at org.apache.hadoop.zebra.mapreduce.TableInputFormat.getSplits(TableInputFormat.java:1017)
    at org.apache.hadoop.zebra.mapreduce.TableInputFormat.getSplits(TableInputFormat.java:961)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:269)
    ... 7 more






Re: help : error run pig

2010-09-27 Thread Ngô Văn Vĩ
Can you help me? Here is my configuration.

*- bin/pig*
export JAVA_HOME=/home/ngovi/jdk1.6.0_21
export PIG_INSTALL=/home/ngovi/pig-0.7.0
export PATH=$PATH:$PIG_INSTALL/bin
export PIG_HADOOP_VERSION=0.20.2
export PIG_CLASSPATH=/home/ngovi/hadoop-0.20.2/conf/

*- conf/pig.properties*
fs.default.name=hdfs://localhost:9000/
mapred.job.tracker=localhost:9001
# log4jconf log4j configuration file

*- in hadoop-0.20.2/conf*
*core-site.xml*
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
    <description>
      the name of the default file system
    </description>
  </property>
</configuration>

*hdfs-site.xml*
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>Default block replication </description>
  </property>
</configuration>

*mapred-site.xml*
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
    <description>
      the host and port that the mapreduce job tracker run at
    </description>
  </property>
</configuration>

When I run Pig, I get the following output. Is this an error?
ng...@master:~/pig-0.7.0$ bin/pig -x mapreduce
10/09/27 18:16:29 INFO pig.Main: Logging error messages to: /home/ngovi/pig-0.7.0/pig_1285636589590.log
2010-09-27 18:16:30,029 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://localhost:9000/
2010-09-27 18:16:30,347 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: localhost:9001
grunt>


Thanks, all.





-- 
Ngô Văn Vĩ
Software Engineering
Phone: 01695893851


Re: help : error run pig

2010-09-27 Thread Jeff Zhang
It seems you connected to the right Hadoop when you started the Pig
grunt shell, but connected to the wrong Hadoop when you ran the Pig script.
Check whether there are other configuration files that override your
default configuration. Also, which machine is 192.168.230.130?





-- 
Best Regards

Jeff Zhang


Re: help : error run pig

2010-09-27 Thread Renato Marroquín Mogrovejo
I thought it was good to go.
Hey, have you tried maybe just doing a simple load test? I mean just loading
a file into grunt with the LOAD command, and then doing a DUMP on it. So
after that, we could see if there is actually something wrong with your
installation.


Renato M.

2010/9/27 Ngô Văn Vĩ ngovi.se@gmail.com

 192.168.230.130 is the IP address of my machine.
 @JeffZhang: can you explain more clearly?
 Thanks
