Pig in Windows)

Satish Kolli Sun, 13 Jul 2014 09:51:05 -0700

Some Pig operations don't work on Windows, especially with Hadoop 1.x.
The following article describes a couple of options (hacks) that you can use to
run Pig scripts on Windows.


http://simpletoad.blogspot.com/2013/05/pigunit-issue-on-windows.html?m=1
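
To illustrate why the job fails at the staging directory: judging from the
stack trace quoted below, Hadoop 1.x's FileUtil.setPermission appears to rely
on the java.io.File permission setters and throws an IOException when one of
them returns false. The Javadoc for File.setReadable says the call fails when
readable is false and the file system has no read permission to clear, which
is the case on Windows. Here is a minimal sketch of that behaviour using only
the JDK. The names PermissionProbe and trySetOwnerOnly are made up for
illustration and are not part of Hadoop or Pig, and the exact sequence of
calls Hadoop makes is an assumption.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class PermissionProbe {

    /**
     * Tries to apply an owner-only (0700 style) permission set using nothing
     * but java.io.File, and reports whether every individual call succeeded.
     * Hypothetical helper: it only approximates what Hadoop 1.x seems to do.
     */
    public static boolean trySetOwnerOnly(File dir) {
        boolean ok = true;
        // For each bit: clear it for everybody, then grant it to the owner.
        // The non-short-circuit &= makes sure every call is attempted.
        ok &= dir.setReadable(false, false);
        ok &= dir.setReadable(true, true);
        ok &= dir.setWritable(false, false);
        ok &= dir.setWritable(true, true);
        ok &= dir.setExecutable(false, false);
        ok &= dir.setExecutable(true, true);
        return ok;
    }

    public static void main(String[] args) throws IOException {
        // A throwaway directory standing in for the .staging directory.
        File dir = Files.createTempDirectory("staging-probe").toFile();
        dir.deleteOnExit();

        boolean ok = trySetOwnerOnly(dir);
        System.out.println("os.name = " + System.getProperty("os.name"));
        System.out.println("all permission calls succeeded = " + ok);
    }
}
```

On Linux I would expect every call to succeed. On Windows at least the
"clear" calls should return false, which is the condition that a
return-value check like the one in the stack trace would turn into
"Failed to set permissions of path".
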
On Jul 13, 2014 12:42 PM, "Krishnan K" <[email protected]> wrote:

> Hi Suraj,
>
> Thanks for replying.
>
>
> \tmp\hadoop-krkrishnamoorthy\mapred\staging\krkrishnamoorthy502928296\.staging
> seems to refer to a path in the Unix filesystem. I don't have c:\temp.
>
> It is trying to set the permissions to 700, which should apparently be
> possible in a Unix environment.
>
> I'll try to set up Cygwin. Is that all that is required?
>
> Thanks!
>
>
>
>
> On Sun, Jul 13, 2014 at 7:51 AM, Suraj Nayak M <[email protected]> wrote:
>
> >  Hi Krishnan,
> >
> > Regarding the error, I can see this line:
> >
> > 14/07/12 17:55:31 ERROR security.UserGroupInformation:
> > PriviledgedActionException as:krkrishnamoorthy cause:java.io.IOException:
> > Failed to set permissions of path:
> > \tmp\hadoop-krkrishnamoorthy\mapred\staging\krkrishnamoorthy502928296\.staging
> > to 0700
> >
> > Do you have a C:\tmp directory on your C: drive?
> >
> > I have added my replies to your questions inline below.
> >
> >
> > On Sunday 13 July 2014 09:44 AM, Krishnan K wrote:
> >
> > Hi,
> >
> > I'm running a Pig script on my Windows machine. I don't have a Hadoop/Pig
> > environment installed.
> >
> > Some questions:
> > 1. Can I run PigUnit test cases in *Windows* without having any
> > *hadoop*/*pig environment setup*?
> >
> >  You can run PigUnit test cases locally. I have tried this on Linux; it works
> > and does not require Hadoop to be installed. If you have Cygwin installed,
> > you should also be able to run PigUnit test cases.
> >
> > 2. Can I run PigUnit test cases in *local* mode through Eclipse if I can
> > configure the cluster details? If yes, where can I provide my cluster
> > details?
> >
> >  There is no need for cluster configuration in local mode.
> >
> > 3. Can I run PigUnit test cases in *mapreduce* mode through Eclipse if I can
> > configure the cluster details? If yes, where can I provide my cluster
> > details?
> >
> >  Copy the *.xml files from the cluster to a folder on your local machine and
> > add that folder to the classpath. (I have not tested this.)
> >
> > 4. Can I build the Maven jar without running test cases on my Windows machine
> > and deploy it to a cluster that has Hadoop/Pig?
> >
> >  Yes. You can use the *-DskipTests* option with the Maven goal. Alternatively,
> > if you are using Eclipse to build the Maven jar, you can check the option to
> > skip tests in the Eclipse build dialog (where you specify goals).
> >
> >
> > Appreciate your help.
> >
> > I executed a PigUnit test case and it failed. Please find the log below,
> > which has the error details:
> >
> > 14/07/12 17:55:30 INFO pigunit.PigTest: Using default local mode
> > 14/07/12 17:55:30 INFO executionengine.HExecutionEngine: Connecting to
> > hadoop file system at: file:///
> > 14/07/12 17:55:30 INFO pigunit.PigTest: -- Load users from hdfs
> > users = LOAD 'src/test/resources/input/users.txt' USING PigStorage(',') AS
> > (id:long, firstName:chararray, lastName:chararray, country:chararray,
> > city:chararray, company:chararray);
> >
> > -- Load ratings from hdfs
> > awesomenessRating = LOAD 'src/test/resources/input/rating.txt' USING
> > PigStorage(',') AS (userId:long, rating:long);
> >
> > -- Join records by userId
> > joinedRecords = JOIN users BY id, awesomenessRating BY userId;
> >
> > -- Filter users with awesomenessRating > 150
> > filteredRecords = FILTER joinedRecords BY awesomenessRating::rating > 150;
> >
> > -- Generate fields that we are interested in
> > generatedRecords = FOREACH filteredRecords GENERATE
> >  users::id AS id,
> > users::firstName AS firstName,
> >  users::country AS country,
> > awesomenessRating::rating AS rating;
> >
> > -- Store results
> > STORE generatedRecords INTO 'src/test/resources/results/awesomeness' USING
> > PigStorage();
> >
> > 14/07/12 17:55:30 INFO util.Utils: Default bootup file
> > C:\Users\krkrishnamoorthy/.pigbootup not found
> > users = LOAD 'src/test/resources/input/users.txt' USING PigStorage(',') AS
> > (id:long, firstName:chararray, lastName:chararray, country:chararray,
> > city:chararray, company:chararray);
> > --> users = LOAD 'src/test/resources/input/users.txt' USING PigStorage(',') AS
> > (id:long,firstName:chararray,lastName:chararray,country:chararray,city:chararray,company:chararray);
> > awesomenessRating = LOAD 'src/test/resources/input/rating.txt' USING
> > PigStorage(',') AS (userId:long, rating:long);
> >  --> awesomenessRating = LOAD
> > 'src/test/resources/input/awesomeness-rating.txt' USING PigStorage(',') AS
> > (userId:long, rating:long);
> > STORE generatedRecords INTO 'src/test/resources/results/awesomeness' USING
> > PigStorage();
> > --> none
> > 14/07/12 17:55:31 INFO pigstats.ScriptState: Pig features used in the
> > script: HASH_JOIN
> > 14/07/12 17:55:31 INFO optimizer.LogicalPlanOptimizer:
> > {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune,
> > DuplicateForEachColumnRewrite, FilterLogicExpressionSimplifier,
> > GroupByConstParallelSetter, ImplicitSplitInserter, LimitOptimizer,
> > LoadTypeCastInserter, MergeFilter, MergeForEach,
> > NewPartitionFilterOptimizer, PartitionFilterOptimizer,
> > PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
> > 14/07/12 17:55:31 INFO mapReduceLayer.MRCompiler: File concatenation
> > threshold: 100 optimistic? false
> > 14/07/12 17:55:31 INFO
> > mapReduceLayer.MRCompiler$LastInputStreamingOptimizer: Rewrite:
> > POPackage->POForEach to POJoinPackage
> > 14/07/12 17:55:31 INFO mapReduceLayer.MultiQueryOptimizer: MR plan size
> > before optimization: 1
> > 14/07/12 17:55:31 INFO mapReduceLayer.MultiQueryOptimizer: MR plan size
> > after optimization: 1
> > 14/07/12 17:55:31 INFO pigstats.ScriptState: Pig script settings are added
> > to the job
> > 14/07/12 17:55:31 INFO mapReduceLayer.JobControlCompiler:
> > mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> > 14/07/12 17:55:31 INFO mapReduceLayer.JobControlCompiler: Setting up single
> > store job
> > 14/07/12 17:55:31 INFO data.SchemaTupleFrontend: Key [pig.schematuple] is
> > false, will not generate code.
> > 14/07/12 17:55:31 INFO data.SchemaTupleFrontend: Starting process to move
> > generated code to distributed cache
> > 14/07/12 17:55:31 INFO data.SchemaTupleFrontend: Distributed cache not
> > supported or needed in local mode. Setting key [pig.schematuple.local.dir]
> > with code temp directory:
> > C:\Users\KRKRIS~1\AppData\Local\Temp\1405212931260-0
> > 14/07/12 17:55:31 INFO mapReduceLayer.JobControlCompiler: Reduce phase
> > detected, estimating # of required reducers.
> > 14/07/12 17:55:31 INFO mapReduceLayer.JobControlCompiler: Using reducer
> > estimator:
> > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
> > 14/07/12 17:55:31 INFO mapReduceLayer.InputSizeReducerEstimator:
> > BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=-1
> > 14/07/12 17:55:31 INFO mapReduceLayer.JobControlCompiler: Could not
> > estimate number of reducers and no requested or default parallelism set.
> > Defaulting to 1 reducer.
> > 14/07/12 17:55:31 INFO mapReduceLayer.JobControlCompiler: Setting
> > Parallelism to 1
> > 14/07/12 17:55:31 INFO mapReduceLayer.MapReduceLauncher: 1 map-reduce
> > job(s) waiting for submission.
> > 14/07/12 17:55:31 WARN util.NativeCodeLoader: Unable to load native-hadoop
> > library for your platform... using builtin-java classes where applicable
> > 14/07/12 17:55:31 ERROR security.UserGroupInformation:
> > PriviledgedActionException as:krkrishnamoorthy cause:java.io.IOException:
> > Failed to set permissions of path:
> > \tmp\hadoop-krkrishnamoorthy\mapred\staging\krkrishnamoorthy502928296\.staging
> > to 0700
> > 14/07/12 17:55:31 INFO mapReduceLayer.MapReduceLauncher: 0% complete
> > 14/07/12 17:55:31 WARN mapReduceLayer.MapReduceLauncher: Ooops! Some job
> > has failed! Specify -stop_on_failure if you want Pig to stop immediately on
> > failure.
> > 14/07/12 17:55:31 INFO mapReduceLayer.MapReduceLauncher: job null has
> > failed! Stop running all dependent jobs
> > 14/07/12 17:55:31 INFO mapReduceLayer.MapReduceLauncher: 100% complete
> > 14/07/12 17:55:31 WARN mapReduceLayer.Launcher: There is no log file to
> > write to.
> > 14/07/12 17:55:31 ERROR mapReduceLayer.Launcher: Backend error message
> > during job submission
> > java.io.IOException: Failed to set permissions of path:
> > \tmp\hadoop-krkrishnamoorthy\mapred\staging\krkrishnamoorthy502928296\.staging
> > to 0700
> >  at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:691)
> >  at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:664)
> >  at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:514)
> >  at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:349)
> >  at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:193)
> >  at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:126)
> >  at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:942)
> >  at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
> >  at java.security.AccessController.doPrivileged(Native Method)
> >  at javax.security.auth.Subject.doAs(Subject.java:422)
> >  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
> >  at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
> >  at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:910)
> >  at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
> >  at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
> >  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> >  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >  at java.lang.reflect.Method.invoke(Method.java:483)
> >  at org.apache.pig.backend.hadoop20.PigJobControl.mainLoopAction(PigJobControl.java:157)
> >  at org.apache.pig.backend.hadoop20.PigJobControl.run(PigJobControl.java:134)
> >  at java.lang.Thread.run(Thread.java:744)
> >  at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:270)
> >
> > 14/07/12 17:55:31 ERROR pigstats.SimplePigStats: ERROR: Failed to set
> > permissions of path:
> > \tmp\hadoop-krkrishnamoorthy\mapred\staging\krkrishnamoorthy502928296\.staging
> > to 0700
> > 14/07/12 17:55:31 ERROR pigstats.PigStatsUtil: 1 map reduce job(s) failed!
> > 14/07/12 17:55:31 INFO pigstats.SimplePigStats: Detected Local mode. Stats
> > reported below may be incomplete
> > 14/07/12 17:55:31 INFO pigstats.SimplePigStats: Script Statistics:
> >
> > HadoopVersion PigVersion UserId StartedAt FinishedAt Features
> > 1.2.1 0.12.0 krkrishnamoorthy 2014-07-12 17:55:31 2014-07-12 17:55:31
> > HASH_JOIN
> >
> > Failed!
> >
> > Failed Jobs:
> > JobId Alias Feature Message Outputs
> > N/A awesomenessRating,joinedRecords,users HASH_JOIN Message:
> > java.io.IOException: Failed to set permissions of path:
> > \tmp\hadoop-krkrishnamoorthy\mapred\staging\krkrishnamoorthy502928296\.staging
> > to 0700
> >  at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:691)
> >  at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:664)
> >  at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:514)
> >  at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:349)
> >  at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:193)
> >  at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:126)
> >  at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:942)
> >  at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
> >  at java.security.AccessController.doPrivileged(Native Method)
> >  at javax.security.auth.Subject.doAs(Subject.java:422)
> >  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
> >  at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
> >  at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:910)
> >  at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
> >  at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
> >  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> >  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >  at java.lang.reflect.Method.invoke(Method.java:483)
> >  at org.apache.pig.backend.hadoop20.PigJobControl.mainLoopAction(PigJobControl.java:157)
> >  at org.apache.pig.backend.hadoop20.PigJobControl.run(PigJobControl.java:134)
> >  at java.lang.Thread.run(Thread.java:744)
> >  at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:270)
> >  file:/tmp/temp49116140/tmp1118481539,
> >
> > Input(s):
> > Failed to read data from "file:///C:/Users/krkrishnamoorthy/workspace/test/pig-unit-example/src/test/resources/input/awesomeness-rating.txt"
> > Failed to read data from "file:///C:/Users/krkrishnamoorthy/workspace/test/pig-unit-example/src/test/resources/input/users.txt"
> >
> > Output(s):
> > Failed to produce result in "file:/tmp/temp49116140/tmp1118481539"
> >
> > Job DAG:
> > null
> >
> > 14/07/12 17:55:32 INFO mapReduceLayer.MapReduceLauncher: Failed!
> >
> >
> > Thanks,
> > Krishnan
> >
> >
> >
> >
>

Re: Error : PigUnit in Windows->Eclipse (without Hadoop/Pig in Windows)
