[jira] [Updated] (PIG-3286) TestPigContext.testImportList fails in trunk
[ https://issues.apache.org/jira/browse/PIG-3286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheolsoo Park updated PIG-3286:
-------------------------------
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Committed to trunk. Thank you Daniel and Prashant for reviewing it.

> TestPigContext.testImportList fails in trunk
> --------------------------------------------
>
> Key: PIG-3286
> URL: https://issues.apache.org/jira/browse/PIG-3286
> Project: Pig
> Issue Type: Bug
> Components: build
> Affects Versions: 0.12
> Reporter: Cheolsoo Park
> Assignee: Cheolsoo Park
> Labels: test
> Fix For: 0.12
>
> Attachments: PIG-3286-2.patch, PIG-3286.patch
>
>
> To reproduce, run "ant clean test -Dtestcase=TestPigContext". It fails with the
> following error:
> {code}
> junit.framework.AssertionFailedError: expected:<5> but was:<6>
>     at org.apache.pig.test.TestPigContext.testImportList(TestPigContext.java:157)
> {code}
> This is a regression from PIG-3198 that added "java.lang." to the default
> import list. Here is the relevant code:
> {code}
> @@ -739,6 +739,7 @@ public class PigContext implements Serializable {
>     if (packageImportList.get() == null) {
>         ArrayList importlist = new ArrayList();
>         importlist.add("");
> +       importlist.add("java.lang.");
>         importlist.add("org.apache.pig.builtin.");
>         importlist.add("org.apache.pig.impl.builtin.");
>         packageImportList.set(importlist);
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
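For context, the failing assertion counts entries in PigContext's default package import list. The following is a minimal, self-contained sketch (not Pig's actual resolver) of how such a prefix list resolves an unqualified class name, and why PIG-3198's extra "java.lang." entry changed the size the test asserted:

```java
import java.util.ArrayList;
import java.util.List;

public class ImportListSketch {
    // Prefix list mirroring the PIG-3286 diff above; the resolver below is a
    // simplified illustration, not Pig's actual implementation.
    public static final List<String> IMPORT_LIST = new ArrayList<String>();
    static {
        IMPORT_LIST.add("");                            // fully qualified names
        IMPORT_LIST.add("java.lang.");                  // added by PIG-3198
        IMPORT_LIST.add("org.apache.pig.builtin.");
        IMPORT_LIST.add("org.apache.pig.impl.builtin.");
    }

    // Try each prefix in order until a class loads under that package.
    public static Class<?> resolve(String shortName) {
        for (String prefix : IMPORT_LIST) {
            try {
                return Class.forName(prefix + shortName);
            } catch (ClassNotFoundException e) {
                // not under this prefix; try the next one
            }
        }
        throw new IllegalArgumentException("Could not resolve " + shortName);
    }

    public static void main(String[] args) {
        // "String" now resolves through the "java.lang." prefix without
        // needing a fully qualified name.
        System.out.println(resolve("String").getName()); // java.lang.String
    }
}
```

The assertion failure (expected:<5> but was:<6>) is consistent with a hard-coded size check in the test that predates the new entry.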
Re: Want to contribute
Thanks, Daniel. I am new to this group.

Regards,
Naidu

On Thu, May 2, 2013 at 3:45 AM, Daniel Dai wrote:

> Hi, Naidu,
> Those are Hadoop questions and should be sent to u...@hadoop.apache.org. A
> quick answer to the double "hadoop namenode -format" question is: once you
> do that, you lose the metadata and will have trouble getting your HDFS
> files back.
>
> Thanks,
> Daniel
>
> On Wed, May 1, 2013 at 12:25 AM, Naidu MS wrote:
>
> > Hi, I have two questions regarding HDFS and the jps utility.
> >
> > I am new to Hadoop and started learning it this past week.
> >
> > 1. Whenever I run start-all.sh and then jps in the console, it shows the
> > processes started:
> >
> > *naidu@naidu:~/work/hadoop-1.0.4/bin$ jps*
> > *22283 NameNode*
> > *23516 TaskTracker*
> > *26711 Jps*
> > *22541 DataNode*
> > *23255 JobTracker*
> > *22813 SecondaryNameNode*
> > *Could not synchronize with target*
> >
> > But along with the list of processes started, it always shows *"Could not
> > synchronize with target"* in the jps output. What is meant by "Could not
> > synchronize with target"? Can someone explain why this is happening?
> >
> > 2. Is it possible to format the namenode multiple times? When I enter the
> > namenode -format command, it does not format the namenode and shows the
> > following output:
> >
> > *naidu@naidu:~/work/hadoop-1.0.4/bin$ hadoop namenode -format*
> > *Warning: $HADOOP_HOME is deprecated.*
> >
> > *13/05/01 12:08:04 INFO namenode.NameNode: STARTUP_MSG:*
> > *STARTUP_MSG: Starting NameNode*
> > *STARTUP_MSG:   host = naidu/127.0.0.1*
> > *STARTUP_MSG:   args = [-format]*
> > *STARTUP_MSG:   version = 1.0.4*
> > *STARTUP_MSG:   build =
> > https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r
> > 1393290; compiled by 'hortonfo' on Wed Oct 3 05:13:58 UTC 2012*
> > *Re-format filesystem in /home/naidu/dfs/namenode ? (Y or N) y*
> > *Format aborted in /home/naidu/dfs/namenode*
> > *13/05/01 12:08:05 INFO namenode.NameNode: SHUTDOWN_MSG:*
> > *SHUTDOWN_MSG: Shutting down NameNode at naidu/127.0.0.1*
> >
> > Can someone help me in understanding this? Why is it not possible to
> > format the namenode multiple times?
> >
> > On Wed, May 1, 2013 at 9:47 AM, Cheolsoo Park wrote:
> >
> > > Please see the following wiki page:
> > > https://cwiki.apache.org/confluence/display/PIG/HowToContribute
> > >
> > > Thanks,
> > > Cheolsoo
> > >
> > > On Tue, Apr 30, 2013 at 9:10 PM, Naidu MS wrote:
> > >
> > > > Hi, how do I get the source of Pig?
> > > > I am interested in reading the source code so that I can learn how
> > > > the framework is written.
> > > > I can help fix some minor bugs/JIRA issues.
> > > > Can someone tell me how to get the source code?
> > > >
> > > > Regards,
> > > > Naidu
> > > >
> > > > On Wed, May 1, 2013 at 9:30 AM, Cheolsoo Park wrote:
> > > >
> > > > > Welcome to Pig. There are hundreds of open jiras:
> > > > >
> > > > > https://issues.apache.org/jira/issues/?jql=project%20%3D%20PIG%20AND%20status%20%3D%20Open%20ORDER%20BY%20created%20DESC%2C%20priority%20DESC
> > > > >
> > > > > Please feel free to submit patches.
> > > > >
> > > > > Thanks,
> > > > > Cheolsoo
> > > > >
> > > > > On Tue, Apr 30, 2013 at 4:16 PM, Vineet Nair wrote:
> > > > >
> > > > > > Hello all,
> > > > > >
> > > > > > I was just going through the source code of Pig and I would very
> > > > > > much like to contribute to it.
> > > > > > I was just wondering if there are any small JIRA requests that I
> > > > > > can start working on.
> > > > > >
> > > > > > Thanks and regards,
> > > > > > Vineet
Physical operators refactoring
Just a heads up that I'm looking into this and that it is potentially a giant patch:
https://issues.apache.org/jira/browse/PIG-3307

Feedback appreciated.

Julien
[jira] [Commented] (PIG-3307) Refactor physical operators to remove method parameters that are always null
[ https://issues.apache.org/jira/browse/PIG-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13647174#comment-13647174 ]

Julien Le Dem commented on PIG-3307:
------------------------------------

It looks like we can get rid of the parameter that is only used for method dispatch. I will replace all calls to getNext(Tuple t) with getNextTuple() in PhysicalOperator.

> Refactor physical operators to remove method parameters that are always null
> ----------------------------------------------------------------------------
>
> Key: PIG-3307
> URL: https://issues.apache.org/jira/browse/PIG-3307
> Project: Pig
> Issue Type: Improvement
> Reporter: Julien Le Dem
> Assignee: Julien Le Dem
> Attachments: PIG-3307_0.patch, PIG-3307_1.patch
>
>
> The physical operators are sometimes overly complex. I'm trying to clean up
> some unnecessary code.
> In particular, there is an array of getNext(*T* v) methods where the value v
> does not seem to have any importance and is just used to pick the correct
> method.
> I have started a refactoring toward a more readable getNext*T*().
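To make the cleanup concrete, here is a simplified, hypothetical illustration of the pattern being removed (not Pig's actual PhysicalOperator code): the parameter carries no data and exists only so the compiler can pick an overload, which forces call sites to pass a casted null.

```java
public class DispatchSketch {
    // Before: the argument carries no data; it only selects the overload.
    public static String getNext(Integer ignored) { return "integer path"; }
    public static String getNext(Double ignored)  { return "double path"; }

    // After: the intent moves into the method name and the dummy
    // parameter disappears.
    public static String getNextInteger() { return "integer path"; }
    public static String getNextDouble()  { return "double path"; }

    public static void main(String[] args) {
        // Before the refactoring, callers must cast null just to pick a path:
        System.out.println(getNext((Integer) null)); // integer path
        // After the refactoring, the call site reads directly:
        System.out.println(getNextInteger());        // integer path
    }
}
```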
[jira] [Updated] (PIG-3307) Refactor physical operators to remove method parameters that are always null
[ https://issues.apache.org/jira/browse/PIG-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julien Le Dem updated PIG-3307:
-------------------------------
    Attachment: PIG-3307_1.patch

PIG-3307_1.patch introduces some more refactoring.

> Refactor physical operators to remove method parameters that are always null
> ----------------------------------------------------------------------------
>
> Key: PIG-3307
> URL: https://issues.apache.org/jira/browse/PIG-3307
> Project: Pig
> Issue Type: Improvement
> Reporter: Julien Le Dem
> Assignee: Julien Le Dem
> Attachments: PIG-3307_0.patch, PIG-3307_1.patch
>
>
> The physical operators are sometimes overly complex. I'm trying to clean up
> some unnecessary code.
> In particular, there is an array of getNext(*T* v) methods where the value v
> does not seem to have any importance and is just used to pick the correct
> method.
> I have started a refactoring toward a more readable getNext*T*().
Re: A major addition to Pig. Working with spatial data
Thanks for your response. I was never good at differentiating all those open source licenses. I mean, what is the point of making open source licenses if they block me from using a library in an open source project? Anyway, I'm not going to get into that debate here. Just one question: if we use JTS as a library (jar file) without adding its code to Pig, is it still a violation? We would use ivy, for example, to download the jar file when compiling.

On May 1, 2013 7:50 PM, "Alan Gates" wrote:

> Passing on the technical details for a moment, I see a licensing issue.
> JTS is licensed under the LGPL. Apache projects cannot contain or ship
> [L]GPL code. Apache does not meet the requirements of the GPL and thus we
> cannot repackage their code. If you wanted to go forward using that class,
> it would have to be packaged as an add-on that was downloaded separately
> and not from Apache. Another option is to work with the JTS community and
> see if they are willing to dual-license their code under the BSD or Apache
> license so that Pig could include it. If neither of those is an option,
> you would need to come up with a new class to contain your spatial data.
>
> Alan.
>
> On May 1, 2013, at 5:40 PM, Ahmed Eldawy wrote:
>
> > Hi all,
> > First, sorry for the long email. I wanted to put all my thoughts here
> > and get your feedback.
> > [... rest of the proposal snipped; it appears in full in Ahmed's
> > original message ...]
[jira] Subscription: PIG patch available
Issue Subscription
Filter: PIG patch available (29 issues)

Subscriber: pigdaily

Key         Summary
PIG-3297    Avro files with stringType set to String cannot be read by the AvroStorage LoadFunc
            https://issues.apache.org/jira/browse/PIG-3297
PIG-3295    Casting from bytearray failing after Union (even when each field is from a single Loader)
            https://issues.apache.org/jira/browse/PIG-3295
PIG-3291    TestExampleGenerator fails on Windows because of lack of file name escaping
            https://issues.apache.org/jira/browse/PIG-3291
PIG-3288    Kill jobs if the number of output files is over a configurable limit
            https://issues.apache.org/jira/browse/PIG-3288
PIG-3286    TestPigContext.testImportList fails in trunk
            https://issues.apache.org/jira/browse/PIG-3286
PIG-3285    Jobs using HBaseStorage fail to ship dependency jars
            https://issues.apache.org/jira/browse/PIG-3285
PIG-3281    Pig version in pig.pom is incorrect in branch-0.11
            https://issues.apache.org/jira/browse/PIG-3281
PIG-3258    Patch to allow MultiStorage to use more than one index to generate output tree
            https://issues.apache.org/jira/browse/PIG-3258
PIG-3257    Add unique identifier UDF
            https://issues.apache.org/jira/browse/PIG-3257
PIG-3247    Piggybank functions to mimic OVER clause in SQL
            https://issues.apache.org/jira/browse/PIG-3247
PIG-3223    AvroStorage does not handle comma separated input paths
            https://issues.apache.org/jira/browse/PIG-3223
PIG-3210    Pig fails to start when it cannot write log to log files
            https://issues.apache.org/jira/browse/PIG-3210
PIG-3199    Expose LogicalPlan via PigServer API
            https://issues.apache.org/jira/browse/PIG-3199
PIG-3166    Update eclipse .classpath according to ivy library.properties
            https://issues.apache.org/jira/browse/PIG-3166
PIG-3123    Simplify Logical Plans By Removing Unneccessary Identity Projections
            https://issues.apache.org/jira/browse/PIG-3123
PIG-3105    Fix TestJobSubmission unit test failure.
            https://issues.apache.org/jira/browse/PIG-3105
PIG-3097    HiveColumnarLoader doesn't correctly load partitioned Hive table
            https://issues.apache.org/jira/browse/PIG-3097
PIG-3088    Add a builtin udf which removes prefixes
            https://issues.apache.org/jira/browse/PIG-3088
PIG-3069    Native Windows Compatibility for Pig E2E Tests and Harness
            https://issues.apache.org/jira/browse/PIG-3069
PIG-3026    Pig checked-in baseline comparisons need a pre-filter to address OS-specific newline differences
            https://issues.apache.org/jira/browse/PIG-3026
PIG-3025    TestPruneColumn unit test - SimpleEchoStreamingCommand perl inline script needs simplification
            https://issues.apache.org/jira/browse/PIG-3025
PIG-3024    TestEmptyInputDir unit test - hadoop version detection logic is brittle
            https://issues.apache.org/jira/browse/PIG-3024
PIG-3015    Rewrite of AvroStorage
            https://issues.apache.org/jira/browse/PIG-3015
PIG-2959    Add a pig.cmd for Pig to run under Windows
            https://issues.apache.org/jira/browse/PIG-2959
PIG-2955    Fix bunch of Pig e2e tests on Windows
            https://issues.apache.org/jira/browse/PIG-2955
PIG-2873    Converting bin/pig shell script to python
            https://issues.apache.org/jira/browse/PIG-2873
PIG-2248    Pig parser does not detect when a macro name masks a UDF name
            https://issues.apache.org/jira/browse/PIG-2248
PIG-2244    Macros cannot be passed relation names
            https://issues.apache.org/jira/browse/PIG-2244
PIG-1914    Support load/store JSON data in Pig
            https://issues.apache.org/jira/browse/PIG-1914

You may edit this subscription at:
https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=13225&filterId=12322384
Re: A major addition to Pig. Working with spatial data
Passing on the technical details for a moment, I see a licensing issue. JTS is licensed under the LGPL. Apache projects cannot contain or ship [L]GPL code. Apache does not meet the requirements of the GPL and thus we cannot repackage their code. If you wanted to go forward using that class, it would have to be packaged as an add-on that was downloaded separately and not from Apache. Another option is to work with the JTS community and see if they are willing to dual-license their code under the BSD or Apache license so that Pig could include it. If neither of those is an option, you would need to come up with a new class to contain your spatial data.

Alan.

On May 1, 2013, at 5:40 PM, Ahmed Eldawy wrote:

> Hi all,
> First, sorry for the long email. I wanted to put all my thoughts here and
> get your feedback.
> I'm proposing a major addition to Pig that will greatly increase its
> functionality and user base. It is simply to add spatial support to the
> language and the framework.
> [... rest of the proposal snipped; it appears in full in Ahmed's original
> message ...]
[jira] [Updated] (PIG-2970) Nested foreach getting incorrect schema when having unrelated inner query
[ https://issues.apache.org/jira/browse/PIG-2970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-2970:
----------------------------
    Resolution: Fixed
  Hadoop Flags: Reviewed
        Status: Resolved  (was: Patch Available)

Patch committed to trunk.

> Nested foreach getting incorrect schema when having unrelated inner query
> --------------------------------------------------------------------------
>
> Key: PIG-2970
> URL: https://issues.apache.org/jira/browse/PIG-2970
> Project: Pig
> Issue Type: Bug
> Components: parser
> Affects Versions: 0.10.0
> Reporter: Koji Noguchi
> Assignee: Daniel Dai
> Priority: Minor
> Fix For: 0.12
>
> Attachments: PIG-2970-0.patch, PIG-2970-1.patch, PIG-2970-2.patch,
> pig-2970-trunk-v01.txt, pig-2970-trunk-v02.txt
>
>
> While looking at PIG-2968, hit a weird error message.
> {noformat}
> $ cat -n test/foreach2.pig
>      1  daily = load 'nyse' as (exchange, symbol);
>      2  grpd = group daily by exchange;
>      3  unique = foreach grpd {
>      4      sym = daily.symbol;
>      5      uniq_sym = distinct sym;
>      6      --ignoring uniq_sym result
>      7      generate group, daily;
>      8  };
>      9  describe unique;
>     10  zzz = foreach unique generate group;
>     11  explain zzz;
> % pig -x local -t ColumnMapKeyPrune test/foreach2.pig
> ...
> unique: {symbol: bytearray}
> 2012-10-12 16:55:44,226 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1025:
> Invalid field projection.
> Projected field [group] does not exist in schema: symbol:bytearray.
> ...
> {noformat}
A major addition to Pig. Working with spatial data
Hi all,

First, sorry for the long email. I wanted to put all my thoughts here and get your feedback.

I'm proposing a major addition to Pig that will greatly increase its functionality and user base: adding spatial support to the language and the framework. I've already started working on this, but I don't want it to be just another branch. I want it, eventually, to be merged into the trunk of Apache Pig. So I'm sending this email mainly to reach out to the main contributors of Pig to gauge the feasibility of this.

This addition is part of a big project we have been working on at the University of Minnesota called Spatial Hadoop (http://spatialhadoop.cs.umn.edu). It is about building a MapReduce framework (Hadoop) that is capable of maintaining and analyzing spatial data efficiently. I'm the main person behind that project, and since we released its first version we have received very encouraging responses from different groups in the research and industrial communities. I'm sure the addition we want to make to Pig Latin will be widely accepted by people in the spatial community.

I'm proposing a plan here, while we're still in the early phases of this task, to be able to discuss it with the main contributors and assess its feasibility. First of all, I think we need to change the core of Pig to support spatial data; providing a set of UDFs alone is not enough. The main reason is that Pig Latin does not provide a way to create a new data type, which is needed for spatial data. Once we have the spatial data types we need, the functionality can be expanded using more UDFs.

Here's the plan as I see it.

1- Introduce a new primitive data type, Geometry, which represents all spatial data types. In the underlying system, this will map to com.vividsolutions.jts.geom.Geometry. This is a class from the Java Topology Suite (JTS) [http://www.vividsolutions.com/jts/JTSHome.htm], a stable and efficient open source Java library for spatial data types and algorithms. It is very popular in the spatial community, and a C++ port of it is used in PostGIS [http://postgis.net/] (a spatial library for Postgres). JTS also conforms to the Open Geospatial Consortium (OGC) [http://www.opengeospatial.org/] open standard for spatial data types. The Geometry data type is read from and written to text files using the Well-Known Text (WKT) format. There is also a way to convert it to/from binary so that it can work with binary files and streams.

2- Add functions that manipulate spatial data types. These will be added as UDFs, so we will not need to touch the internals of Pig. Most probably there will be one new class per operation (e.g., union or intersection). I think it would be good to put these new operations inside the core of Pig so that users can use them without having to write the fully qualified class name. Also, since there is no way to implicitly cast a spatial data type to a non-spatial data type, there will be no conflicts between existing operations and new operations. All new operations, and only the new operations, will work on spatial data types. Here is an initial list of operations that can be added. All of them are already implemented in JTS, and the UDFs added to Pig will be just wrappers around them.

**Predicates (used for spatial filtering)
Equals
Disjoint
Intersects
Touches
Crosses
Within
Contains
Overlaps

**Operations
Envelope
Area
Length
Buffer
ConvexHull
Intersection
Union
Difference
SymDifference

**Aggregate functions
Accum
ConvexHull
Union

3- The third step is to implement spatial indexes (e.g., Grid or R-tree). A Pig loader and Pig output classes will be created for those indexes. Note that we currently have SpatialOutputFormat and SpatialInputFormat for those indexes inside the Spatial Hadoop project, but we need to tweak them to work with Pig.

4- (Advanced) Implement more sophisticated algorithms for spatial operations that utilize the indexes. For example, we can have a specific algorithm for spatial range query or spatial join. Again, we already have algorithms for different operations implemented in Spatial Hadoop as MapReduce programs, but they will need to be modified to work in the Pig environment and to interoperate with other operations.

This is my whole plan for the spatial extension to Pig. I've already started on the first step, but as I mentioned earlier, I don't want to do the work only for our project and then have it forgotten. I want to contribute to Pig and do my research at the same time. If you think the plan is plausible, I'll open JIRA issues for the above tasks and start submitting patches. I'll conform to the project's standards, such as adding tests and commenting the code well.

Sorry for the long email, and I hope to hear back from you.

Best regards,
Ahmed Eldawy
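As an illustration of the wrapper idea in step 2, here is a rough, self-contained sketch of an Envelope-style operation. It is hypothetical and does not use JTS; a real UDF would delegate to JTS (e.g., Geometry#getEnvelopeInternal()) rather than pull coordinates out of WKT by hand as done here:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EnvelopeSketch {
    // Hypothetical stand-in for a JTS-backed Envelope operation: extracts
    // "x y" coordinate pairs from a WKT-style string and returns the
    // bounding box as {minX, minY, maxX, maxY}. A real implementation would
    // parse WKT properly via JTS's WKTReader.
    public static double[] envelope(String wkt) {
        Pattern pair = Pattern.compile("(-?\\d+(?:\\.\\d+)?)\\s+(-?\\d+(?:\\.\\d+)?)");
        Matcher m = pair.matcher(wkt);
        double minX = Double.POSITIVE_INFINITY, minY = Double.POSITIVE_INFINITY;
        double maxX = Double.NEGATIVE_INFINITY, maxY = Double.NEGATIVE_INFINITY;
        while (m.find()) {
            double x = Double.parseDouble(m.group(1));
            double y = Double.parseDouble(m.group(2));
            minX = Math.min(minX, x); maxX = Math.max(maxX, x);
            minY = Math.min(minY, y); maxY = Math.max(maxY, y);
        }
        return new double[] { minX, minY, maxX, maxY };
    }

    public static void main(String[] args) {
        double[] env = envelope("POLYGON ((0 0, 4 0, 4 3, 0 3, 0 0))");
        // Bounding box of the polygon above:
        System.out.printf("%.0f,%.0f,%.0f,%.0f%n", env[0], env[1], env[2], env[3]); // 0,0,4,3
    }
}
```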
[jira] [Commented] (PIG-3286) TestPigContext.testImportList fails in trunk
[ https://issues.apache.org/jira/browse/PIG-3286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13647138#comment-13647138 ]

Daniel Dai commented on PIG-3286:
---------------------------------

+1

> TestPigContext.testImportList fails in trunk
> --------------------------------------------
>
> Key: PIG-3286
> URL: https://issues.apache.org/jira/browse/PIG-3286
> Project: Pig
> Issue Type: Bug
> Components: build
> Affects Versions: 0.12
> Reporter: Cheolsoo Park
> Assignee: Cheolsoo Park
> Labels: test
> Fix For: 0.12
>
> Attachments: PIG-3286-2.patch, PIG-3286.patch
>
>
> To reproduce, run "ant clean test -Dtestcase=TestPigContext". It fails with the
> following error:
> {code}
> junit.framework.AssertionFailedError: expected:<5> but was:<6>
>     at org.apache.pig.test.TestPigContext.testImportList(TestPigContext.java:157)
> {code}
> This is a regression from PIG-3198 that added "java.lang." to the default
> import list. Here is the relevant code:
> {code}
> @@ -739,6 +739,7 @@ public class PigContext implements Serializable {
>     if (packageImportList.get() == null) {
>         ArrayList importlist = new ArrayList();
>         importlist.add("");
> +       importlist.add("java.lang.");
>         importlist.add("org.apache.pig.builtin.");
>         importlist.add("org.apache.pig.impl.builtin.");
>         packageImportList.set(importlist);
> {code}
[jira] [Commented] (PIG-3305) Infinite loop when input path contains empty partition directory
[ https://issues.apache.org/jira/browse/PIG-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13647134#comment-13647134 ]

Daniel Dai commented on PIG-3305:
---------------------------------

[~maczech] Can you make a patch for trunk?

> Infinite loop when input path contains empty partition directory
> -----------------------------------------------------------------
>
> Key: PIG-3305
> URL: https://issues.apache.org/jira/browse/PIG-3305
> Project: Pig
> Issue Type: Bug
> Components: piggybank
> Affects Versions: 0.10.1
> Reporter: Marcin Czech
> Priority: Critical
> Fix For: 0.10.1
>
> Attachments: PIG-3305.patch
[jira] [Resolved] (PIG-3304) XMLLoader in piggybank does not work with inline closed tags
[ https://issues.apache.org/jira/browse/PIG-3304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai resolved PIG-3304.
-----------------------------
    Resolution: Fixed
 Fix Version/s: 0.12
      Assignee: Ahmed Eldawy
  Hadoop Flags: Reviewed

Piggybank tests pass. Patch committed to trunk. Thanks Ahmed!

> XMLLoader in piggybank does not work with inline closed tags
> -------------------------------------------------------------
>
> Key: PIG-3304
> URL: https://issues.apache.org/jira/browse/PIG-3304
> Project: Pig
> Issue Type: Bug
> Components: piggybank
> Affects Versions: 0.11.1
> Reporter: Ahmed Eldawy
> Assignee: Ahmed Eldawy
> Labels: patch
> Fix For: 0.12
>
> Attachments: xmlloader_inline_close_tag_1.patch,
> xmlloader_inline_close_tag.patch
>
>
> The XMLLoader fails to return elements when tags are closed inline (i.e.,
> self-closing tags).
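To show what "inline closed tags" means in practice, here is a toy, self-contained matcher (not the piggybank XMLLoader's implementation) that accepts both the paired and the self-closing form of an element — the second form is the one the loader failed on:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class InlineTagSketch {
    // Counts occurrences of an XML element, accepting both forms:
    //   paired:        <tag ...>...</tag>
    //   inline closed: <tag .../>
    // This regex is purely illustrative; a real loader needs a proper parser.
    public static int count(String xml, String tag) {
        Pattern p = Pattern.compile(
            "<" + tag + "\\b[^>]*/>" +                     // self-closing form
            "|<" + tag + "\\b[^>]*>.*?</" + tag + ">",     // paired form
            Pattern.DOTALL);
        Matcher m = p.matcher(xml);
        int n = 0;
        while (m.find()) n++;
        return n;
    }

    public static void main(String[] args) {
        // One inline-closed <item/> and one paired <item>...</item>:
        System.out.println(count("<doc><item a=\"1\"/><item>x</item></doc>", "item")); // 2
    }
}
```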
[jira] [Commented] (PIG-3097) HiveColumnarLoader doesn't correctly load partitioned Hive table
[ https://issues.apache.org/jira/browse/PIG-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13647027#comment-13647027 ]

Daniel Dai commented on PIG-3097:
---------------------------------

The patch applies to trunk and all piggybank tests pass. Is it OK to commit to trunk?

> HiveColumnarLoader doesn't correctly load partitioned Hive table
> -----------------------------------------------------------------
>
> Key: PIG-3097
> URL: https://issues.apache.org/jira/browse/PIG-3097
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.10.1
> Reporter: Richard Ding
> Assignee: Richard Ding
> Labels: patch
> Attachments: PIG-3097.patch
>
>
> Given a partitioned Hive table:
> {code}
> hive> describe mytable;
> OK
> f1            string
> f2            string
> f3            string
> partition_dt  string
> {code}
> The following Pig script gives the correct schema:
> {code}
> grunt> A = load '/hive/warehouse/mytable' using
> org.apache.pig.piggybank.storage.HiveColumnarLoader('f1 string,f2 string,f3 string');
> grunt> describe A
> A: {f1: chararray,f2: chararray,f3: chararray,partition_dt: chararray}
> {code}
> But the command
> {code}
> grunt> dump A
> {code}
> only produces the first column of all records in the table (all four columns
> are expected).
[jira] [Commented] (PIG-2586) A better plan/data flow visualizer
[ https://issues.apache.org/jira/browse/PIG-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13647006#comment-13647006 ] Daniel Dai commented on PIG-2586: - Thanks, I will take a look. > A better plan/data flow visualizer > -- > > Key: PIG-2586 > URL: https://issues.apache.org/jira/browse/PIG-2586 > Project: Pig > Issue Type: Improvement > Components: impl >Reporter: Daniel Dai > Labels: gsoc2013 > > Pig supports a dot graph style plan to visualize the > logical/physical/mapreduce plan (explain with -dot option, see > http://ofps.oreilly.com/titles/9781449302641/developing_and_testing.html). > However, dot graph takes extra step to generate the plan graph and the > quality of the output is not good. It's better we can implement a better > visualizer for Pig. It should: > 1. show operator type and alias > 2. turn on/off output schema > 3. dive into foreach inner plan on demand > 4. provide a way to show operator source code, eg, tooltip of an operator > (plan don't currently have this information, but you can assume this is in > place) > 5. besides visualize logical/physical/mapreduce plan, visualize the script > itself is also useful > 6. may rely on some java graphic library such as Swing > This is a candidate project for Google summer of code 2013. More information > about the program can be found at > https://cwiki.apache.org/confluence/display/PIG/GSoc2013 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Want to contribute
Hi, Naidu, Those are Hadoop questions and you should send them to u...@hadoop.apache.org. A quick answer to the double "hadoop namenode -format" question is: once you do that, you lose the metadata, and you would have trouble getting HDFS files back. Thanks, Daniel On Wed, May 1, 2013 at 12:25 AM, Naidu MS wrote: > Hi i have two questions regarding hdfs and jps utility > > I am new to Hadoop and started learning hadoop in the past week > > 1. whenever i start start-all.sh and jps in console it shows the > processes started > > *naidu@naidu:~/work/hadoop-1.0.4/bin$ jps* > *22283 NameNode* > *23516 TaskTracker* > *26711 Jps* > *22541 DataNode* > *23255 JobTracker* > *22813 SecondaryNameNode* > *Could not synchronize with target* > > But along with the list of processes started it always shows *" Could not > synchronize with target" *in the jps output. What is meant by "Could not > synchronize with target"? Can some one explain why this is happening? > > > 2. Is it possible to format the namenode multiple times? When i enter the > namenode -format command, it does not format the name node and shows the > following output. > > *naidu@naidu:~/work/hadoop-1.0.4/bin$ hadoop namenode -format* > *Warning: $HADOOP_HOME is deprecated.* > * > * > *13/05/01 12:08:04 INFO namenode.NameNode: STARTUP_MSG: * > */* > *STARTUP_MSG: Starting NameNode* > *STARTUP_MSG: host = naidu/127.0.0.1* > *STARTUP_MSG: args = [-format]* > *STARTUP_MSG: version = 1.0.4* > *STARTUP_MSG: build = > https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r > 1393290; compiled by 'hortonfo' on Wed Oct 3 05:13:58 UTC 2012* > */* > *Re-format filesystem in /home/naidu/dfs/namenode ? (Y or N) y* > *Format aborted in /home/naidu/dfs/namenode* > *13/05/01 12:08:05 INFO namenode.NameNode: SHUTDOWN_MSG: * > */* > *SHUTDOWN_MSG: Shutting down NameNode at naidu/127.0.0.1* > * > * > */* > > Can someone help me in understanding this? Why is it not possible to format > the name node multiple times? 
> > > On Wed, May 1, 2013 at 9:47 AM, Cheolsoo Park > wrote: > > > Please see the following wiki page: > > https://cwiki.apache.org/confluence/display/PIG/HowToContribute > > > > Thanks, > > Cheolsoo > > > > > > On Tue, Apr 30, 2013 at 9:10 PM, Naidu MS > > wrote: > > > > > Hi How to get the source of pig? > > > I am interested in going to source code so that i can learn how the > > > framework is written. > > > I can help in fixing some minor bugs/jira issues. > > > Can some one help me how to get the source code ? > > > > > > > > > > > > Regards, > > > Naidu > > > > > > > > > On Wed, May 1, 2013 at 9:30 AM, Cheolsoo Park > > > wrote: > > > > > > > Welcome to Pig. There are hundreds of open jiras: > > > > > > > > > > > > > > > > > > https://issues.apache.org/jira/issues/?jql=project%20%3D%20PIG%20AND%20status%20%3D%20Open%20ORDER%20BY%20created%20DESC%2C%20priority%20DESC > > > > > > > > Please feel free to submit patches. > > > > > > > > Thanks, > > > > Cheolsoo > > > > > > > > > > > > > > > > On Tue, Apr 30, 2013 at 4:16 PM, Vineet Nair > > > wrote: > > > > > > > > > Hello all , > > > > > > > > > > I was just going through the source code of Pig and I would very > much > > > > like > > > > > to contribute to it. > > > > > I was just wondering if there are any small Jira requests that i > can > > > > start > > > > > working on. > > > > > > > > > > Thanks and regards, > > > > > Vineet > > > > > > > > > > > > > > >
[jira] [Commented] (PIG-3285) Jobs using HBaseStorage fail to ship dependency jars
[ https://issues.apache.org/jira/browse/PIG-3285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646998#comment-13646998 ] Daniel Dai commented on PIG-3285: - Agree with Rohini, that should be a simple fix to add protobuf.jar. We don't need to double ship jars and make things complicated. > Jobs using HBaseStorage fail to ship dependency jars > > > Key: PIG-3285 > URL: https://issues.apache.org/jira/browse/PIG-3285 > Project: Pig > Issue Type: Bug >Reporter: Nick Dimiduk >Assignee: Nick Dimiduk > Fix For: 0.11.1 > > Attachments: 0001-PIG-3285-Add-HBase-dependency-jars.patch, > 0001-PIG-3285-Add-HBase-dependency-jars.patch, 1.pig, 1.txt, 2.pig > > > Launching a job consuming {{HBaseStorage}} fails out of the box. The user > must specify {{-Dpig.additional.jars}} for HBase and all of its dependencies. > Exceptions look something like this: > {noformat} > 2013-04-19 18:58:39,360 FATAL org.apache.hadoop.mapred.Child: Error running > child : java.lang.NoClassDefFoundError: com/google/protobuf/Message > at > org.apache.hadoop.hbase.io.HbaseObjectWritable.(HbaseObjectWritable.java:266) > at org.apache.hadoop.hbase.ipc.Invocation.write(Invocation.java:139) > at > org.apache.hadoop.hbase.ipc.HBaseClient$Connection.sendParam(HBaseClient.java:612) > at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:975) > at > org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:84) > at $Proxy7.getProtocolVersion(Unknown Source) > at > org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:136) > at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208) > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-3307) Refactor physical operators to remove methods parameters that are always null
[ https://issues.apache.org/jira/browse/PIG-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Le Dem updated PIG-3307: --- Description: The physical operators are sometimes overly complex. I'm trying to cleanup some unnecessary code. in particular there is an array of getNext(*T* v) where the value v does not seem to have any importance and is just used to pick the correct method. I have started a refactoring for a more readable getNext*T*(). was: The physical operators are sometimes overly complex. I'm trying to cleanup some unnecessary code. in particular there is an array of getNext(*T* v) where the value v does not seem to have any importance and is just use to pick the correct method. I have started a refactoring for a more readable getNext*T*(). > Refactor physical operators to remove methods parameters that are always null > - > > Key: PIG-3307 > URL: https://issues.apache.org/jira/browse/PIG-3307 > Project: Pig > Issue Type: Improvement >Reporter: Julien Le Dem >Assignee: Julien Le Dem > Attachments: PIG-3307_0.patch > > > The physical operators are sometimes overly complex. I'm trying to cleanup > some unnecessary code. > in particular there is an array of getNext(*T* v) where the value v does not > seem to have any importance and is just used to pick the correct method. > I have started a refactoring for a more readable getNext*T*(). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
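The refactoring described in PIG-3307 can be illustrated with a minimal, hypothetical sketch (these are not the actual Pig physical-operator classes): the original getNext(T v) overloads ignore the value of v and use only its static type to pick an overload, so callers must manufacture a dummy argument; the refactoring moves the type into the method name instead.

```java
// Hypothetical sketch of the PIG-3307 refactoring, not real Pig code.
public class GetNextSketch {
    // Before: the argument's value is ignored; only its static type
    // selects the overload, so callers pass a throwaway value.
    static String getNext(String ignored) { return "chararray"; }
    static Integer getNext(Integer ignored) { return 0; }

    // After: one method per type, no dummy parameter to pass around.
    static String getNextString() { return "chararray"; }
    static Integer getNextInteger() { return 0; }

    public static void main(String[] args) {
        // Old call site: a meaningless null just to pick the overload.
        String before = getNext((String) null);
        // New call site: the type is explicit in the method name.
        String after = getNextString();
        System.out.println(before.equals(after));
    }
}
```

Besides readability, the named form removes a null that must thread through every caller, which is the "methods parameters that are always null" this issue targets.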
[jira] [Updated] (PIG-3307) Refactor physical operators to remove methods parameters that are always null
[ https://issues.apache.org/jira/browse/PIG-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Le Dem updated PIG-3307: --- Attachment: PIG-3307_0.patch PIG-3307_0.patch contains the initial refactoring > Refactor physical operators to remove methods parameters that are always null > - > > Key: PIG-3307 > URL: https://issues.apache.org/jira/browse/PIG-3307 > Project: Pig > Issue Type: Improvement >Reporter: Julien Le Dem >Assignee: Julien Le Dem > Attachments: PIG-3307_0.patch > > > The physical operators are sometimes overly complex. I'm trying to cleanup > some unnecessary code. > in particular there is an array of getNext(*T* v) where the value v does not > seem to have any importance and is just use to pick the correct method. > I have started a refactoring for a more readable getNext*T*(). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (PIG-3307) Refactor physical operators to remove methods parameters that are always null
Julien Le Dem created PIG-3307: -- Summary: Refactor physical operators to remove methods parameters that are always null Key: PIG-3307 URL: https://issues.apache.org/jira/browse/PIG-3307 Project: Pig Issue Type: Improvement Reporter: Julien Le Dem Assignee: Julien Le Dem Attachments: PIG-3307_0.patch The physical operators are sometimes overly complex. I'm trying to cleanup some unnecessary code. in particular there is an array of getNext(*T* v) where the value v does not seem to have any importance and is just use to pick the correct method. I have started a refactoring for a more readable getNext*T*(). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (PIG-3306) Publish h2 artifact to maven
[ https://issues.apache.org/jira/browse/PIG-3306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham resolved PIG-3306. -- Resolution: Not A Problem Yup [~rohini] you're right we already do that. I should have known, I've published the last two releases. :) > Publish h2 artifact to maven > > > Key: PIG-3306 > URL: https://issues.apache.org/jira/browse/PIG-3306 > Project: Pig > Issue Type: Bug >Reporter: Bill Graham > > The Pig artifact built with hadoopversion=23 should be published to maven. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (PIG-3303) add hadoop h2 artifact to publications in ivy.xml
[ https://issues.apache.org/jira/browse/PIG-3303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Le Dem resolved PIG-3303. Resolution: Fixed Fix Version/s: 0.12 Merged in trunk > add hadoop h2 artifact to publications in ivy.xml > - > > Key: PIG-3303 > URL: https://issues.apache.org/jira/browse/PIG-3303 > Project: Pig > Issue Type: Bug >Reporter: Julien Le Dem >Assignee: Julien Le Dem > Fix For: 0.12 > > Attachments: PIG-3303.patch > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-3306) Publish h2 artifact to maven
[ https://issues.apache.org/jira/browse/PIG-3306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646796#comment-13646796 ] Rohini Palaniswamy commented on PIG-3306: - Is this jira for ivy-publish-local ? > Publish h2 artifact to maven > > > Key: PIG-3306 > URL: https://issues.apache.org/jira/browse/PIG-3306 > Project: Pig > Issue Type: Bug >Reporter: Bill Graham > > The Pig artifact built with hadoopversion=23 should be published to maven. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-3306) Publish h2 artifact to maven
[ https://issues.apache.org/jira/browse/PIG-3306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646784#comment-13646784 ] Rohini Palaniswamy commented on PIG-3306: - Isn't it already published? http://repo1.maven.org/maven2/org/apache/pig/pig/0.10.1/ and http://repo1.maven.org/maven2/org/apache/pig/pig/0.11.1/ have those jars. > Publish h2 artifact to maven > > > Key: PIG-3306 > URL: https://issues.apache.org/jira/browse/PIG-3306 > Project: Pig > Issue Type: Bug >Reporter: Bill Graham > > The Pig artifact built with hadoopversion=23 should be published to maven. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-3303) add hadoop h2 artifact to publications in ivy.xml
[ https://issues.apache.org/jira/browse/PIG-3303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646720#comment-13646720 ] Bill Graham commented on PIG-3303: -- +1 Created PIG-3306 for publishing the h2 artifact to maven. > add hadoop h2 artifact to publications in ivy.xml > - > > Key: PIG-3303 > URL: https://issues.apache.org/jira/browse/PIG-3303 > Project: Pig > Issue Type: Bug >Reporter: Julien Le Dem >Assignee: Julien Le Dem > Attachments: PIG-3303.patch > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-3303) add hadoop h2 artifact to publications in ivy.xml
[ https://issues.apache.org/jira/browse/PIG-3303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham updated PIG-3303: - Assignee: Julien Le Dem > add hadoop h2 artifact to publications in ivy.xml > - > > Key: PIG-3303 > URL: https://issues.apache.org/jira/browse/PIG-3303 > Project: Pig > Issue Type: Bug >Reporter: Julien Le Dem >Assignee: Julien Le Dem > Attachments: PIG-3303.patch > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (PIG-3306) Publish h2 artifact to maven
Bill Graham created PIG-3306: Summary: Publish h2 artifact to maven Key: PIG-3306 URL: https://issues.apache.org/jira/browse/PIG-3306 Project: Pig Issue Type: Bug Reporter: Bill Graham The Pig artifact built with hadoopversion=23 should be published to maven. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-3285) Jobs using HBaseStorage fail to ship dependency jars
[ https://issues.apache.org/jira/browse/PIG-3285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646715#comment-13646715 ] Rohini Palaniswamy commented on PIG-3285: - Actually I got a little confused with the patch as TableMapReduce.addDependencyJars(job) was adding all those and focused just on Daniel's comment. If your intention is to only add protobuf jar, you can do a Class.forName(some protobuf class name) and if that does not throw a CNFE (meaning older hbase versions) you can pass that class also to the TableMapreduceUtil.addDependencyJars(Configuration conf, Class... classes) > Jobs using HBaseStorage fail to ship dependency jars > > > Key: PIG-3285 > URL: https://issues.apache.org/jira/browse/PIG-3285 > Project: Pig > Issue Type: Bug >Reporter: Nick Dimiduk >Assignee: Nick Dimiduk > Fix For: 0.11.1 > > Attachments: 0001-PIG-3285-Add-HBase-dependency-jars.patch, > 0001-PIG-3285-Add-HBase-dependency-jars.patch, 1.pig, 1.txt, 2.pig > > > Launching a job consuming {{HBaseStorage}} fails out of the box. The user > must specify {{-Dpig.additional.jars}} for HBase and all of its dependencies. 
> Exceptions look something like this: > {noformat} > 2013-04-19 18:58:39,360 FATAL org.apache.hadoop.mapred.Child: Error running > child : java.lang.NoClassDefFoundError: com/google/protobuf/Message > at > org.apache.hadoop.hbase.io.HbaseObjectWritable.(HbaseObjectWritable.java:266) > at org.apache.hadoop.hbase.ipc.Invocation.write(Invocation.java:139) > at > org.apache.hadoop.hbase.ipc.HBaseClient$Connection.sendParam(HBaseClient.java:612) > at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:975) > at > org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:84) > at $Proxy7.getProtocolVersion(Unknown Source) > at > org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:136) > at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208) > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
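The reflection guard Rohini describes can be sketched self-containedly (the helper name is made up, and the call into HBase's TableMapReduceUtil.addDependencyJars is only noted in a comment, since the real fix lives in HBaseStorage):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the probe suggested above (helper name is hypothetical):
// look up com.google.protobuf.Message reflectively so the jar can be
// shipped when protobuf is present, while older HBase versions that
// lack the protobuf dependency still work without it.
public class ProtobufProbe {
    static List<Class<?>> extraClassesToShip() {
        List<Class<?>> extras = new ArrayList<Class<?>>();
        try {
            // Succeeds only when protobuf is on the classpath.
            extras.add(Class.forName("com.google.protobuf.Message"));
        } catch (ClassNotFoundException e) {
            // Older HBase: no protobuf on the classpath, nothing extra
            // to ship, and no hard compile-time dependency introduced.
        }
        // In the real fix, these classes would be passed along to
        // TableMapReduceUtil.addDependencyJars(conf, classes...).
        return extras;
    }

    public static void main(String[] args) {
        System.out.println("extra classes: " + extraClassesToShip().size());
    }
}
```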
[jira] [Commented] (PIG-3303) add hadoop h2 artifact to publications in ivy.xml
[ https://issues.apache.org/jira/browse/PIG-3303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646710#comment-13646710 ] Rohini Palaniswamy commented on PIG-3303: - +1 > add hadoop h2 artifact to publications in ivy.xml > - > > Key: PIG-3303 > URL: https://issues.apache.org/jira/browse/PIG-3303 > Project: Pig > Issue Type: Bug >Reporter: Julien Le Dem > Attachments: PIG-3303.patch > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-3305) Infinite loop when input path contains empty partition directory
[ https://issues.apache.org/jira/browse/PIG-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcin Czech updated PIG-3305: -- Attachment: PIG-3305.patch > Infinite loop when input path contains empty partition directory > - > > Key: PIG-3305 > URL: https://issues.apache.org/jira/browse/PIG-3305 > Project: Pig > Issue Type: Bug > Components: piggybank >Affects Versions: 0.10.1 >Reporter: Marcin Czech >Priority: Critical > Fix For: 0.10.1 > > Attachments: PIG-3305.patch > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (PIG-3305) Infinite loop when input path contains empty partition directory
Marcin Czech created PIG-3305: - Summary: Infinite loop when input path contains empty partition directory Key: PIG-3305 URL: https://issues.apache.org/jira/browse/PIG-3305 Project: Pig Issue Type: Bug Components: piggybank Affects Versions: 0.10.1 Reporter: Marcin Czech Priority: Critical Fix For: 0.10.1 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-3097) HiveColumnarLoader doesn't correctly load partitioned Hive table
[ https://issues.apache.org/jira/browse/PIG-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcin Czech updated PIG-3097: -- Attachment: PIG-3097.patch > HiveColumnarLoader doesn't correctly load partitioned Hive table > - > > Key: PIG-3097 > URL: https://issues.apache.org/jira/browse/PIG-3097 > Project: Pig > Issue Type: Bug >Affects Versions: 0.10.1 >Reporter: Richard Ding >Assignee: Richard Ding > Labels: patch > Attachments: PIG-3097.patch > > > Given a partitioned Hive table: > {code} > hive> describe mytable; > OK > f1 string > f2 string > f3 string > partition_dt string > {code} > The following Pig script gives the correct schema: > {code} > grunt> A = load '/hive/warehouse/mytable' using > org.apache.pig.piggybank.storage.HiveColumnarLoader('f1 string,f2 string,f3 > string'); > grunt> describe A > A: {f1: chararray,f2: chararray,f3: chararray,partition_dt: chararray} > {code} > But, the command > {code} > grunt> dump A > {code} > only produces the first column of all records in the table (all four columns > are expected). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-3097) HiveColumnarLoader doesn't correctly load partitioned Hive table
[ https://issues.apache.org/jira/browse/PIG-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcin Czech updated PIG-3097: -- Labels: patch (was: ) Affects Version/s: 0.10.1 Status: Patch Available (was: Open) We are using version 0.10.1, so this fix is for that version. The fix is extremely simple, so it should be easy to put into trunk. > HiveColumnarLoader doesn't correctly load partitioned Hive table > - > > Key: PIG-3097 > URL: https://issues.apache.org/jira/browse/PIG-3097 > Project: Pig > Issue Type: Bug >Affects Versions: 0.10.1 >Reporter: Richard Ding >Assignee: Richard Ding > Labels: patch > > Given a partitioned Hive table: > {code} > hive> describe mytable; > OK > f1 string > f2 string > f3 string > partition_dt string > {code} > The following Pig script gives the correct schema: > {code} > grunt> A = load '/hive/warehouse/mytable' using > org.apache.pig.piggybank.storage.HiveColumnarLoader('f1 string,f2 string,f3 > string'); > grunt> describe A > A: {f1: chararray,f2: chararray,f3: chararray,partition_dt: chararray} > {code} > But, the command > {code} > grunt> dump A > {code} > only produces the first column of all records in the table (all four columns > are expected). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Want to contribute
Hi i have two questions regarding hdfs and jps utility I am new to Hadoop and started learning hadoop in the past week 1. whenever i start start-all.sh and jps in console it shows the processes started *naidu@naidu:~/work/hadoop-1.0.4/bin$ jps* *22283 NameNode* *23516 TaskTracker* *26711 Jps* *22541 DataNode* *23255 JobTracker* *22813 SecondaryNameNode* *Could not synchronize with target* But along with the list of processes started it always shows *" Could not synchronize with target" *in the jps output. What is meant by "Could not synchronize with target"? Can some one explain why this is happening? 2. Is it possible to format the namenode multiple times? When i enter the namenode -format command, it does not format the name node and shows the following output. *naidu@naidu:~/work/hadoop-1.0.4/bin$ hadoop namenode -format* *Warning: $HADOOP_HOME is deprecated.* * * *13/05/01 12:08:04 INFO namenode.NameNode: STARTUP_MSG: * */* *STARTUP_MSG: Starting NameNode* *STARTUP_MSG: host = naidu/127.0.0.1* *STARTUP_MSG: args = [-format]* *STARTUP_MSG: version = 1.0.4* *STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1393290; compiled by 'hortonfo' on Wed Oct 3 05:13:58 UTC 2012* */* *Re-format filesystem in /home/naidu/dfs/namenode ? (Y or N) y* *Format aborted in /home/naidu/dfs/namenode* *13/05/01 12:08:05 INFO namenode.NameNode: SHUTDOWN_MSG: * */* *SHUTDOWN_MSG: Shutting down NameNode at naidu/127.0.0.1* * * */* Can someone help me in understanding this? Why is it not possible to format the name node multiple times? On Wed, May 1, 2013 at 9:47 AM, Cheolsoo Park wrote: > Please see the following wiki page: > https://cwiki.apache.org/confluence/display/PIG/HowToContribute > > Thanks, > Cheolsoo > > > On Tue, Apr 30, 2013 at 9:10 PM, Naidu MS > wrote: > > > Hi How to get the source of pig? > > I am interested in going to source code so that i can learn how the > > framework is written. 
> > I can help in fixing some minor bugs/jira issues. > > Can some one help me how to get the source code ? > > > > > > > > Regards, > > Naidu > > > > > > On Wed, May 1, 2013 at 9:30 AM, Cheolsoo Park > > wrote: > > > > > Welcome to Pig. There are hundreds of open jiras: > > > > > > > > > > > > https://issues.apache.org/jira/issues/?jql=project%20%3D%20PIG%20AND%20status%20%3D%20Open%20ORDER%20BY%20created%20DESC%2C%20priority%20DESC > > > > > > Please feel free to submit patches. > > > > > > Thanks, > > > Cheolsoo > > > > > > > > > > > > On Tue, Apr 30, 2013 at 4:16 PM, Vineet Nair > > wrote: > > > > > > > Hello all , > > > > > > > > I was just going through the source code of Pig and I would very much > > > like > > > > to contribute to it. > > > > I was just wondering if there are any small Jira requests that i can > > > start > > > > working on. > > > > > > > > Thanks and regards, > > > > Vineet > > > > > > > > > >
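One note on the "Format aborted" transcript earlier in this thread, offered as an assumption rather than something stated by the participants: in Hadoop 1.x the namenode re-format prompt compares the reply case-sensitively, so answering a lowercase "y" to "Re-format filesystem ? (Y or N)" is treated as "no". The confirmation check behaves roughly like this sketch (the function name is made up for illustration):

```shell
# Assumption, not from the thread: Hadoop 1.x accepts only an
# uppercase "Y" at the re-format prompt; anything else aborts.
confirm_format() {
  # $1 is the user's reply to "Re-format filesystem ? (Y or N)"
  if [ "$1" = "Y" ]; then
    echo "Re-formatting..."
  else
    echo "Format aborted"   # any other reply, including "y", aborts
  fi
}

confirm_format "y"   # mirrors the transcript: lowercase y aborts
```

If that assumption holds, re-running `hadoop namenode -format` and answering with a capital Y would re-format the namenode, with the metadata loss Daniel warns about above.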