Re: GSoc 2013 proposal for Map Reduce support
Hi Udesh, Awesome, looking forward to working with you in the Gora community! Cheers, Chris ++ Chris Mattmann, Ph.D. Senior Computer Scientist NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 171-266B, Mailstop: 171-246 Email: chris.a.mattm...@nasa.gov WWW: http://sunset.usc.edu/~mattmann/ ++ Adjunct Assistant Professor, Computer Science Department University of Southern California, Los Angeles, CA 90089 USA ++ -Original Message- From: Udesh Liyanaarachchi la.udesh1...@gmail.com Reply-To: dev@gora.apache.org dev@gora.apache.org Date: Sunday, March 31, 2013 7:27 AM To: dev@gora.apache.org dev@gora.apache.org Subject: GSoc 2013 proposal for Map Reduce support Hi Folks, I'm Udesh Liyanaarachchi and I'm looking forward to participating in GSoC 2013 under an ASF project. I came across Gora and am really interested in working on a proposal related to these project goals of Gora. - *Analysis :* Accessing the data and making analysis through adapters for Apache Pig, Apache Hive and Cascading - *MapReduce support :* Out-of-the-box and extensive MapReduce (Apache Hadoop) support for data in the data store. I went through the issues of Gora and found the new feature request [1], which has not been assigned yet. I would like to work on this to fulfill the project goals mentioned above (adoption of Pig and Hadoop). It would be great to know more details about this idea, and suggestions of issues related to it would be appreciated. Any help will be appreciated. Link [1] https://issues.apache.org/jira/browse/GORA-112 -- *Udesh Liyanaarachchi* B.Sc. Eng (Undergraduate) Department of Computer Science Engineering University Of Moratuwa Sri Lanka.
Re: [Tajo Wiki] Update of Roadmap by HyunsikChoi
Thanks Hyunsik! I wonder if we can make a connection between Tajo and Gora on this project. Maybe we can generate a Gora-based front-end to Tajo? I'm CC'ing the Gora folks here for thoughts. Great roadmap! Cheers, Chris

On 3/26/13 2:28 AM, Apache Wiki wikidi...@apache.org wrote:

Dear Wiki user, You have subscribed to a wiki page or wiki category on Tajo Wiki for change notification. The Roadmap page has been changed by HyunsikChoi: http://wiki.apache.org/tajo/Roadmap

Comment: Moved the roadmap from github wiki.

New page:

= Roadmap =

== Milestone ==
 * 0.2 - first release as an incubating project focused on ASF compliance
 * 0.3 - more stable API and robust features and a rudimentary cost-based optimizer
 * 0.4 - more SQL support and a more improved cost-based optimizer
 * 0.5 - a native columnar execution engine

== Long Term Plan ==
 * Integration with Hadoop ecosystem
   * Tajo catalog needs to support HCatalog or needs to be compatible with Hive meta.
 * The native columnar execution engine
 * Cost-based optimization which also includes a rewrite rule engine and various rewrite rules

== Short/Mid Term Plan ==
 * Improvement of the DAG framework
   * Query is both an FSM and a DAG representation.
   * It would be good to separate Query into an FSM part and a DAG part.
   * We need an easier interface to edit and build DAGs.
 * RCFile
   * In the current implementation, RCFile is not compatible with Hive's one because Tajo's RCFile uses Datum to (de)serialize data. So, we will have an additional RCFile wrapper class compatible with Hive's files.
 * ORCFile
   * It looks promising. We need to port ORCFile.
 * Trevni
   * TrevniScanner works well in most cases. However, it doesn't support null values. We need to handle them.
 * hadoop security in tajo-rpc
   * tajo-rpc does not support hadoop security. Since Tajo will be a part of the Hadoop ecosystem, we need to apply hadoop security to tajo-rpc.
 * Intermediate Data Format
   * As I mentioned above, Tajo uses CSV as the intermediate data format. It may cause CPU overhead and is relatively large to be transmitted via networks. We need to change it.
 * JDBC/ODBC drivers
   * Tajo is a relational DW system. If we have such connectors, it can be easily integrated with existing BI and OLAP tools.
 * Restful API
   * It's very useful for web-based applications.
 * Proper resource allocation for SubQuery (i.e., Execution Block in PPT)
   * SubQuery is one step of multiple query steps. For each subquery, QueryMaster launches TaskRunners via Yarn, and the launched TaskRunners are reused within a subquery.
   * Now, QueryMaster assigns a fixed-size resource (2G memory) to subqueries regardless of the resources they need. We need to improve it to allocate proper resources to subqueries. For example, QueryMaster assigns 1G to one subquery that only scans, or assigns 2G to another subquery that includes joins.
 * Error handling of TajoCli
   * TajoCli is a command line interface that uses Jline2. However, its error handling is awful. It frequently halts when trivial exceptions occur.
 * SQL data types
   * Currently, Tajo provides data types (i.e., byte, bool, int, long, float, double, bytes, and string) based on Java primitive types. Tajo should support SQL standard data types.
 * Local mode
   * Queries are always executed in a distributed mode. In other words, it always uses Yarn. However, that is inconvenient for debugging and inefficient on a single machine. We need to implement something for local mode.
 * Parallel launch of containers
   * Currently, node containers are executed sequentially (see TaskRunnerLauncherImpl.java). It looks very inefficient. We can improve it by using ExecutorService.
 * Output commit
   * In some cases, Tajo is fault tolerant. This requires an output commit mechanism. However, Tajo does not support it, and we need this feature.
 * Broadcast join and Limit operator
   * As I mentioned before, they are disabled after the Yarn port. We should enable them.
 * HbaseScanner/Appender
   * HBase will be a great storage for Tajo.
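
The "Parallel launch of containers" item in the roadmap above suggests replacing a sequential launch loop with an ExecutorService. A minimal sketch of that idea follows. It does not reproduce TaskRunnerLauncherImpl.java, whose contents are not shown in this thread: the class name ParallelLaunchSketch, the ContainerStarter interface and the node names are all hypothetical stand-ins, and only the java.util.concurrent calls are real library API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch; none of these names are Tajo or Yarn APIs. */
final class ParallelLaunchSketch {

    /** Stand-in for whatever starts one container; returns an id for it. */
    interface ContainerStarter {
        String start(String node) throws Exception;
    }

    /** Starts one container per node on a fixed-size thread pool. */
    static List<String> launchAll(List<String> nodes, ContainerStarter starter, int threads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String node : nodes) {
                tasks.add(() -> starter.start(node));
            }
            // invokeAll blocks until every task has completed and
            // returns the futures in the same order as the tasks.
            List<String> started = new ArrayList<>();
            for (Future<String> result : pool.invokeAll(tasks)) {
                // get() rethrows a failed launch as an ExecutionException.
                started.add(result.get());
            }
            return started;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> nodes = Arrays.asList("node-1", "node-2", "node-3");
        System.out.println(launchAll(nodes, node -> "container@" + node, 2));
    }
}
```

A bounded pool is used here on the assumption that an unbounded burst of launch requests would be undesirable; whether that holds for a real cluster, and what pool size is sensible, is not something the roadmap states.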
Re: [DRAFT] Gora Report
Haha with the extra T! :) On 2/9/13 9:51 AM, Ioannis Canellos ioca...@gmail.com wrote: +1 LGTMT :-) -- *Ioannis Canellos* * ** Blog: http://iocanel.blogspot.com ** Twitter: iocanel *
Re: [DRAFT] Gora Report
+1 from me. Cheers, Chris On 2/6/13 3:17 PM, Henry Saputra henry.sapu...@gmail.com wrote: +1 LGTM Thanks for the report writeup Lewis. - Henry On Thu, Jan 31, 2013 at 4:05 PM, Lewis John Mcgibbney lewis.mcgibb...@gmail.com wrote: Hi All, We need to report again this month. Please see below for the report and please add/remove content where you see appropriate. I'll get this committed when we are done. Thanks Lewis - The Apache Gora open source framework provides an in-memory data model and persistence for big data. Gora supports persisting to column stores, key value stores, document stores and RDBMSs, and analyzing the data with extensive Apache Hadoop MapReduce support. Project Releases The Apache Gora team was happy to announce the release of Gora 0.2.1 on 7th August 2012. No releases have been made since; however, a clear strategy has been established for the 0.3 release. Overall Project Activity since last report Since last reporting, the PMC has geared the development drive towards the 0.3 release. We have addressed and resolved 28 of 33 issues, meaning that the progression towards an RC for 0.3 is well on the way. We currently have two blockers which need to be addressed before we can consider the 0.3 RC. How has the community developed since the last report? Activity on the user@ list has been very slow since last reporting. It was envisaged that after ApacheConEU user interest might pick up slightly; however, this has not materialized as we hoped. Activity on dev@ has developed in line with our expectations as we move towards more regular Gora releases. Generally speaking, more work needs to be done in an attempt to make it easier for people to use Gora. This is something which the PMC need to work on. Changes to PMC and Committers The Gora PMC were very pleased to invite and have Alfonso Nishikawa join our ranks in early December. 
After working with the PMC to ensure a smooth transition into the Apache community, Alfonso is now contributing to Gora and making a real impact. Alfonso also joined the Gora PMC. PMC and Committer diversity We currently have committers from a wide variety of Apache projects including Nutch, Tika, OODT, Camel, Solr, Accumulo, Whirr and Hadoop (this is not an exhaustive list). We are still actively seeking one or more members to join the team from the Avro community, so this will be a main target for us in the future post 0.3 release. Project Branding or Naming issues NONE Legal issues NONE -- Lewis
Re: Making Sense of NoSQL
Thanks Lewis! From: Henry Saputra henry.sapu...@gmail.com Reply-To: u...@gora.apache.org u...@gora.apache.org Date: Thursday, February 7, 2013 12:46 PM To: u...@gora.apache.org u...@gora.apache.org Cc: dev@gora.apache.org dev@gora.apache.org Subject: Re: Making Sense of NoSQL Hey Lewis, Thanks for sharing - Henry On Thu, Feb 7, 2013 at 12:40 PM, Lewis John Mcgibbney lewis.mcgibb...@gmail.com wrote: Hi, I recently spoke with Dan McCreary, the (co-)author of the soon-to-be-published Making Sense of NoSQL http://www.manning.com/mccreary/ Thought that it may be a link a few of us would be interested in :) Best Lewis -- Lewis
Re: [DISCUSS] Review Board Opinions/Policy for Gora
Hey Lewis, Same opinion here as for nutch and OODT for that matter. +1 to using it but not mandating that it be used. Thanks buddy Cheers, Chris Sent from my iPhone On Jan 31, 2013, at 4:00 PM, Lewis John Mcgibbney lewis.mcgibb...@gmail.com wrote: Hi All, I thought I would create this thread as the Review Board platform has been floating around now for a bit and I wonder if we can leverage it to improve/streamline the efficiency of Gora community contributions. So I thought I'd leave this thread nice and short. 1) I am new to Review Board. I don't know much about it. I haven't used it before. 2) I am interested to see if we can make contributions and particularly reviewing a more open and transparent process. 3) I want to hear what you guys think. Some links which may be of interest [0][1][2] Ta Lewis [0] https://blogs.apache.org/infra/entry/reviewboard_instance_running_at_the [1] https://reviews.apache.org [2] http://www.reviewboard.org/ -- Lewis
Re: Attendance @ ApacheCon NA 2013 Portland
I'll be there (and so will Paul + Cam + Andrew + the rest of the OODT/Gora/etc. peeps that you know and love from JPL) Cheers! Chris On 1/9/13 9:52 PM, Lewis John Mcgibbney lewis.mcgibb...@gmail.com wrote: Hi All, This thread speaks for itself. Who is going, who is not. I'm just in a new position so I don't know if it is appropriate, convenient for me to take the short trip up to Portland, however Gora PMC/community members would surely build the case for me going. Any takers? Best Lewis -- *Lewis*
Re: [DISCUSS] Timeline for Gora 0.3 Release Thoughts
Happy New Year my friend! My only comment is that I'll be happy to use some cycles to review the RC once it's produced. That, and on my plate this year might be to develop an Apache OODT file manager catalog impl based on Gora :) But that's for later. Until then 0.3 away! Cheers, Chris On 1/3/13 5:42 AM, Lewis John Mcgibbney lewis.mcgibb...@gmail.com wrote: Hi All, Firstly, Happy New Year everyone. I really hope that 2013 is a good year for everyone. It would be excellent to get a Gora 0.3 release done; however, there are a couple of blocking issues. As I see it we have the following: GORA-182 https://issues.apache.org/jira/browse/GORA-182 Nutch 2.1 does not work with gora-cassandra 0.2.1; GORA-170 https://issues.apache.org/jira/browse/GORA-170 Getting a BufferUnderflowException in class CassandraColumn, method fromByteBuffer(); GORA-188 https://issues.apache.org/jira/browse/GORA-188 testSerdeWebPage failure - PersistentBase#equals() fails with map fields; GORA-189 https://issues.apache.org/jira/browse/GORA-189 String parameters in generated Persistent subclasses by Compiler -not only Utf8-. The thing is that some of these are linked, and I also anticipate that we may run into other problems once some/all have been resolved so to speak. The purpose of this thread is to attempt to draw up some roadmap for releasing, and of course to understand what is required in the development drive for us to reach this target. Any input would be excellent. Best Lewis -- *Lewis*
Re: Gora Board Report
+1 looking good to me Lewis. Maybe we should add that Julien's ACEU Nutch talk had a few key slides on Gora and mentioned it. Cheers, Chris On Nov 13, 2012, at 10:10 AM, Lewis John Mcgibbney wrote: Hi All, Please see the proposed report as below. I have committed it to the agenda; however, changes can be made, so please chip in if there is anything you wish to add. Thanks for now Lewis The Apache Gora open source framework provides an in-memory data model and persistence for big data. Gora supports persisting to column stores, key value stores, document stores and RDBMSs, and analyzing the data with extensive Apache Hadoop MapReduce support. Project Releases The Apache Gora team was happy to announce the release of Gora 0.2.1 on 7th August 2012. No releases have been made since. Overall Project Activity since last report The last report quoted that good progress was being made on the GSoC project; these efforts have now come to fruition with the recent merge of a goraamazon branch with the trunk code. The entire efforts which went into the GSoC project can now be leveraged and enjoyed by Gora users and devs. In all, 17 of 77 issues have been addressed since we last reported. How has the community developed since the last report? Our GSoC student (and now PMC member and Committer) Renato Marroquín Mogrovejo recently presented on Gora @ACEU. Continued exposure of this calibre will most certainly aid in building out the community. We have also witnessed Gora users from outside the typical community posting presentations based on Gora use cases; this is also very encouraging. Changes to PMC and Committers The Gora PMC were very pleased to invite and have Renato Marroquín Mogrovejo join as PMC and Committer. This was the result of a long summer's participation in the GSoC project as well as Renato's interest in getting the Gora brand out there at this year's ACEU. We look forward to more contributions which build on the great work done over the summer. 
PMC and Committer diversity We currently have committers from a wide variety of Apache projects including Nutch, Tika, OODT, Camel, Solr, Accumulo, Whirr and Hadoop (this is not an exhaustive list). Within the scope of the 0.3 development drive is an extensive upgrade of the Avro code, so we will be vigilant in attracting new members. Project Branding or Naming issues NONE Legal issues NONE -- Lewis
Re: Gora @ ACEU
That is awesome guys! Wish I could be there! Cheers, Chris On Sep 13, 2012, at 9:39 AM, Lewis John Mcgibbney wrote: Hi Everyone, A quick note to say that Renato and myself will be giving a presentation on Gora at this year's ApacheCon EU Community Developers conference (Sinsheim, Germany 5–8 November 2012). The presentation will cover Gora from incubation through to continuous ingestion (e.g. the full works) and will also show off this year's GSoC project as well. In addition it will be an excellent opportunity for any Gora community members and devs to get together for a meet up and to meet developers from a whole host of other Apache projects and backgrounds. Early Bird tickets are available until the 1st October. For more information please see here [0] and the announcement on our site [1] As always it would be great to hear from anyone who wishes a particular item to be included in the agenda, so please get in touch with us on list and we will try to integrate it accordingly. Have a great day and looking forward to seeing as many of you in Germany in November as possible. Thank you Lewis [0] http://www.apachecon.eu [1] http://gora.apache.org/#12+September%2C+2012%3A+Apache+Gora+at+ApacheCon+EU+2012 -- Lewis
Re: [ANN] Apache Gora successfully participates in Google Summer of Code 2012
Haha, yeah it came through weird on my machine (probably the charset, so funny) -- I knew of Renato, but not sure about the other ;) Cheers, Chris On Aug 24, 2012, at 2:25 PM, Lewis John Mcgibbney wrote: Hi Chris, On Fri, Aug 24, 2012 at 10:19 PM, Mattmann, Chris A (388J) chris.a.mattm...@jpl.nasa.gov wrote: Amazing job, and amazing job to the students doing the work -- Renato and MarroquÃn great work!!! What a laugh I just had. It's one guy. I think maybe I should have just called him Renato and removed the multiple name ambiguity :0) Anyway congrats Renato enjoy the remainder of your summer. Best Lewis
Re: [VOTE] Apache Gora 0.2.1 Release Candidate
Hi Lewis, +1 from me: SIGS check out: [chipotle:~/tmp/gora-0.2.1] mattmann% ls apache-gora-0.2.1-src.tar.gz apache-gora-0.2.1-src.tar.gz.md5 apache-gora-0.2.1-src.zip apache-gora-0.2.1-src.zip.md5 apache-gora-0.2.1-src.tar.gz.asc apache-gora-0.2.1-src.tar.gz.sha apache-gora-0.2.1-src.zip.asc apache-gora-0.2.1-src.zip.sha [chipotle:~/tmp/gora-0.2.1] mattmann% $HOME/bin/verify_gpg_sigs Verifying Signature for file apache-gora-0.2.1-src.tar.gz.asc gpg: Signature made Thu Jul 26 10:00:24 2012 PDT using RSA key ID C601BCA7 gpg: Good signature from Lewis John McGibbney (CODE SIGNING KEY) lewi...@apache.org gpg: WARNING: This key is not certified with a trusted signature! gpg: There is no indication that the signature belongs to the owner. Primary key fingerprint: 2A23 D53F 8D27 5CB6 91E1 89C1 F45E 7970 C601 BCA7 Verifying Signature for file apache-gora-0.2.1-src.zip.asc gpg: Signature made Thu Jul 26 10:00:24 2012 PDT using RSA key ID C601BCA7 gpg: Good signature from Lewis John McGibbney (CODE SIGNING KEY) lewi...@apache.org gpg: WARNING: This key is not certified with a trusted signature! gpg: There is no indication that the signature belongs to the owner. Primary key fingerprint: 2A23 D53F 8D27 5CB6 91E1 89C1 F45E 7970 C601 BCA7 Checksums check out: [chipotle:~/tmp/gora-0.2.1] mattmann% $HOME/bin/verify_md5_checksums md5sum: stat '*.bz2': No such file or directory apache-gora-0.2.1-src.tar.gz: OK apache-gora-0.2.1-src.zip: OK [chipotle:~/tmp/gora-0.2.1] mattmann% Built and ran tests, however ran into an error as the source won't build for me when running ant package: [..snip..] 
init-module: clean-lib: [delete] Deleting directory /Users/mattmann/tmp/gora-0.2.1/apache-gora-0.2.1/gora-cassandra/lib resolve: [mkdir] Created dir: /Users/mattmann/tmp/gora-0.2.1/apache-gora-0.2.1/gora-cassandra/lib [ivy:resolve] :: loading settings :: file = /Users/mattmann/tmp/gora-0.2.1/apache-gora-0.2.1/ivy/ivysettings.xml [ivy:resolve] downloading /Users/mattmann/.ivy2/local/org.apache.gora/gora-core/0.2-incubating/jars/gora-core.jar ... [ivy:resolve] ... (117kB) [ivy:resolve] .. (0kB) [ivy:resolve] [SUCCESSFUL ] org.apache.gora#gora-core;0.2-incubating!gora-core.jar (22ms) compile: [javac] Compiling 18 source files to /Users/mattmann/tmp/gora-0.2.1/apache-gora-0.2.1/gora-cassandra/build/classes [javac] /Users/mattmann/tmp/gora-0.2.1/apache-gora-0.2.1/gora-cassandra/src/main/java/org/apache/gora/cassandra/query/CassandraColumn.java:23: package me.prettyprint.hector.api does not exist [javac] import me.prettyprint.hector.api.Serializer; [javac] ^ For me it's not a blocker since I'm not actively building and installing the software, so it might be something I'm doing and I don't want to block progress. But I felt I should report it nonetheless. 
I'm running on Mac OS X 10.6.8, and with JDK6: Darwin chipotle.jpl.nasa.gov 10.8.0 Darwin Kernel Version 10.8.0: Tue Jun 7 16:32:41 PDT 2011; root:xnu-1504.15.3~1/RELEASE_X86_64 x86_64 [chipotle:~/tmp/gora-0.2.1/apache-gora-0.2.1] mattmann% [chipotle:~/tmp/gora-0.2.1/apache-gora-0.2.1] mattmann% java -version java version 1.6.0_33 Java(TM) SE Runtime Environment (build 1.6.0_33-b03-424-10M3720) Java HotSpot(TM) 64-Bit Server VM (build 20.8-b03-424, mixed mode) [chipotle:~/tmp/gora-0.2.1/apache-gora-0.2.1] mattmann% ant -version Apache Ant(TM) version 1.8.2 compiled on May 17 2012 [chipotle:~/tmp/gora-0.2.1/apache-gora-0.2.1] mattmann% Cheers, Chris On Jul 26, 2012, at 11:47 AM, Lewis John Mcgibbney wrote: Hi Everyone, A candidate for the Apache Gora 0.2.1 RC#1 is available at: http://people.apache.org/~lewismc/apache-gora-0.2.1 The release candidate is a src.zip and src.tar.gz ONLY archive of the sources in: http://svn.apache.org/repos/asf/gora/tags/apache-gora-0.2.1 We release Gora 0.2.1 in this fashion due to the likelihood that users will regularly recompile the code to suit dynamic requirements. Further, a staged Maven repository of the 0.2.1 jar, sources.jar and javadoc.jar is available here: https://repository.apache.org/content/repositories/orgapachegora-091 Please vote on releasing this package as Apache Gora 0.2.1. The vote is open for the next 72 hours and passes if a majority of at least three +1 Gora PMC votes are cast. [ ] +1 Release this package as Apache Gora 0.2.1 [ ] -1 Do not release this package because... Many Thanks and here's to plenty more. Kind Regards, Lewis P.S. Here's my +1. -- Lewis
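
The vote message above checks each source archive against its .md5 file using a personal script ($HOME/bin/verify_md5_checksums) whose contents are not shown in the thread. As a rough illustration of what such a check involves, here is a minimal Java sketch. The class name Md5CheckSketch is hypothetical, and the assumption that a .md5 file begins with the hex digest (optionally followed by a file name) may not match every release's format. An MD5 match only indicates that the download is not corrupted; the GPG signature check is what ties the artifact to the release manager's key.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical sketch of an MD5 check; not the script used in the thread. */
final class Md5CheckSketch {

    /** Streams the file through an MD5 digest and returns lowercase hex. */
    static String md5Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (InputStream in = new DigestInputStream(Files.newInputStream(file), md)) {
            byte[] buffer = new byte[8192];
            while (in.read(buffer) != -1) {
                // Reading is enough: the stream updates the digest as it goes.
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** Assumes the .md5 file starts with the hex digest (an assumption). */
    static boolean matches(Path artifact, Path md5File)
            throws IOException, NoSuchAlgorithmException {
        String recorded = new String(Files.readAllBytes(md5File), StandardCharsets.UTF_8).trim();
        String expected = recorded.split("\\s+")[0];
        return expected.equalsIgnoreCase(md5Hex(artifact));
    }

    public static void main(String[] args) throws Exception {
        // Usage: java Md5CheckSketch <artifact>   (expects <artifact>.md5 next to it)
        Path artifact = Paths.get(args[0]);
        Path md5File = Paths.get(args[0] + ".md5");
        String verdict = matches(artifact, md5File) ? "OK" : "FAILED";
        System.out.println(artifact.getFileName() + ": " + verdict);
    }
}
```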
Re: [VOTE] Apache Gora 0.2.1 Release Candidate
Completely forgot about the Maven integration, awesome (I think I worked on that, which just shows how out of the loop I am, at least I have an excuse [Mars]) :) Anyhoo, Maven builds fine, but I had to use mvn -Dmaven.test.skip=true: [INFO] [INFO] [INFO] [INFO] Reactor Summary: [INFO] [INFO] Apache Gora ... SUCCESS [25.971s] [INFO] Apache Gora :: Core ... SUCCESS [18.874s] [INFO] Apache Gora :: Hbase .. SUCCESS [10.720s] [INFO] Apache Gora :: Accumulo ... SUCCESS [13.515s] [INFO] Apache Gora :: Cassandra .. SUCCESS [27.047s] [INFO] Apache Gora :: SQL SUCCESS [10.824s] [INFO] Apache Gora :: Tutorial ... SUCCESS [8.958s] [INFO] Apache Gora :: Sources-Dist ... SUCCESS [14.306s] [INFO] [INFO] [INFO] BUILD SUCCESSFUL [INFO] [INFO] Total time: 2 minutes 24 seconds [INFO] Finished at: Tue Aug 07 07:25:18 PDT 2012 [INFO] Final Memory: 92M/123M [INFO] [chipotle:~/tmp/gora-0.2.1/apache-gora-0.2.1] mattmann% B/c it hung in the Gora-Hbase tests as you predicted. Thanks Lewis! +1 again from me. Cheers, Chris On Aug 7, 2012, at 7:03 AM, Lewis John Mcgibbney wrote: Hi Chris, Thanks for the feedback. Thanks for the check on the checksums :0). Please see below On Tue, Aug 7, 2012 at 2:59 PM, Mattmann, Chris A (388J) chris.a.mattm...@jpl.nasa.gov wrote: Built and ran tests, however ran into an error as the source won't build for me when running ant package: The Ant build should not be working now. Gora moved to Maven a while ago and to be honest the Ant stuff is cluttering up the place. For the time being I think this is confusing but we can live with it! I hope this is the consensus anyway. Thanks very much again and I'll get the release done now :0) Lewis -- Lewis
Re: [VOTE] Apache Gora 0.2.1 Release Candidate
No worries dude you are the man. I'll VOTE up there when I see the MD5s and I think everything else is looking good. Cheers, Chris On Aug 4, 2012, at 7:17 AM, Lewis John Mcgibbney wrote: Hi Chris, Apologies for the delay, it's been a wild week. Thanks for pointing this out; it is a mistake on my part and I will sort it out tomorrow. I'll get the missing md5's loaded with the various artifacts tomorrow. Apologies about this. Lewis On Wed, Aug 1, 2012 at 5:15 PM, Mattmann, Chris A (388J) chris.a.mattm...@jpl.nasa.gov wrote: Hey Lewis, At first blush, I don't see any md5 files here for checksums. Is that correct? If so, we'll need to put them up there? Cheers, Chris On Jul 26, 2012, at 11:47 AM, Lewis John Mcgibbney wrote: Hi Everyone, A candidate for the Apache Gora 0.2.1 RC#1 is available at: http://people.apache.org/~lewismc/apache-gora-0.2.1 The release candidate is a src.zip and src.tar.gz ONLY archive of the sources in: http://svn.apache.org/repos/asf/gora/tags/apache-gora-0.2.1 We release Gora 0.2.1 in this fashion due to the likelihood that users will regularly recompile the code to suit dynamic requirements. Further, a staged Maven repository of the 0.2.1 jar, sources.jar and javadoc.jar is available here: https://repository.apache.org/content/repositories/orgapachegora-091 Please vote on releasing this package as Apache Gora 0.2.1. The vote is open for the next 72 hours and passes if a majority of at least three +1 Gora PMC votes are cast. [ ] +1 Release this package as Apache Gora 0.2.1 [ ] -1 Do not release this package because... Many Thanks and here's to plenty more. Kind Regards, Lewis P.S. Here's my +1. -- Lewis
Fwd: Call for Papers for ApacheCon Europe 2012 now open!
FYI... Begin forwarded message: From: Nick Burch nick.bu...@alfresco.com Date: July 19, 2012 1:14:57 PM CDT To: committ...@apache.org Subject: Call for Papers for ApacheCon Europe 2012 now open! Reply-To: apachecon-disc...@apache.org Hi All We're pleased to announce that the Call for Papers for ApacheCon Europe 2012 is finally open! (For those who don't already know, ApacheCon Europe will be taking place between the 5th and the 9th of November this year, in Sinsheim, Germany.) If you'd like to submit a talk proposal, please visit the conference website at http://www.apachecon.eu/ and sign up for a new account. Once you've signed up, use your dashboard to enter your speaker bio, then submit your talk proposal(s). There's more information on the CFP page on the conference website. We welcome talk proposals from all projects, from right across the breadth of projects at the foundation! To make things easier for talk selection and scheduling, we'd ask that you tag your proposal with the track that it most closely fits within. The details of the tracks, and what projects they expect to cover, are available at http://www.apachecon.eu/tracks/. (If your project/group of projects was intending to submit a track, and missed the deadline, then please get in touch with us on apachecon-disc...@apache.org straight away, so we can work out if it's possible to squeeze you in...) The CFP will close on Friday 3rd August, so you've a little over two weeks to send in your talk proposal. Don't put it off! We'll look forward to seeing some great ones shortly! Thanks Nick (On behalf of the Conferences committee)
Re: [DISCUSS] Apache Gora 0.3 Release
+1 to roll release and I'll also throw my name into the hat to release it. Let me know. Thanks! Cheers, Chris On Jul 5, 2012, at 12:30 PM, Lewis John Mcgibbney wrote: Hi, As the GSoC project is moving along nicely and it's been some 3 or so months since the 0.2 release I was thinking about drumming up support for another (possibly even 0.2.1) release? We have some 15 issues which have been addressed in the development drive since 0.2 was released and I for one have not had quite as much time as I would have liked recently to put serious time into Gora. What do you guys think? I am more than happy to work as RM again if required. Thank you in advance Lewis -- Lewis ++ Chris Mattmann, Ph.D. Senior Computer Scientist NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 171-266B, Mailstop: 171-246 Email: chris.a.mattm...@nasa.gov WWW: http://sunset.usc.edu/~mattmann/ Phone: +1 (818) 354-8810 ++ Adjunct Assistant Professor, Computer Science Department University of Southern California, Los Angeles, CA 90089 USA ++
Re: Kazuomi Kashii as new Gora PMC Member and Committer
Congrats! Welcome Kazuomi! Cheers, Chris On Jun 11, 2012, at 8:11 AM, Lewis John Mcgibbney wrote: Good Afternoon Everyone, I am pleased to say that after some VOTE'ing by the Gora PMC, we are very happy to welcome Kazuomi to the Gora PMC. Please feel free to introduce yourself Kazuomi and maybe describe a bit about your involvement with Gora if you so wish. All the best Lewis -- Lewis
Re: [REPORT] Apache Gora
+1 looks great! On May 8, 2012, at 4:42 AM, Lewis John Mcgibbney wrote: Hi Guys, I've been a tad busy as of late and usually pass on this report prior to sending it off to board but didn't get it produced in quite as much time as I would have liked. Please see below for this month's report. Thanks Lewis - Apache Gora The Apache Gora open source framework provides an in-memory data model and persistence for big data. Gora supports persisting to column stores, key value stores, document stores and RDBMSs, and analyzing the data with extensive Apache Hadoop MapReduce support. Project Releases The Apache Gora team was happy to announce the release of Gora 0.2 on 24th April 2012 (1st since graduation from the Incubator). During the process of this release, the team managed to simplify the release process somewhat, so we look forward to the shift towards a more incremental release policy within the project as a whole. As a side note, an experimental branch of Nutch is now very close to a community VOTE further to the recent Gora 0.2 release. Overall Project Activity since last report The majority of work directly concerned the 0.2 release as mentioned above. Since then there have been a few commits; however, there has also been some encouraging conversation with respect to our site migration to the Apache CMS. We currently have some 50 odd issues to work on within the 0.3 development drive. How has the community developed since the last report? Encouragingly we have had a number of new issues logged on Jira from new Gora users. On most occasions this has sparked decent conversations; however, we are lacking patches from these new Gora users. Some excellent news was that our Gora Amazon DynamoDB Google Summer of Code project was successfully accepted into this year's programme. The prospective student, Renato, has been on list and we look forward to kicking things off later this month. 
Changes to PMC and Committers NONE PMC and Committer diversity We currently have committers from a wide variety of projects including Nutch, Tika, OODT, Camel, Solr, Accumulo and Hadoop (this is not an exhaustive list). Within the scope of the 0.3 development drive is an extensive upgrade of the Avro code, so we will be vigilant in attracting new members. We also look forward to the progression of GSoC. Hopefully our student progresses to become part of the Gora team in due course. Project Branding or Naming issues NONE Legal issues NONE -- Lewis
Re: [jira] [Closed] (GORA-28) Merge back recent changes in 0.1-incubating to trunk
Lewis you are a beast! Drinks are on me next time we cross paths (probably at an airport ;) ). Cheers, Chris On Apr 6, 2012, at 1:23 PM, Lewis John McGibbney (Closed) (JIRA) wrote: [ https://issues.apache.org/jira/browse/GORA-28?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney closed GORA-28. Bulk close of issues. Preparation for Gora 0.2 release candidate. Merge back recent changes in 0.1-incubating to trunk Key: GORA-28 URL: https://issues.apache.org/jira/browse/GORA-28 Project: Apache Gora Issue Type: Task Components: build process Reporter: Andrzej Bialecki Assignee: Chris A. Mattmann Fix For: 0.2 The following revisions need to be merged back: 1080072, 1080091. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira ++ Chris Mattmann, Ph.D. Senior Computer Scientist NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 171-266B, Mailstop: 171-246 Email: chris.a.mattm...@nasa.gov WWW: http://sunset.usc.edu/~mattmann/ ++ Adjunct Assistant Professor, Computer Science Department University of Southern California, Los Angeles, CA 90089 USA ++
Re: Time for Gora 0.2 Release Candidate?
Hey Lewis, I've got the following this weekend: Apache SIS 0.2 RC #2 Apache Nutch 1.5 RC #2 (and likely Nutchgora 2.0 after Gora) I can add Gora 0.2 RC to the list unless someone beats me to it this weekend, and if folks are +1. Cheers, Chris On Apr 5, 2012, at 12:44 PM, Lewis John Mcgibbney wrote: Hi Guys, Since working with Ferdy to upgrade hadoop-core and tests deps, I thought it might be best to raise this conversation again and see if it has any traction. Regarding the remaining issues [0] GORA-76 - We committed this earlier. Tests pass successfully although there are many dodgy stack traces in the log output. At first glance this looks horrible, but upon some investigation I found that the HBase guys are having problems of a similar nature [1] [2] which makes me think that we are definitely not alone in this one, possibly bugs further upstream??? For the time being, I have no issues here, and also would like to resolve and close. GORA-65 - I am extremely happy with Keith's datastore and pleased to see it make its way into the 0.2 release. I have no issues here, and also would like to resolve and close. GORA-63 - I really don't know what's happening with this little gem, but surely it's not a blocker for the release. I agree that it certainly should be addressed in the future, but to date I've been unable to get things working with it so can't confirm if it should be committed. If anyone else would like to try Enis' patch then please do and try to compile the examples... if they work great. GORA-53 - I had marked this as critical. It has quite literally become the thorn in my side of recent. I've made reasonable progress but there is a final missing link which I can't seem to fix. As the community is more HBase oriented, I suggest to bump this to 0.3 development and I will pick it up then. If there are any objections then OK, but I hope I have justified myself on this one. Looking forward to hearing your thoughts and opinions... 
and also looking towards VOTE'ing on a 0.2 RC :0) Best Lewis [0] https://issues.apache.org/jira/browse/GORA/fixforversion/12315541#atl_token=A5KQ-2QAV-T4JA-FDED|674efcd881a26e09b262373d1c2addb1762e06ea|linselectedTab=com.atlassian.jira.plugin.system.project%3Aversion-issues-panel [1]https://issues.apache.org/jira/browse/HBASE-4709 [2] https://issues.apache.org/jira/browse/HBASE-5711 -- *Lewis* ++ Chris Mattmann, Ph.D. Senior Computer Scientist NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 171-266B, Mailstop: 171-246 Email: chris.a.mattm...@nasa.gov WWW: http://sunset.usc.edu/~mattmann/ ++ Adjunct Assistant Professor, Computer Science Department University of Southern California, Los Angeles, CA 90089 USA ++
Re: Creation of user@ lists
Hi Lewis, +1 from me. I'm happy to moderate emails, you can add me to the list :) Cheers, Chris On Apr 4, 2012, at 4:37 AM, Lewis John Mcgibbney wrote: Hi Guys, I just opened https://issues.apache.org/jira/browse/INFRA-4649 and the question came up regarding PMC consensus for 1) Creating the list, and 2) 2-3 moderators (one of which I will fill) So... firstly, do we actually want a user@ list? I didn't realise that we required PMC consensus, but it makes sense as it might be more mail for people, and also I suppose that our dev list isn't so busy just now so is it actually required? I just thought it rather odd that AFAIK Gora seems to be the only TLP with no user@ list. Finally, if we do want a list do we have at least one other list moderator? Thanks Lewis -- *Lewis* ++ Chris Mattmann, Ph.D. Senior Computer Scientist NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 171-266B, Mailstop: 171-246 Email: chris.a.mattm...@nasa.gov WWW: http://sunset.usc.edu/~mattmann/ ++ Adjunct Assistant Professor, Computer Science Department University of Southern California, Los Angeles, CA 90089 USA ++
Re: Republish Gora trunk Javadoc
Thanks d00d! Cheers, Chris On Apr 4, 2012, at 4:32 AM, Lewis John Mcgibbney wrote: Aye it does. I'll commit this today and write our report as well. Ta Chris. lewis On Wed, Apr 4, 2012 at 2:31 AM, Mattmann, Chris A (388J) chris.a.mattm...@jpl.nasa.gov wrote: Hey Lewis, I think you can do: 1. mvn javadoc:aggregate (from top level) 2. cp -R target/site/apidocs ../site/publish/apidocs-X.Y 3. cd ../site/publish; svn commit -m ... Make sense? Cheers, Chris On Apr 3, 2012, at 3:09 PM, Lewis John Mcgibbney wrote: Hi Guys, I've published site documentation and have included my experiences of doing so on our wiki, but would like to republish the Javadoc as there have been some recent commits that I would like to get pushed for others to view via Javadoc. Can anyone provide details of how I go about doing this? Thanks Lewis -- *Lewis* ++ Chris Mattmann, Ph.D. Senior Computer Scientist NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 171-266B, Mailstop: 171-246 Email: chris.a.mattm...@nasa.gov WWW: http://sunset.usc.edu/~mattmann/ ++ Adjunct Assistant Professor, Computer Science Department University of Southern California, Los Angeles, CA 90089 USA ++
Re: DRAFT GORA REPORT
Super +1. Great report dude. Cheers, Chris On Apr 4, 2012, at 11:50 AM, Lewis John Mcgibbney wrote: Hi Everyone, Please see below for a draft report. I'll send this in tomorrow unless there are objections or anything to add. Thanks Lewis Apache Gora The Apache Gora open source framework provides an in-memory data model and persistence for big data. Gora supports persisting to column stores, key value stores, document stores and RDBMSs, and analyzing the data with extensive Apache Hadoop MapReduce support. Project Releases The last official project release was made on 24/09/2011 which was the 0.1.1-incubating release (2nd whilst in the Incubator). Since last reporting there have been few commits, but the ones we've seen have been fairly significant. There are still 4 issues to be addressed before we can progress to a 0.2 release candidate. Major issues to be addressed include implementing tests for the gora-cassandra module and an upgrade to Hadoop 1.0.0. Overall Project Activity since last report Activity roughly shadows last month's average, with nothing exceptional taking place. A blocker issue with our usage of a particular sql library has been dealt with; additionally, Keith Turner was able to commit his gora-accumulo module, as the distribution of Accumulo was released and available for us to use. Ferdy committed a nice piece of work which now provides users with the ability to properly support multiple data store implementations in parallel. We've also seen keen interest for our proposed GSoC project which is to add a gora-Amazon DynamoDB module to the project and look forward to picking up traction with this in the near future. How has the community developed since the last report? We recently received word (rather encouragingly) that someone struggled to join the user@ list. This was because the list did not exist; it has, however, now been created. 
We've had some questions coming into the project regarding the hbase module, and whether or not we were going to support certain features within Gora, however unfortunately none of these issues lead to any commits from outside the existing community. Changes to PMC Committers NONE PMC and Committer diversity We currently have committers from a wide variety of projects including, Nutch, Tika, OODT, Camel, Solr, Accumulo Hadoop (this is not an exhaustive list). There is work to be done with the Avro implementations, so once we are 100% ready to work on these issues, we will be looking to interest members of the Avro community in Gora. It would also be nice to attract members of the Hector and Cassandra community so we will work towards this goal. Project Branding or Naming issues NONE Legal issues NONE -- *Lewis* ++ Chris Mattmann, Ph.D. Senior Computer Scientist NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 171-266B, Mailstop: 171-246 Email: chris.a.mattm...@nasa.gov WWW: http://sunset.usc.edu/~mattmann/ ++ Adjunct Assistant Professor, Computer Science Department University of Southern California, Los Angeles, CA 90089 USA ++
Re: Republish Gora trunk Javadoc
Hey Lewis, I think you can do: 1. mvn javadoc:aggregate (from top level) 2. cp -R target/site/apidocs ../site/publish/apidocs-X.Y 3. cd ../site/publish; svn commit -m ... Make sense? Cheers, Chris On Apr 3, 2012, at 3:09 PM, Lewis John Mcgibbney wrote: Hi Guys, I've published site documentation and have included my experiences of doing so on our wiki, but would like to republish the Javadoc as there have been some recent commits that I would like to get pushed for others to view via Javadoc. Can anyone provide details of how I go about doing this? Thanks Lewis -- *Lewis* ++ Chris Mattmann, Ph.D. Senior Computer Scientist NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 171-266B, Mailstop: 171-246 Email: chris.a.mattm...@nasa.gov WWW: http://sunset.usc.edu/~mattmann/ ++ Adjunct Assistant Professor, Computer Science Department University of Southern California, Los Angeles, CA 90089 USA ++
Re: Apache Gora Git Mirror
Thanks Keith! Cheers, Chris On Apr 2, 2012, at 8:11 AM, Lewis John Mcgibbney wrote: Nice one Keith. I'm personally not using the Git mirror but I think everyone else does so thanks. Lewis On Mon, Apr 2, 2012 at 2:53 PM, Keith Turner ke...@deenlo.com wrote: I think the apache Gora Git mirror is incorrect. I opened https://issues.apache.org/jira/browse/INFRA-4636 Keith -- *Lewis* ++ Chris Mattmann, Ph.D. Senior Computer Scientist NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 171-266B, Mailstop: 171-246 Email: chris.a.mattm...@nasa.gov WWW: http://sunset.usc.edu/~mattmann/ ++ Adjunct Assistant Professor, Computer Science Department University of Southern California, Los Angeles, CA 90089 USA ++
Re: Apache Gora Git Mirror
Hi Renato, Apache Infra is supporting an experiment with Git to allow a smallish (but growing) number of projects to use the VC tool and get feedback and understand its use at the ASF before rolling it out to everyone. So far I've heard great things, though. You can follow the progress here: http://wiki.apache.org/general/GitAtApache Cheers, Chris On Apr 2, 2012, at 1:11 PM, Renato Marroquín Mogrovejo wrote: Hi Keith, So is this Apache official? I mean, is Apache Infrastructure giving support for Git repositories or is it just an initial effort? There was a discussion a while ago over the Pig-Latin mailing list about this, but they said that Apache wasn't really into git, and that there was no official Apache support for this. Just curious :) Renato M. On April 2, 2012 at 11:00, Mattmann, Chris A (388J) chris.a.mattm...@jpl.nasa.gov wrote: Thanks Keith! Cheers, Chris On Apr 2, 2012, at 8:11 AM, Lewis John Mcgibbney wrote: Nice one Keith. I'm personally not using the Git mirror but I think everyone else does so thanks. Lewis On Mon, Apr 2, 2012 at 2:53 PM, Keith Turner ke...@deenlo.com wrote: I think the apache Gora Git mirror is incorrect. I opened https://issues.apache.org/jira/browse/INFRA-4636 Keith -- *Lewis* ++ Chris Mattmann, Ph.D. Senior Computer Scientist NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 171-266B, Mailstop: 171-246 Email: chris.a.mattm...@nasa.gov WWW: http://sunset.usc.edu/~mattmann/ ++ Adjunct Assistant Professor, Computer Science Department University of Southern California, Los Angeles, CA 90089 USA ++
Re: [GSoC 2012] Gora-DynamoDB datastore
Hey Renato, Awesome to hear! Really looking forward to you becoming an active contributor and part of the project... Cheers, Chris On Mar 23, 2012, at 4:05 PM, Renato Marroquín Mogrovejo wrote: Hi everyone, My name is Renato Marroquin and I would also like to apply for GSoC this year. I have a bit of experience with MapReduce, specifically Hadoop and Pig-Latin. I have completed my masters degree on databases (specifically cloud data management) and I really want to keep on working with hadoop related technologies specially with the ones which involve the open source community (: While doing my masters I was working for a bioinformatics laboratory where we tested a couple of NoSQL solutions (HBase and Cassandra) for genomic data. But going through the code of both Cassandra and HBase they really have come a long way. We also worked creating different data flows for scientific data analysis using MapReduce and Pig-Latin. Hope to be able to become a more active member on the list, so we can make Gora a bigger project. Renato M. ++ Chris Mattmann, Ph.D. Senior Computer Scientist NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 171-266B, Mailstop: 171-246 Email: chris.a.mattm...@nasa.gov WWW: http://sunset.usc.edu/~mattmann/ ++ Adjunct Assistant Professor, Computer Science Department University of Southern California, Los Angeles, CA 90089 USA ++
Re: Apache Gora March board report (draft)
+1 from me, Lewis, Looks awesome. RE: gora-sql: what are the next steps there? Is it something the legal committee can help out with? At worst, I'd imagine, we could move gora-sql to a non-released area of SVN and then push forward with an 0.2 that doesn't include it? Cheers, Chris On Mar 6, 2012, at 7:26 AM, Lewis John Mcgibbney wrote: Hi All, Time of month again, we have monthly reporting for the first three months after graduation so please review and comment accordingly. I'll get it committed as of Apache Gora The Apache Gora open source framework provides an in-memory data model and persistence for big data. Gora supports persisting to column stores, key value stores, document stores and RDBMSs, and analyzing the data with extensive Apache Hadoop MapReduce support. Project Releases The last official project release was made on 24/09/2011 which was the 0.1.1-incubating release (2nd whilst in the Incubator). Although there have been some commits to three Gora modules, some remaining issues still need to be addressed before we are ready to roll out a 0.2 release. This is rather frustrating, as the main issue which blocks a 0.2 release is a licensing issue on the gora-sql store. It is widely acknowledged that although gora-sql is the least used module, it is the one which requires attention to rectify this licensing issue. Overall Project Activity since last report Since graduation all documentation and infrastructure has been successfully migrated over to TLP status. We've witnessed development and commits to three datastores and a number of issues addressed as a result. It is safe to say that the gora-hbase store seems to be attracting the most development; however, there are also a number of issues which have recently been opened for the gora-cassandra module. 
In the last report we stated that there were two new modules in the process of being integrated into the project (namely gora-accumulo and gora-solr); however, due to blocking issues with releases for Solr and Accumulo, we are not able to release the gora modules. How has the community developed since the last report? The Gora mailing lists have seen some activity with regards to gora-accumulo; however, generally speaking traffic was quite low for the month until this report. We do however have some excellent news that we are submitting an application to Google Summer of Code. The project comprises a gora-dynamodb (Amazon Dynamo DB) module, again making best efforts to open up Gora to a wider audience. Changes to PMC Committers Since VOTE'ing Keith Turner on to the team no new members have been accepted into the Gora development team. Additionally, no new VOTE'ing has taken place. PMC and Committer diversity We currently have committers from a wide variety of projects including, Nutch, Tika, OODT, Camel, Solr, Accumulo Hadoop (this is not an exhaustive list). There is work to be done with the Avro implementations, so once we are 100% ready to work on these issues, we will be looking to interest members of the Avro community in Gora. Project Branding or Naming issues We recently sorted out all branding and trademark issues through trademarks@ . As this has been addressed there is no more to add at this stage. Legal issues We currently have an issue with one particular library being used within the gora-sql store. It is the community's intention to remove this LGPL licensed library from the codebase, replacing it with implementations from JOOQ, an ASL 2.0 licensed library. The re-write of the gora-sql module building from the JOOQ library has been delayed until after our 0.2 release. -- *Lewis* ++ Chris Mattmann, Ph.D. 
Senior Computer Scientist NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 171-266B, Mailstop: 171-246 Email: chris.a.mattm...@nasa.gov WWW: http://sunset.usc.edu/~mattmann/ ++ Adjunct Assistant Professor, Computer Science Department University of Southern California, Los Angeles, CA 90089 USA ++
Re: Apache Gora March board report (draft)
Hey Lewis, Gotcha, no problem. One option is simply to remove the sqlbuilder library and to replace SqlStore with stubbed out methods that throw Exceptions and that simply don't work. Then we ship Gora 0.2, with a note saying don't use SqlStore. Then if folks complain and want to use it, we ask that they help to write the method implementations using an ALv2 compatible library. Would Derby be useful here, or Hypersonic SQL? Both of those are ALv2 compatible I think. Cheers, Chris On Mar 6, 2012, at 7:40 AM, Lewis John Mcgibbney wrote: Hi Chris, Well, in all honesty, I think it's best for me to provide a brief summary. We need to remove our dependency upon and use of the healthmarketscience sqlbuilder library. Used very little in SqlStore class [0]. The problem I ran into when looking into this was that unfortunately JOOQ doesn't support data definition language constructs, which is exactly what we require. Although this is in the pipeline (for JOOQ), I have a feeling that it may be some way off. I did however run into a couple of examples of implementing similar functionality to what we require; unfortunately I got distracted and haven't had time to go back to it. I have included all of my resources on the offending issue on our Jira so anyone trying to pick it up would really be starting off from where I got to. This is the blocking issue here and unfortunately it doesn't look like something legal guys can help out with; it is simply a case of understanding exactly what functionality we require, then finding a replacement and someone that can code it in using a more suitably licensed library. If it is for the sake of pushing out with a Gora 0.2 release as per your requirements (e.g. archive gora-sql) then I would probably back this for the time being... I am not using the gora-sql module, and in all honesty I don't think that many others are, hence the level of interest to get this module up to scratch! 
I began a thread a week or so ago, and to be honest the gora-sql issue was the only blocker in my opinion. Thanks for chiming in on this one. Lewis [0] http://svn.apache.org/viewvc/gora/trunk/gora-sql/src/main/java/org/apache/gora/sql/store/SqlStore.java?view=markup On Tue, Mar 6, 2012 at 3:29 PM, Mattmann, Chris A (388J) chris.a.mattm...@jpl.nasa.gov wrote: +1 from me, Lewis, Looks awesome. RE: gora-sql: what are the next steps there? Is it something the legal committee can help out with? At worst, I'd imagine, we could move gora-sql to a non-released area of SVN and then push forward with an 0.2 that doesn't include it? Cheers, Chris On Mar 6, 2012, at 7:26 AM, Lewis John Mcgibbney wrote: Hi All, Time of month again, we have monthly reporting for the first three months after graduation so please review and comment accordingly. I'll get it committed as of Apache Gora The Apache Gora open source framework provides an in-memory data model and persistence for big data. Gora supports persisting to column stores, key value stores, document stores and RDBMSs, and analyzing the data with extensive Apache Hadoop MapReduce support. Project Releases The last official project release was made on 24/09/2011 which was the 0.1.1-incubating release (2nd whilst in the Incubator). Although there have been some commits to three Gora modules, some remaining issues still need to be addressed before we are ready to roll out a 0.2 release. This is rather frustrating, as the main issue which blocks a 0.2 release is a licensing issue on the gora-sql store. It is widely acknowledged that although gora-sql is the least used module, it is the one which requires attention to rectify this licensing issue. Overall Project Activity since last report Since graduation all documentation and infrastructure has been successfully migrated over to TLP status. We've witnessed development and commits to three datastores and a number of issues addressed as a result. 
It is safe to say that the gora-hbase store seems to be attracting the most development, however there are also a number of issues which have recently been opened for the gora-cassandra module. In the last report we stated that there were two new modules in the process of being integrated into the project (namely gora-accumulo and gora-solr), however due to blocking issues with releases for Solr and Accumulo, we are not able to release the gora modules. How has the community developed since the last report? The Gora mailing lists have seen some activity with regards to gora-accumulo, however generally speaking traffic was quite low for the month until this report. We do however have some excellent news that we are submitting an application to Google Summer of Code. The project comprises a gora-dynamodb (Amazon Dynamo DB) module again making best efforts to open up Gora to a wider audience. Changes to PMC Committers Since VOTE'ing Keither
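For readers following the gora-sql thread above: Chris's suggestion of shipping a SqlStore whose methods are stubbed out and throw exceptions could look roughly like the sketch below. This is only an illustration under stated assumptions. `SimpleStore` is a hypothetical, cut-down interface invented here and is not Gora's real DataStore API; `StubSqlStore` and its message text are likewise made up, and the choice of `UnsupportedOperationException` is one reasonable option rather than anything the thread settled on.

```java
// Hypothetical, cut-down store interface for illustration only.
// It is NOT Gora's actual DataStore API.
interface SimpleStore<K, V> {
    V get(K key);
    void put(K key, V value);
    boolean delete(K key);
}

// Stub that keeps the class in the release but refuses to do any work,
// so callers fail fast with a clear message instead of silently misbehaving.
final class StubSqlStore<K, V> implements SimpleStore<K, V> {
    static final String MESSAGE =
        "SqlStore is disabled in this release; an ALv2-compatible implementation is needed";

    // Builds the exception thrown by every stubbed method.
    private static UnsupportedOperationException disabled(String method) {
        return new UnsupportedOperationException(method + ": " + MESSAGE);
    }

    @Override public V get(K key) { throw disabled("get"); }
    @Override public void put(K key, V value) { throw disabled("put"); }
    @Override public boolean delete(K key) { throw disabled("delete"); }
}

class StubSqlStoreDemo {
    public static void main(String[] args) {
        SimpleStore<String, String> store = new StubSqlStore<>();
        try {
            store.put("k", "v");
        } catch (UnsupportedOperationException e) {
            // The caller sees which method was called and why it is unavailable.
            System.out.println("Caught: " + e.getMessage());
        }
    }
}
```

The appeal of this approach is that code compiled against the store still links, while any attempt to use it fails immediately with a pointer to what is missing; contributors who want the store back can then fill in one method at a time.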
Fwd: Google Summer of Code 2012 upcoming
Guys, FYI...in case anyone is thinking of GSoC, deadlines are approaching. Process is described below... Thanks! Cheers, Chris Begin forwarded message: From: Ulrich Stärk u...@apache.org Date: March 4, 2012 9:01:07 AM PST To: p...@apache.org p...@apache.org Cc: d...@community.apache.org d...@community.apache.org Subject: Google Summer of Code 2012 upcoming Reply-To: priv...@hadoop.apache.org priv...@hadoop.apache.org Hello PMCs, Google Summer of Code is the ideal opportunity for you to attract new contributors to your projects. If you want to participate with your project you NOW need to - understand what it means to be a mentor [1] - propose your project ideas. Just label your issues with gsoc2012 in JIRA and they will show up at [2]. See also [1]. - subscribe to code-awa...@apache.org (restricted to potential mentors, meant to be used as a private list - general discussions on the public d...@community.apache.org list as much as possible please) The ASF will apply as a participating organization with GSoC, your project doesn't need to do that. See [3] for more information. Note that the ASF isn't accepted yet, nevertheless you *really* should start recording your ideas now. Last year we had 38 students completing GSoC successfully, some of which are now active contributors to the projects they worked on. Let's make this a success again this year! On behalf of the GSoC 2012 admins, Uli [1] http://community.apache.org/guide-to-being-a-mentor.html [2] http://s.apache.org/gsoc2012tasks [3] http://community.apache.org/gsoc.html ++ Chris Mattmann, Ph.D. Senior Computer Scientist NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 171-266B, Mailstop: 171-246 Email: chris.a.mattm...@nasa.gov WWW: http://sunset.usc.edu/~mattmann/ ++ Adjunct Assistant Professor, Computer Science Department University of Southern California, Los Angeles, CA 90089 USA ++
Fwd: [blog post] Accumulo, Nutch, and Gora
FYI...awesome! Begin forwarded message: From: Jason Trost jason.tr...@gmail.com Date: February 28, 2012 5:41:23 PM PST To: common-u...@hadoop.apache.org common-u...@hadoop.apache.org Subject: [blog post] Accumulo, Nutch, and Gora Reply-To: common-u...@hadoop.apache.org common-u...@hadoop.apache.org Blog post for anyone who's interested. I cover a basic howto for getting Nutch to use Apache Gora to store web crawl data in Accumulo. Let me know if you have any questions. Accumulo, Nutch, and GORA http://www.covert.io/post/18414889381/accumulo-nutch-and-gora --Jason ++ Chris Mattmann, Ph.D. Senior Computer Scientist NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 171-266B, Mailstop: 171-246 Email: chris.a.mattm...@nasa.gov WWW: http://sunset.usc.edu/~mattmann/ ++ Adjunct Assistant Professor, Computer Science Department University of Southern California, Los Angeles, CA 90089 USA ++
Re: GSoC 2012 Project
Hey Lewis, This sounds awesome! I will try my best to help review patches/etc., and to comment on the design. Super +1. Cheers, Chris On Feb 11, 2012, at 6:18 AM, Lewis John Mcgibbney wrote: Hi Guys, I thought best to keep this conversation on this thread but to rename it. Renato and myself are going to put in a submission for a GSoC project which seeks to achieve the following: Provide a gora-amazondynamodb (shorter name suggestions please :0)) module for Gora. Chris provided a link to the ASL'ed Amazon SDK; in addition it would utilize Apache Whirr for spinning up the Amazon cloud instance. Although we are in early days here, I am keen to push ahead with expanding the scope here to provide more details on implementation; however, it would be great if you guys could chip in so we get a better idea of where to take this project. I think this is really interesting and has potential to really go somewhere so your feedback is greatly appreciated. Thanks for now Lewis On Tue, Feb 7, 2012 at 11:36 PM, Henry Saputra henry.sapu...@gmail.com wrote: Cool and looks like its ASF 2.0 license too =) - Henry On Mon, Feb 6, 2012 at 7:07 AM, Mattmann, Chris A (388J) chris.a.mattm...@jpl.nasa.gov wrote: Hey Lewis, That sounds like a fantastic idea! See here: https://github.com/amazonwebservices/aws-sdk-for-java/ Looks like it's ALv2 licensed as well... Cheers, Chris On Feb 6, 2012, at 6:47 AM, Lewis John Mcgibbney wrote: Hi Chris, Something which got me thinking was if Gora provided a module for Amazon DynamoDB [1]. Does this sound like a project suited to GSoC? Ta Lewis [1] http://aws.amazon.com/dynamodb/ On Mon, Feb 6, 2012 at 4:29 AM, Mattmann, Chris A (388J) chris.a.mattm...@jpl.nasa.gov wrote: Guys FYI. If anyone is interested in getting a GSoC student for Gora, the info is below. Feel free to reach out to dev@community.a.o for further questions. 
Cheers, Chris Begin forwarded message: From: Ross Gardler rgard...@opendirective.com Date: February 5, 2012 1:45:18 PM PST To: d...@community.apache.org d...@community.apache.org Subject: RE: [Announce] Google Summer of Code 2012 Reply-To: d...@community.apache.org d...@community.apache.org For those new to GSoC you might want to review the roles defined at http://community.apache.org/mentoringprogramme.html and the GSoC specific info at http://community.apache.org/gsoc.html (yet to be updated for 2012) Sent from my mobile device, please forgive errors and brevity. On Feb 5, 2012 8:31 PM, Franklin, Matthew B. mfrank...@mitre.org wrote: ++ Chris Mattmann, Ph.D. Senior Computer Scientist NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 171-266B, Mailstop: 171-246 Email: chris.a.mattm...@nasa.gov WWW: http://sunset.usc.edu/~mattmann/ ++ Adjunct Assistant Professor, Computer Science Department University of Southern California, Los Angeles, CA 90089 USA ++ -- *Lewis*
Re: Going emeritus
+1 to simply keeping it as it is, and not distinguishing between Emeritus and other. We are all a community here and I'm sure Julien will be around... Cheers, Chris On Feb 11, 2012, at 8:46 AM, DigitalPebble wrote: Hi Lewis Thanks for being gen up about this Julien. I'm just regretful I wasn't around when you were rocking the boat in the beginning... Hmm, to be honest Dogacan and Enis did all the hard work :-) I'll get in touch with board as suggested. Saw that, thanks Can you please comment if you wish me to add an emeritus section to the credits page? If not then I'll leave you in where you are :0) We could have a former committers section - up to you Thanks Julien Thanks Lewis On Thu, Feb 9, 2012 at 3:55 PM, Mattmann, Chris A (388J) chris.a.mattm...@jpl.nasa.gov wrote: Thanks for your awesomeness Julien! Lewis: note, emeritus is largely a status of choice by an existing committee member at Apache. It simply requires you to send an [ACK] request to the board, but you simply leave Julien in the committee roster, and if he desires to become active again, we can let the board know but that's pretty much it. Cheers, Chris On Feb 9, 2012, at 2:17 AM, Julien Nioche wrote: Hi guys, The subject says it all. I can't be considered an active committer in GORA and don't think I will be able to contribute any time in the foreseeable future. I also believe that the list of committers of a project should reflect the actual number of people actively involved. Of course if I do contribute again in the future you can always reinvite me for committership ;-) I was involved in GORA pretty much from day one when we discussed it with Enis and Dogacan as part of Nutch 2.0 and helped push it towards incubation. The project is now TLP, is slowly getting a larger audience and a good committer base and I am confident that the people on the PMC will steer it in the right direction. Thanks! 
Julien -- Open Source Solutions for Text Engineering http://digitalpebble.blogspot.com/ http://www.digitalpebble.com http://twitter.com/digitalpebble
Re: Gora monthly report
Great work Lewis, +1 from me and saw it fly by on board@. Cheers, Chris On Feb 2, 2012, at 3:58 AM, Lewis John Mcgibbney wrote: Ok I'm committing this in the next half hour. Thanks for feedback. Lewis On Wed, Feb 1, 2012 at 6:24 PM, Enis Söztutar enis@gmail.com wrote: Great writeup. +1 for the report. On Tue, Jan 31, 2012 at 5:42 AM, Lewis John Mcgibbney lewis.mcgibb...@gmail.com wrote: Hi Guys, OK so a couple of things here. 1) Please use svn switch http://svn.apache.org/repos/asf/gora/trunk/ to switch your trunk workspace to the new TLP svn repos. 2) I've added ALL existing PMC members and committers to the PMC and Unix group on people.apache.org except the following: Keith - Waiting to hear back from Board@ as this is a fundamental change to the PMC, I need to comply with the specified rules [1]. Hopefully this won't be too long so please nudge me if I don't get back to you shortly. In addition can you provide your ASF UID as well please. [1] http://apache.org/dev/pmc.html#newpmc OK now on with the report. The Board are meeting on 15th Feb, so we've got to have it submitted some weeks beforehand, therefore I'll leave this thread open for a week or so before committing and sending it off through the official channels. Please comment accordingly. Thanks guys. Apache Gora Project Description Apache Gora is an open source framework providing an in memory data model and persistence for big data. Gora supports persisting to column stores, key value stores, document stores and RDBMSs, and analyzing the data with extensive Hadoop MapReduce support. (This new project description comes from Enis' comments on gora-dev@, how does it sound? No-one ever got back to him) Project Releases The last official project release was made on 24/09/2011 which was the 0.1.1-incubating release (2nd whilst in the Incubator). Discussion has surfaced on the project development strategy for spinning a 0.2 RC within the next month or so. 
In addition to the numerous improvements made between 0.1.1-incubating and 0.2 we plan to implement two additional data store modules, namely gora-solr and gora-accumulo. The addition of these modules not only drives the project towards its extended project description and aspirations, but will also ensure that Apache Gora is extended to two diverse and active communities within the ASF. Overall Project Activity since last report Since reporting we have identified several improvements to be made to the codebase. These issues are actively being worked on, however the task of migrating all Gora infrastructure to TLP status is currently taking precedence. In terms of project activity, over the last month, our mailing lists have witnessed sustainable levels of activity with noticeably higher volumes of traffic. Daniel Sharaf has done an excellent job of migrating the overwhelming majority of the incubator infrastructure over to TLP configuration. How has the community developed since the last report? Having just been accepted into the ASF as a TLP, the Gora PMC is pleased to announce that having VOTE'd we have successfully attracted one new committer and PMC member, Keith Turner. In addition we've become aware of some other more discreet members of the community chipping in. This is promising, as members from communities as far afield as Apache James are actively monitoring the Gora development lists. Additionally we have connected with members of the JOOQ community regarding a possible re-write of the gora-sql module, as well as with members of the Hector Developers community. A formal announcement was made on 15/01/2012 (cross posted to the Hector user lists) which states continued and committed support for the Hector client API within the gora-cassandra module. This was followed by members of the Hector development team formally stating on the Gora dev lists that they are happy to work collaboratively to make both Hector and Gora better projects. 
Changes to PMC Committers The Gora PMC is pleased to announce that Keith Turner was VOTE'd in as PMC member and Committer. Hopefully by the time this report is presented to the Board, Keith will be on board and able to actively maintain and improve the gora-accumulo module, also opening Gora up to the Accumulo community. PMC and Committer diversity We currently have committers from a wide variety of projects including Nutch, Tika, OODT, Camel and Solr, to name a few. Once Keith's credentials are processed we can add Accumulo to this list. This list is not exhaustive. Project Branding or Naming issues As we migrate the website to the TLP infrastructure we have created an issue on the Gora Jira to ensure that all ASF branding and naming guidelines are implemented fully and that Apache Gora is in compliance. Legal issues We currently have an issue with one particular library being used within the gora-sql store. It is the