I can confirm Apache got in! :) The slot assignment is not yet clear, however.
And, because mailing people to death is what I do, volunteers for mentoring?

On Thu, Apr 4, 2013 at 9:49 PM, Shannon Quinn <squ...@gatech.edu> wrote:

> According to the GSoC calendar, accepted organizations aren't posted until
> April 8 (Monday), at which point (assuming Apache is accepted...I can't
> imagine it wouldn't be) slots will be doled out internally. This will
> probably take at least a day or two, so probably by middle of next week
> we'll know how many slots Mahout has.
>
> Speaking of which: how do the various subprojects negotiate for slots? Is
> there a central spreadsheet, or an IRC meeting to attend? Or did I miss the
> email detailing this?
>
>
> On 4/4/13 2:43 PM, Dan Filimon wrote:
>
>> Any news on this front? Did we get approved/assigned a slot/anything?
>>
>>
>> On Fri, Mar 29, 2013 at 7:44 PM, Dan Filimon <dangeorge.fili...@gmail.com> wrote:
>>
>>> Ok, updated!
>>>
>>>
>>> On Fri, Mar 29, 2013 at 7:36 PM, Andy Twigg <andy.tw...@gmail.com> wrote:
>>>
>>>> Dan,
>>>>
>>>> I think what you've written is fine (I wanted to edit to remove the
>>>> '?' around random forests but couldn't).
>>>>
>>>> ok?
>>>>
>>>>
>>>> On 29 March 2013 11:14, Dan Filimon <dangeorge.fili...@gmail.com> wrote:
>>>>
>>>>> I added Andy's first suggestion and Ted's suggestion as ideas.
>>>>>
>>>>> Andy, could you flesh out your second suggestion into a project and make an
>>>>> issue please?
>>>>>
>>>>>
>>>>> On Fri, Mar 29, 2013 at 3:53 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:
>>>>>
>>>>>> It should be possible to view a Lucene index as a matrix. This would
>>>>>> require that we standardize on a way to convert documents to rows. There
>>>>>> are many choices, the discussion of which should be deferred to the actual
>>>>>> work on the project, but there are a few obvious constraints:
>>>>>>
>>>>>> a) it should be possible to get the same result as dumping the term vectors
>>>>>> for each document each to a line and converting that result using standard
>>>>>> Mahout methods.
>>>>>>
>>>>>> b) numeric fields ought to work somehow.
>>>>>>
>>>>>> c) if there are multiple text fields that ought to work sensibly as well.
>>>>>> Two options include dumping multiple matrices or to convert the fields
>>>>>> into a single row of a single matrix.
>>>>>>
>>>>>> d) it should be possible to refer back from a row of the matrix to find the
>>>>>> correct document. This might be because we remember the Lucene doc number
>>>>>> or because a field is named as holding a unique id.
>>>>>>
>>>>>> e) named vectors and matrices should be used if plausible.
>>>>>>
>>>>>> On Thu, Mar 28, 2013 at 4:58 PM, Dan Filimon <dangeorge.fili...@gmail.com> wrote:
>>>>>>
>>>>>>> ...
>>>>>>> Ted, could you explain a bit more what you mean by "simplify the connection
>>>>>>> to Lucene for clustering and classification"? It's too vague for an idea
>>>>>>> proposal.
>>>>
>>>> --
>>>> Dr Andy Twigg
>>>> Junior Research Fellow, St Johns College, Oxford
>>>> Room 351, Department of Computer Science
>>>> http://www.cs.ox.ac.uk/people/andy.twigg/
>>>> andy.tw...@cs.ox.ac.uk | +447799647538