GData
How hard would it be to build a GData server using Solr? An open-source, Lucene-based GData server would be a good thing to have. Does this fit in Solr, or should it be a separate project? http://code.google.com/apis/gdata/overview.html Another summer of code project? Doug
Re: GData
It would require some work to add this to Solr, but not a huge effort. One of the most crucial missing pieces that I'm beginning to feel a strong need for is being able to update a single field in a Lucene index. I notice the GData protocol supports this: http://code.google.com/apis/gdata/protocol.html#Updating-an-entry So to turn it around to ask you a question, what would it take to allow a Lucene document to be updatable at the field granularity, such that no other fields need to be specified again? The idea of using HTTP 1.1 PUT/DELETE methods has been discussed for Solr before, and I think it'd be a great idea to support Atom and GData, and perhaps even legacy RSS. Currently Solr's request and response handling are pretty intertwined with the rest of the system, and some decoupling needs to take place to facilitate pluggability in the external interfaces. Nothing too awfully difficult, I think, but not something that is currently possible out of the box. Erik On Apr 20, 2006, at 11:59 AM, Doug Cutting wrote: How hard would it be to build a GData server using Solr? An open-source, Lucene-based GData server would be a good thing to have. Does this fit in Solr, or should it be a separate project? http://code.google.com/apis/gdata/overview.html Another summer of code project? Doug
Re: GData
On 4/20/06, Erik Hatcher [EMAIL PROTECTED] wrote: So to turn it around to ask you a question, what would it take to allow a Lucene document to be updatable at the field granularity, such that no other fields need to be specified again? That sounds like quite a job in Lucene... one thing for a stored field, but quite another for indexed fields. Even if you could update things like TermDocs, you don't know what terms are currently pointing to your document. I personally don't see an easy (or remotely practical) way. The easiest way I can think of to get that effect is to store all the fields so you can re-create the Document and change the field being updated. -Yonik
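Yonik's workaround (store every field, rebuild the document, then delete and re-add it) can be sketched with a toy in-memory index. This is only an illustration of the idea, not Lucene's API: `TinyIndex`, its methods, and the whitespace tokenizer are all hypothetical names made up for this sketch.

```python
from collections import defaultdict


class TinyIndex:
    """Toy inverted index; a stand-in for a Lucene index, not Lucene's API."""

    def __init__(self):
        self.postings = defaultdict(set)  # (field, term) -> ids of matching docs
        self.stored = {}                  # doc id -> {field: original text}

    def add(self, doc_id, fields):
        # Keep the original text of every field so the doc can be rebuilt later.
        self.stored[doc_id] = dict(fields)
        for field, text in fields.items():
            for term in text.lower().split():
                self.postings[(field, term)].add(doc_id)

    def delete(self, doc_id):
        # In this toy model the stored copy tells us which postings to clean up.
        fields = self.stored.pop(doc_id)
        for field, text in fields.items():
            for term in text.lower().split():
                self.postings[(field, term)].discard(doc_id)

    def update_field(self, doc_id, field, new_text):
        # Re-create the whole document from its stored fields, change one
        # field, then delete the old version and add the new one.
        fields = dict(self.stored[doc_id])
        fields[field] = new_text
        self.delete(doc_id)
        self.add(doc_id, fields)

    def search(self, field, term):
        return sorted(self.postings.get((field, term.lower()), set()))


index = TinyIndex()
index.add("1", {"title": "GData server", "body": "built on Lucene"})
index.update_field("1", "title", "Atom server")
print(index.search("title", "gdata"))  # the old title term no longer matches
print(index.search("title", "atom"))
print(index.search("body", "lucene"))  # the untouched field survives the update
```

The caller only supplies the changed field; the cost is that every field has to be stored, which is the trade-off Yonik describes.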
Re: GData
: The easiest way I can think of to get that effect is to store all the : fields so you can re-create the Document and change the field being : updated. My brief reading of the GData URL Doug sent suggests that the overall theme is content storage -- if that's the goal, mandating that modify operations require all fields be stored wouldn't sacrifice functionality. As far as the output format -- it seems to me that just like A9's OpenSearch this could probably be done entirely with an XSLT. The input format is the trickier part ... that's where we'd definitely need a more pluggable parser for dealing with incoming data ... but we're going to need that anyway if we want to support posting CSV files and things like that. -Hoss
[jira] Commented: (SOLR-3) create test harness and port TestApp to junit
[ http://issues.apache.org/jira/browse/SOLR-3?page=comments#action_12375416 ] Yonik Seeley commented on SOLR-3: Yes, I suppose SolrTest can be removed (nostalgic sniff)... create test harness and port TestApp to junit - Key: SOLR-3 URL: http://issues.apache.org/jira/browse/SOLR-3 Project: Solr Type: Task Reporter: Hoss Man Assignee: Hoss Man Priority: Minor Attachments: SOLR-3.zip, TestBasicFunctionality.java, TestHarness.java To both encourage good internal development, and to make it easy for plugin developers to write unit tests of their own code, I think we need a harness that makes it easy to unit test updates and queries against Solr (without needing a servlet container). Once we have this, I think we can/should also retire TestApp in favor of some JUnit tests (which would probably make more sense for other developers). I've already started on this; I thought I'd have something to commit tonight, but I got distracted ... filing this bug as a tracker for now.
Re: summer of code: solr for apache mail archives
On 4/19/06, Doug Cutting [EMAIL PROTECTED] wrote: http://wiki.apache.org/general/SummerOfCode2006 I'd be happy to co-mentor this with someone from Solr. OK, I added solr-mail-archive to the Wiki (pretty much copied your description) -Yonik
Re: summer of code: solr for apache mail archives
If you could get it so that it interfaces with mod-mbox (what they currently use) that would be a better solution for the ASF infrastructure I think. On 4/21/06, Yonik Seeley [EMAIL PROTECTED] wrote: On 4/19/06, Doug Cutting [EMAIL PROTECTED] wrote: http://wiki.apache.org/general/SummerOfCode2006 I'd be happy to co-mentor this with someone from Solr. OK, I added solr-mail-archive to the Wiki (pretty much copied your description) -Yonik -- [EMAIL PROTECTED] -- blog: http://feh.holsman.net/ -- PH: ++61-3-9818-0132 If everything seems under control, you're not going fast enough. - Mario Andretti
Re: GData
Yoav Shapira wrote: Getting back to Doug's original point about this as a possible SoC project: it seems a little too big from the technical discussion so far. It might actually be a simpler project if it were standalone: not built into Solr, but rather a Lucene contrib project. One only has to write a few servlets that translate each request into Lucene events: add, delete, delete+add, or query. It wouldn't have lots of Solr's fancy features (faceted searching, replication, etc.) but could still be a very useful thing. Do folks think that would be a tractable SoC project? Doug
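The translation Doug describes can be sketched as a small dispatch function. The pairing of HTTP methods with index events follows the GData/Atom convention (GET to read, POST to insert, PUT to update, DELETE to remove); the function name `plan_index_ops` and its return format are hypothetical, and a real servlet would call into Lucene instead of returning strings.

```python
def plan_index_ops(method, entry_id=None):
    """Map one HTTP request to the index events named in the thread.

    Hypothetical helper: returns event names instead of touching an index.
    """
    method = method.upper()
    if method == "GET":
        return ["query"]
    if method == "POST":
        return ["add"]
    if method in ("PUT", "DELETE") and entry_id is None:
        # Updating or removing an entry only makes sense for a known entry.
        raise ValueError(f"{method} needs an entry id")
    if method == "PUT":
        # An update is a delete of the old version followed by an add.
        return ["delete", "add"]
    if method == "DELETE":
        return ["delete"]
    raise ValueError(f"unsupported method: {method}")


for verb in ("GET", "POST", "PUT", "DELETE"):
    print(verb, plan_index_ops(verb, entry_id="entry-1"))
```

Keeping the mapping this thin is what makes the standalone version small: everything beyond these four cases (faceting, replication) stays out of scope.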
Re: GData
Hola, It might actually be a simpler project if it were standalone: not built into Solr, but rather a Lucene contrib project. One only has to write a few servlets that translate each request into Lucene events: add, delete, delete+add, or query. It wouldn't have lots of Solr's fancy features (faceted searching, replication, etc.) but could still be a very useful thing. Do folks think that would be a tractable SoC project? Doug Yeah, and a cool one at that, +1. Yoav
Re: summer of code: solr for apache mail archives
Ian Holsman wrote: If you could get it so that it interfaces with mod-mbox (what they currently use) that would be a better solution for the ASF infrastructure I think. I assume that search results would be displayed with mod-mbox, i.e., links in the hit list would be links to mail-archive.a.o. Is that what you mean? Also, in my original message I said: We can set up a notification mechanism for new messages with Apache infrastructure. I now note that mod_mbox provides Atom feeds for each list. So we can just poll those to index new messages. We could generate the current list of feeds by scraping http://mail-archives.apache.org/mod_mbox/. Doug
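The polling idea can be sketched as follows: parse a feed, keep the entries not seen before, and hand those to the indexer. This assumes the feeds use the Atom 1.0 namespace (`http://www.w3.org/2005/Atom`); the sample feed, the `urn:example:` ids, and the `new_entries` helper are made up for illustration, and fetching the feed over HTTP is left out.

```python
import xml.etree.ElementTree as ET

# Assumption: Atom 1.0 namespace; the real mod_mbox feeds may differ.
ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical feed content standing in for one list's feed.
SAMPLE_FEED = """
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>example-list</title>
  <entry><id>urn:example:msg-1</id><title>GData</title></entry>
  <entry><id>urn:example:msg-2</id><title>Re: GData</title></entry>
</feed>
"""


def new_entries(feed_xml, seen_ids):
    """Return entries whose id is not in seen_ids, and record their ids."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for entry in root.iter(ATOM + "entry"):
        entry_id = entry.findtext(ATOM + "id")
        if entry_id is None or entry_id in seen_ids:
            continue  # skip malformed entries and messages already indexed
        seen_ids.add(entry_id)
        fresh.append({
            "id": entry_id,
            "title": entry.findtext(ATOM + "title", default=""),
        })
    return fresh


seen = set()
first_poll = new_entries(SAMPLE_FEED, seen)   # both messages are new
second_poll = new_entries(SAMPLE_FEED, seen)  # nothing new on a repeat poll
print(len(first_poll), len(second_poll))
```

A real poller would also have to persist `seen` between runs and cope with messages that fall off the end of a feed between polls.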
Re: GData
On 4/20/06, Doug Cutting [EMAIL PROTECTED] wrote: It might actually be a simpler project if it were standalone: not built into Solr, but rather a Lucene contrib project. One only has to write a few servlets that translate each request into Lucene events: add, delete, delete+add, or query. At first blush, that's the approach I would take with Solr too (a GData-specific servlet that interfaced to Solr). So I don't see a big difference in difficulty level. It shouldn't be hard to take a straight Lucene-servlet version and adapt it to Solr later, so it would be a benefit regardless. -Yonik
Re: GData
Good. I'll be available on a time-permitting basis as always, but I don't want to commit as a mentor for this, so having you two makes me feel at ease ;) Yoav On 4/20/06, Yonik Seeley [EMAIL PROTECTED] wrote: On 4/20/06, Doug Cutting [EMAIL PROTECTED] wrote: Would you (or someone else) be willing to co-mentor this one with me? I'm travelling the month of July, so I'm hesitant to be the sole mentor. (I'll be online, but at reduced capacity.) If I have a co-mentor, then I'd be happy to write up the proposal. OK, I'm up for it... I don't really know my summer schedule, but I doubt I would be out more than a week at a time. -Yonik -- Yoav Shapira Nimalex LLC 1 Mifflin Place, Suite 310 Cambridge, MA, USA [EMAIL PROTECTED] / www.yoavshapira.com