Re: Synchronizing Webapp on Tomcat with Solr instance
Hi,

I think this is a question for the solr-u...@lucene.apache.org mailing list. The solr-dev mailing list is mostly for communication about Solr internals and the development of Solr itself.

On Sat, Dec 12, 2009 at 12:34 AM, insaneyogi3008 insaney...@gmail.com wrote:

Hello, I have a webapp on Tomcat that displays search results to end users. I have a Solr instance on the same Linux box that has data indexed. The webapp wraps an instance of CommonsHttpSolrServer, presumably to send HTTP requests to the Solr instance. If Solr has indexed documents, how does it return the document(s) that satisfy the search query? I am a little confused about how this communication between Tomcat and Solr takes place.

-- View this message in context: http://old.nabble.com/Synchronizing-Webapp-on-Tomcat-with-Solr-instance-tp26754143p26754143.html Sent from the Solr - Dev mailing list archive at Nabble.com.

-- Good Enough is not good enough. To give anything less than your best is to sacrifice the gift. Quality First. Measure Twice. Cut Once. http://www.israelekpo.com/
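For anyone reading this thread in the archive, here is a minimal sketch of the round trip being asked about: the client sends an HTTP request to Solr's select handler, and the matching documents come back in the body of the HTTP response. It does not use SolrJ and makes no network call. The base URL, the field names, and the canned response body are assumptions for illustration only; the response layout follows Solr's JSON response format (`wt=json`).

```python
import json
from urllib.parse import urlencode


def build_select_url(base_url, query, rows=10):
    """Build the URL a client would request from Solr's select handler."""
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base_url}/select?{params}"


def extract_docs(response_body):
    """Pull the list of matching documents out of a JSON response body."""
    payload = json.loads(response_body)
    return payload["response"]["docs"]


# Hypothetical response body, shaped like Solr's JSON output.
SAMPLE_RESPONSE = (
    '{"responseHeader": {"status": 0, "QTime": 1}, '
    '"response": {"numFound": 1, "start": 0, '
    '"docs": [{"id": "doc1", "name": "example"}]}}'
)

if __name__ == "__main__":
    # Hypothetical base URL for a Solr webapp deployed on Tomcat.
    print(build_select_url("http://localhost:8080/solr", "name:example", rows=5))
    print(extract_docs(SAMPLE_RESPONSE))
```

A client library such as SolrJ does essentially the same two steps, wrapping the request construction and the response parsing behind an object API.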
Re: release announcement draft
On Sun, Nov 1, 2009 at 11:56 AM, Yonik Seeley yo...@lucidimagination.com wrote:

OK, here's another shot that adds a second paragraph that describes a little more the form that Solr takes. Again, I'm thinking of reusing the first two big descriptive paragraphs (starting with "Solr is the popular") on Solr's home page, as its main description.

-Yonik
http://www.lucidimagination.com

-- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT --

Apache Solr 1.4 has been released and is now available for public download!
http://www.apache.org/dyn/closer.cgi/lucene/solr/

Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g., Word, PDF) handling. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world's largest internet sites.

Solr is written in Java and runs as a standalone full-text search server within a servlet container such as Tomcat. Solr uses Lucene at it's core for indexing and full-text search, and has REST-like HTTP/XML and JSON APIs that make it easy to use from virtually any programming language. Solr's powerful external configuration allow it to be tailored to almost any type of application without Java coding, however it has an extensive plugin architecture when more advanced customization is required.
New Solr 1.4 features include
- Major performance enhancements in indexing, searching, and faceting
- Revamped all-Java index replication that's simple to configure and can replicate config files
- Greatly improved database integration via the DataImportHandler
- Rich document processing (Word, PDF, HTML) via Apache Tika
- Dynamic search results clustering via Carrot2
- Multi-select faceting (support for multiple items in a single category to be selected)
- Many powerful query enhancements, including ranges over arbitrary functions, nested queries of different syntaxes
- Many other plugins including Terms for auto-suggest, Statistics, TermVectors, Deduplication

-- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT --

Hi Yonik,

I love this new introduction. In my opinion, it contains most of the facts that someone hearing about Solr for the very first time needs to know about what it is and what it can do. The distinction is clearer now, I believe.

I made slight changes (just punctuation) to your original text and I have enclosed it between --BEGIN-- and --END-- below. I made the following changes:

1A: Solr uses Lucene at it's core for indexing
1B: Solr uses Lucene at its core for indexing

2A: JSON APIs that make it easy to use from virtually any programming language
2B: JSON APIs that makes it easy to use from virtually any programming language

3A: Solr's powerful external configuration allow it to be tailored to almost any type of application
3B: Solr's powerful external configuration allows it to be tailored to almost any type of application

Added a semi-colon to indicate the complete pause
4A: without Java coding, however it has an extensive plugin architecture when more advanced customization is required.
4B: without Java coding; however it has an extensive plugin architecture when more advanced customization is required.
--BEGIN--
Solr is written in Java and runs as a standalone full-text search server within a servlet container such as Tomcat. Solr uses Lucene at its core for indexing and full-text search, and has REST-like HTTP/XML and JSON APIs that makes it easy to use from virtually any programming language. Solr's powerful external configuration allows it to be tailored to almost any type of application without Java coding; however it has an extensive plugin architecture when more advanced customization is required.
--END--
Re: release announcement draft
On Sun, Nov 1, 2009 at 9:08 PM, Yonik Seeley yo...@lucidimagination.com wrote:

Thanks for the review! I already made some additional changes (I had forgotten one of the suggestions by Chris earlier, too). The new draft is at the end; I've already checked this into Subversion. It's not live though, so further tweaks and cleanups can still be made. On that note: I'll be traveling tomorrow and Tuesday... probably mostly out of touch, so others may need to handle further edits.

On Sun, Nov 1, 2009 at 8:55 PM, Israel Ekpo israele...@gmail.com wrote:

1A: Solr uses Lucene at it's core for indexing
1B: Solr uses Lucene at its core for indexing

Will do.

2A: JSON APIs that make it easy to use from virtually any programming language
2B: JSON APIs that makes it easy to use from virtually any programming language

Is that change right? I'm no English major ;-)
A makes foo easy
A and B make foo easy
Multiple A's make foo easy

3A: Solr's powerful external configuration allow it to be tailored to almost any type of application
3B: Solr's powerful external configuration allows it to be tailored to almost any type of application

Hmm, OK.

Added a semi-colon to indicate the complete pause
4A: without Java coding, however it has an extensive plugin architecture when more advanced customization is required.
4B: without Java coding; however it has an extensive plugin architecture when more advanced customization is required.

I had already changed this to use "and" instead of "however"... does that change things? What sounds best?

-Yonik
http://www.lucidimagination.com

-- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT --

Apache Solr 1.4 has been released and is now available for public download!
http://www.apache.org/dyn/closer.cgi/lucene/solr/

Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project.
Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g., Word, PDF) handling. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world's largest internet sites.

Solr is written in Java and runs as a standalone full-text search server within a servlet container such as Tomcat. Solr uses the Lucene Java search library at it's core for full-text indexing and search, and has REST-like HTTP/XML and JSON APIs that make it easy to use from virtually any programming language. Solr's powerful external configuration allow it to be tailored to almost any type of application without Java coding, and it has an extensive plugin architecture when more advanced customization is required.

New Solr 1.4 features include
- Major performance enhancements in indexing, searching, and faceting
- Revamped all-Java index replication that's simple to configure and can replicate config files
- Greatly improved database integration via the DataImportHandler
- Rich document processing (Word, PDF, HTML) via Apache Tika
- Dynamic search results clustering via Carrot2
- Multi-select faceting (support for multiple items in a single category to be selected)
- Many powerful query enhancements, including ranges over arbitrary functions, and nested queries of different syntaxes
- Many other plugins including Terms for auto-suggest, Statistics, TermVectors, Deduplication

-- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT --

About 2B, I am not an English major either :) but I thought that since "it" was referring to Solr (singular) we should use "makes" instead of "make". I think that is OK, though, since I am not 100% sure. In 3B, I was using the same logic too.

About 4B, I prefer the second version with "and" instead of "however"; the second version sounds better.

In summary, you guys have done a very good job!
I am looking forward to the official release. Safe journey during your trip.
Wiki Update Recommendation : QueryParametersIndex, FrontPage
Hi Guys,

I was thinking about adding new items to this page: http://wiki.apache.org/solr/QueryParametersIndex

These are the items I want to add:
http://wiki.apache.org/solr/SpellCheckComponent
http://wiki.apache.org/solr/TermsComponent
http://wiki.apache.org/solr/StatsComponent

They seem kind of buried in the wiki. What do you think? Also, is this the right page for these, or should they be listed elsewhere? Should they be listed at http://wiki.apache.org/solr/FrontPage instead?
Re: 1.4.0 RC
This is awesome! Thanks to all who made this possible.

On Tue, Oct 13, 2009 at 4:56 PM, Grant Ingersoll gsing...@apache.org wrote:

http://people.apache.org/~gsingers/solr/1.4.0-RC/

I have not signed these artifacts yet. I have to generate a stronger key b/c mine is only 2048. Am working through that now.
Re: Down to Zero!
This is awesome! Thanks to all who made this possible. Grant, should we be getting ready for the 1.4 release party?

On Mon, Oct 12, 2009 at 12:46 PM, Ryan McKinley ryan...@gmail.com wrote:

Looks like we are down to zero issues: https://issues.apache.org/jira/secure/BrowseVersion.jspa?id=12310230&versionId=12313351&showOpenIssuesOnly=true
Solr 1.4 Release Party
I can't wait...
Re: 8 for 1.4
On Tue, Oct 6, 2009 at 11:22 PM, James McKinney ja...@evolvingweb.ca wrote:

Hi all,

I only just now discovered this thread on the future of SolrJS. I'm one of the Drupal developers who worked on the SolrJS fork that we call AJAX Solr. The code is up at http://github.com/evolvingweb/AJAX-Solr

Before I address some of the concerns that came up earlier in the thread, I'd just like to thank Matthias and any contributors for their work on SolrJS, which provided an excellent API and framework in which to add more features and improve existing ones.

Now: why did we fork SolrJS instead of patching? A fork was preferable for two reasons: (1) we would have had to make a lot of patches, and (2) we needed the code to be licensed under GPL in order to publish it on drupal.org, which is the one and only place Drupal developers go to share code.

As to (1), it is entirely possible some of our patches would not have been accepted, even though they are useful to us, so it would probably have been inevitable that we start a fork. As we were hacking SolrJS for one of our clients, not for its own sake, we also got quite far from the original code, so writing atomic patches would have taken months (we don't work full-time on this).

As to (2), AJAX Solr is currently tri-licensed as GPL v2, ASL v2, and MIT. We don't really care about the license. It's GPL v2 for drupal.org, ASL v2 for consistency with SolrJS, and MIT for consistency with Ruby on Rails, for which we hope to one day release a plugin, as we have done for Drupal (the Drupal module will be released this weekend). If it were up to me, I'd just make the code public domain. I'm not religious about licences.

We set up the GitHub account so that people could contribute code there, under all three licenses, instead of contributing code at drupal.org (only GPL) or apache.org (only ASL?). I will not accept a patch unless I can release it under those three licenses.
I think this will avoid any licensing issues and address the licensing concerns I read earlier in the thread.

RE: Shalin Shekhar Mangar: "Are they exposing their Solr servers to the public so that it can be accessed directly through Javascript?"

In our Drupal module, we provide the option to either expose the Solr server to the public (not recommended), or to proxy requests through Drupal (recommended) or even a custom proxy. Our Drupal proxy filters the request prepared by the JavaScript, returning only those fields that the administrator set as publicly accessible, and limiting the number of rows returned to the administrator-set maximum.

Also, as to the name, a few things: one of our developers, when we were still thinking of patching SolrJS, created a module on drupal.org called solrjs. As we won't be using SolrJS, we will rename that appropriately. If anyone objects to the name AJAX Solr, please let me know. I don't think it's a problem to have Solr in the name; it would be terribly confusing if it didn't.

Thanks,
James

P.S.: Since I just joined the list, I didn't know how to reply to the thread with all the thread history attached. Sorry if this causes problems with the mailing list.

Hi James,

You almost gave me a heart attack by using the subject line "Re: 8 for 1.4". I remember checking a few minutes ago and there were just 4 issues remaining. I can't wait for 1.4 to be released officially. Maybe "Re: 4 for 1.4" would be more appropriate. :)
Re: Down to 5
On Sun, Oct 4, 2009 at 9:51 PM, Chris Hostetter hossman_luc...@fucit.org wrote:

: Subject: Down to 5

I reopened SOLR-1448 because it seems perfectly reasonable to me .. looking for a reply. I'm also just waiting on someone else to test/review SOLR-1449 ... otherwise it should be pushed to 1.5.

-Hoss

Hi Grant,

I was just looking at the documentation bug SOLR-1483 and I found the following comments in the schema file:

"Numeric field types that index each value at various levels of precision to accelerate range queries when the number of values between the range endpoints is large"

"Smaller precisionStep values (specified in bits) will lead to more tokens indexed per value, slightly larger index size, and faster range queries"

It also states that for faster range queries, consider the tint/tfloat/tlong/tdouble types.

Now, the tint/tfloat/tlong/tdouble types have a precisionStep of 8, while the int/float/long/double types have a precisionStep of 0.

From these comments, it seems like the int/float/long/double types, with their smaller precisionStep values, should lead to more tokens indexed per value, slightly larger index size, and faster range queries. So maybe we should recommend the int/float/long/double types over the tint/tfloat/tlong/tdouble types for faster range queries.

If all we need to do is rewrite the documentation, I can come up with a rewrite of the comments in the schema file and submit the patch so that this issue can be closed. So if you want to assign this one to me, that would be fine too.
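One way to check the reasoning in the message above is to count how many index terms a single numeric value would produce for a given precisionStep. The sketch below assumes the trie scheme indexes one term per shift of 0, precisionStep, 2 x precisionStep, and so on, up to the bit width of the type. It also assumes that a precisionStep of 0 (or one at least as large as the bit width) is treated as "no extra precision levels", meaning a single term per value. That second point is my understanding of how the trie field types behave, not something stated in this thread, so it is worth verifying against the Solr source before rewriting the schema comments. The function name is hypothetical.

```python
import math


def terms_per_value(value_bits, precision_step):
    """Estimate how many terms one numeric value is indexed as.

    Assumption: a non-positive step, or a step at least as large as the
    value's bit width, means only the full-precision term is indexed.
    """
    if precision_step <= 0 or precision_step >= value_bits:
        return 1
    # One term for each shift 0, step, 2*step, ... below value_bits.
    return math.ceil(value_bits / precision_step)


if __name__ == "__main__":
    for bits, step in [(32, 8), (64, 8), (32, 4), (32, 0)]:
        print(f"{bits}-bit value, precisionStep={step}: "
              f"{terms_per_value(bits, step)} term(s)")
```

Under these assumptions, a precisionStep of 8 indexes more terms per value than a precisionStep of 0, so 0 would not count as the "smaller" step in the sense the schema comment intends.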
Announcing the Apache Solr extension in PHP - 0.9.0
Fellow Apache Solr users,

I have been working on a PHP extension for Apache Solr, written in C, for quite some time now. I just finished testing it and I have completed the initial user-level documentation of the API.

Version 0.9.0-beta has just been released. It already has built-in readiness for Solr 1.4.

If you are using Solr 1.3 or later in PHP, I would appreciate it if you could check it out and give me some feedback. It is very easy to install on UNIX systems. I am still working on the build for Windows; it should be available soon.

http://solr.israelekpo.com/manual/en/solr.installation.php

A quick list of some of the features of the API:
- Built-in serialization of Solr Parameter objects.
- Reuse of HTTP connections across repeated requests.
- Ability to obtain input documents for possible resubmission from query responses.
- Simplified interface to access server response data (SolrObject).
- Ability to connect to Solr server instances secured behind HTTP Authentication and proxy servers.

The following components are also supported:
- Facets
- MoreLikeThis
- TermsComponent
- Stats
- Highlighting

Solr PECL Extension Homepage
http://pecl.php.net/package/solr

Some examples are available here
http://solr.israelekpo.com/manual/en/solr.examples.php

Interim Documentation Page until refresh of official PHP documentation
http://solr.israelekpo.com/manual/en/book.solr.php

The C source is available here
http://svn.php.net/viewvc/pecl/solr/
Re: dist download size
As you suggested, I think the javadoc should be removed and provided as a separate download. Developers and end users can also be pointed to the online documentation for reference, and to where the downloadable javadoc can be found. That could help.

On Sat, Sep 19, 2009 at 10:12 AM, Yonik Seeley yo...@lucidimagination.com wrote:

Our download is currently a 150MB zip file, and 233MB on disk!

$ du -s *
96      CHANGES.txt
80      LICENSE.txt
12      NOTICE.txt
8       README.txt
36      build.xml
1857    client
20      common-build.xml
40554   contrib
48036   dist
55620   docs
76214   example
3200    lib
7554    src

Hoss listed a bunch of the issues (jars in dist that didn't need to be there)
https://issues.apache.org/jira/browse/SOLR-1433

- solr cell ---
The example contains the solr cell libs in both the DIH example and the normal example. At a minimum, the DIH example shouldn't include them. I think it's still up in the air if the normal example should include them.

--- javadoc
55MB of Javadoc! That's a bit heavy... esp when I imagine almost no one uses it. Should we create a compressed doc jar for this instead? Point to an online version instead? Note that dist also contains broken-out javadoc jars.

$ du -s dist/*docs*
60      dist/apache-solr-cell-docs-1.4-dev.jar
68      dist/apache-solr-clustering-docs-1.4-dev.jar
2212    dist/apache-solr-core-docs-1.4-dev.jar
300     dist/apache-solr-dataimporthandler-docs-1.4-dev.jar
680     dist/apache-solr-solrj-docs-1.4-dev.jar

-Yonik
http://www.lucidimagination.com
Spec Version vs Implementation Version
What are the differences between the specification version and the implementation version?

I downloaded the nightly build for September 05, 2009, and it has a spec version of 1.3, while the implementation version states 1.4-dev.

What does that mean?