Re: Synchronizing Webapp on Tomcat with Solr instance

2009-12-12 Thread Israel Ekpo
Hi,

I think this is a question for the solr-u...@lucene.apache.org  mailing
list.

The solr-dev mailing list is mostly for communication that has to do with
Solr internals and development of Solr itself.

On Sat, Dec 12, 2009 at 12:34 AM, insaneyogi3008 insaney...@gmail.com wrote:


 Hello,

 I have a webapp on Tomcat that displays search results to the end users. I
 have a Solr instance on the same Linux box that has data indexed.

 The webapp wraps an instance of CommonsHttpSolrServer, presumably to send
 HTTP requests to the Solr instance. Now, if Solr has indexed documents, how
 does it return the document(s) that satisfied the search query?

 I am a little confused about how this communication between Tomcat and Solr
 takes place.
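
 A rough sketch of the exchange being asked about, written in Python rather
 than SolrJ to keep it short. It assumes Solr listens at
 http://localhost:8983/solr/select and is asked for JSON output (wt=json).
 The host, port, field name and sample response body are all illustrative
 assumptions, not details from the original post. The client sends an HTTP
 GET with the query as URL parameters, and Solr answers with a document
 whose "response" section lists the matching documents.

```python
import json
from urllib.parse import urlencode

# Assumed location of the Solr instance; adjust host, port and path as needed.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_select_url(query, rows=10, base_url=SOLR_SELECT_URL):
    """Return the URL of the HTTP GET request that asks Solr for matches."""
    params = {"q": query, "rows": rows, "wt": "json"}
    return base_url + "?" + urlencode(params)


def extract_docs(response_body):
    """Pull the list of matching documents out of a JSON response body."""
    data = json.loads(response_body)
    return data["response"]["docs"]


# Hypothetical response body, shaped like Solr's JSON output.
sample_body = (
    '{"responseHeader": {"status": 0},'
    ' "response": {"numFound": 1, "start": 0,'
    ' "docs": [{"id": "doc1", "name": "example"}]}}'
)

url = build_select_url("name:example")
docs = extract_docs(sample_body)
print(url)
print(docs)
```

 A client library such as SolrJ wraps these two steps (build the request,
 parse the response) behind an object API; the webapp never reads the index
 files directly.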




 --
 View this message in context:
 http://old.nabble.com/Synchronizing-Webapp-on-Tomcat-with-Solr-instance-tp26754143p26754143.html
 Sent from the Solr - Dev mailing list archive at Nabble.com.




-- 
Good Enough is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: release announcement draft

2009-11-01 Thread Israel Ekpo
On Sun, Nov 1, 2009 at 11:56 AM, Yonik Seeley yo...@lucidimagination.com wrote:

 OK, here's another shot that adds a second paragraph that describes a
 little more the form that Solr takes.

 Again, I'm thinking of reusing the first two big descriptive
 paragraphs (starting with "Solr is the popular") on Solr's home page,
 as its main description.

 -Yonik
 http://www.lucidimagination.com

 -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT --
 Apache Solr 1.4 has been released and is now available for public download!
 http://www.apache.org/dyn/closer.cgi/lucene/solr/

 Solr is the popular, blazing fast open source enterprise search
 platform from the Apache Lucene project.  Its major features include
 powerful full-text search, hit highlighting, faceted search, dynamic
 clustering, database integration, and rich document (e.g., Word, PDF)
 handling.  Solr is highly scalable, providing distributed search and
 index replication, and it powers the search and navigation features of
 many of the world's largest internet sites.

 Solr is written in Java and runs as a standalone full-text search server
 within a servlet container such as Tomcat.
 Solr uses Lucene at it's core for indexing and full-text search, and has
 REST-like HTTP/XML and JSON APIs that make it easy to use from virtually
 any programming language.  Solr's powerful external configuration allow it
 to be tailored to almost any type of application without Java coding, however
 it has an extensive plugin architecture when more advanced
 customization is required.


 New Solr 1.4 features include
  - Major performance enhancements in indexing, searching, and faceting
  - Revamped all-Java index replication that's simple to configure and
 can replicate config files
  - Greatly improved database integration via the DataImportHandler
  - Rich document processing (Word, PDF, HTML) via Apache Tika
  - Dynamic search results clustering via Carrot2
  - Multi-select faceting (support for multiple items in a single
 category to be selected)
  - Many powerful query enhancements, including ranges over arbitrary
 functions, nested queries of different syntaxes
  - Many other plugins including Terms for auto-suggest, Statistics,
 TermVectors, Deduplication

 -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT --



Hi Yonik,

I love this new introduction. In my opinion, it contains most of the facts
that someone hearing about Solr for the very first time needs to know about
what it is and what it can do. The distinction is clearer now, I believe.

I made slight changes (just punctuation and grammar) to your original text
and I have enclosed it between --BEGIN-- and --END-- below.

I made the following changes:

1A: Solr uses Lucene at it's core for indexing
1B: Solr uses Lucene at its core for indexing

2A: JSON APIs that make it easy to use from virtually any programming
language
2B: JSON APIs that makes it easy to use from virtually any programming
language

3A: Solr's powerful external configuration allow it to be tailored to almost
any type of application
3B: Solr's powerful external configuration allows it to be tailored to
almost any type of application

Added a semi-colon to indicate the complete pause
4A: without Java coding, however it has an extensive plugin architecture
when more advanced customization is required.
4B: without Java coding; however it has an extensive plugin architecture
when more advanced customization is required.

--BEGIN--
Solr is written in Java and runs as a standalone full-text search server
within a servlet container such as Tomcat. Solr uses Lucene at its core for
indexing and full-text search, and has REST-like HTTP/XML and JSON APIs that
makes it easy to use from virtually any programming language.  Solr's
powerful external configuration allows it to be tailored to almost any type
of application without Java coding; however it has an extensive plugin
architecture when more advanced customization is required.
--END--


Re: release announcement draft

2009-11-01 Thread Israel Ekpo
On Sun, Nov 1, 2009 at 9:08 PM, Yonik Seeley yo...@lucidimagination.com wrote:

 Thanks for the review!  I already made some additional changes (I had
 forgotten one of the suggestions by Chris earlier too).  The new draft is
 at the end; I've already checked this into Subversion.  It's not live
 though, so further tweaks and cleanups can still be made.

 On that note: I'll be traveling tomorrow and Tuesday... probably
 mostly out of touch, so others may need to handle further edits.

 On Sun, Nov 1, 2009 at 8:55 PM, Israel Ekpo israele...@gmail.com wrote:
  1A: Solr uses Lucene at it's core for indexing
  1B: Solr uses Lucene at its core for indexing

 Will do.

  2A: JSON APIs that make it easy to use from virtually any programming
  language
  2B: JSON APIs that makes it easy to use from virtually any programming
  language

 Is that change right?  I'm no English major ;-)

 A makes foo easy
 A and B make foo easy
 Multiple A's make foo easy

  3A: Solr's powerful external configuration allow it to be tailored to
  almost any type of application
  3B: Solr's powerful external configuration allows it to be tailored to
  almost any type of application

 Hmm, OK.

  Added a semi-colon to indicate the complete pause
  4A: without Java coding, however it has an extensive plugin architecture
  when more advanced customization is required.
  4B: without Java coding; however it has an extensive plugin architecture
  when more advanced customization is required.

 I had already changed this to use "and" instead of "however"... does
 that change things?
 What sounds best?

 -Yonik
 http://www.lucidimagination.com

 -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT --
 Apache Solr 1.4 has been released and is now available for public download!
 http://www.apache.org/dyn/closer.cgi/lucene/solr/

 Solr is the popular, blazing fast open source enterprise search
 platform from the Apache Lucene project.  Its major features include
 powerful full-text search, hit highlighting, faceted search, dynamic
 clustering, database integration, and rich document (e.g., Word, PDF)
 handling.  Solr is highly scalable, providing distributed search and
 index replication, and it powers the search and navigation features of
 many of the world's largest internet sites.

 Solr is written in Java and runs as a standalone full-text search server
 within a servlet container such as Tomcat.  Solr uses the Lucene Java
 search library at it's core for full-text indexing and search, and has
 REST-like HTTP/XML and JSON APIs that make it easy to use from virtually
 any programming language.  Solr's powerful external configuration allow it
 to be tailored to almost any type of application without Java coding, and
 it has an extensive plugin architecture when more advanced
 customization is required.


 New Solr 1.4 features include
  - Major performance enhancements in indexing, searching, and faceting
  - Revamped all-Java index replication that's simple to configure and
 can replicate config files
  - Greatly improved database integration via the DataImportHandler
  - Rich document processing (Word, PDF, HTML) via Apache Tika
  - Dynamic search results clustering via Carrot2
  - Multi-select faceting (support for multiple items in a single
 category to be selected)
  - Many powerful query enhancements, including ranges over arbitrary
 functions, and nested queries of different syntaxes
  - Many other plugins including Terms for auto-suggest, Statistics,
 TermVectors, Deduplication
 -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT --



About 2B, I am not an English major either :) but I thought that since "it"
was referring to Solr (singular), we should use "makes" instead of "make".
I think the original is OK, though, since I am not 100% sure.

In 3B, I was using the same logic too.

About 4B, I prefer the second version with "and" instead of "however";
the second version sounds better.

In summary, you guys have done a very good job! I am looking forward to the
official release.

Safe journey on your trip.


Wiki Update Recommendation : QueryParametersIndex, FrontPage

2009-10-19 Thread Israel Ekpo
Hi Guys,

I was thinking about adding new items to this page [
http://wiki.apache.org/solr/QueryParametersIndex ]

These are the components whose parameters I want to add:

http://wiki.apache.org/solr/SpellCheckComponent

http://wiki.apache.org/solr/TermsComponent

http://wiki.apache.org/solr/StatsComponent

They seem kind of buried in the wiki. What do you think?

Also, is this the right page for these, or should they be listed elsewhere?

Should they be listed here instead http://wiki.apache.org/solr/FrontPage ?
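
For anyone unfamiliar with these components, here is a rough sketch of the
kind of request parameters each one takes, which is what an index page would
need to list. The field names ("name", "price") and the query text are
made-up illustrations, and the sketch assumes each component is already
configured on the request handler being queried.

```python
from urllib.parse import urlencode

# Hypothetical field names and query text, for illustration only.
COMPONENT_PARAMS = {
    "SpellCheckComponent": {
        "q": "solr serch",
        "spellcheck": "true",
        "spellcheck.count": 5,
    },
    "TermsComponent": {
        "terms": "true",
        "terms.fl": "name",
        "terms.prefix": "so",
    },
    "StatsComponent": {
        "q": "*:*",
        "stats": "true",
        "stats.field": "price",
    },
}


def query_string(component):
    """Return the URL query string for one component's sample parameters."""
    return urlencode(COMPONENT_PARAMS[component])


for name in COMPONENT_PARAMS:
    print(name, "->", query_string(name))
```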



Re: 1.4.0 RC

2009-10-13 Thread Israel Ekpo
This is awesome!

Thanks to all who made this possible.

On Tue, Oct 13, 2009 at 4:56 PM, Grant Ingersoll gsing...@apache.org wrote:

 http://people.apache.org/~gsingers/solr/1.4.0-RC/

 I have not signed these artifacts yet.  I have to generate a stronger key
 b/c mine is only 2048.  Am working through that now.






Re: Down to Zero!

2009-10-12 Thread Israel Ekpo
This is Awesome!

Thanks to All who made this possible.

Grant, should we be getting ready for the 1.4 release party?

On Mon, Oct 12, 2009 at 12:46 PM, Ryan McKinley ryan...@gmail.com wrote:

 Looks like we are down to zero issues:

 https://issues.apache.org/jira/secure/BrowseVersion.jspa?id=12310230&versionId=12313351&showOpenIssuesOnly=true







Solr 1.4 Release Party

2009-10-10 Thread Israel Ekpo
I can't wait...



Re: 8 for 1.4

2009-10-06 Thread Israel Ekpo
On Tue, Oct 6, 2009 at 11:22 PM, James McKinney ja...@evolvingweb.ca wrote:

 Hi all, I only just now discovered this thread on the future of
 SolrJS. I'm one of the Drupal developers that worked on the SolrJS
 fork that we call AJAX Solr. The code is up at
 http://github.com/evolvingweb/AJAX-Solr

 Before I address some of the concerns that came up earlier in the
 thread, I'd just like to thank Matthias and any contributors for their
 work on SolrJS, which provided an excellent API and framework in which
 to add more features and improve existing ones.

 Now: why did we fork SolrJS instead of patching? A fork was preferable
 for two reasons. (1) we would have had to make a lot of patches and
 (2) we needed the code to be licensed under GPL in order to publish it
 on drupal.org, which is the one and only place Drupal developers go to
 share code.

 As to (1), it is entirely possible some of our patches would not have
 been accepted, even though they are useful to us, so it would have
 probably been inevitable that we start a fork. As we were hacking
 SolrJS for one of our clients, not for its own right, we also got
 quite far from the original code, so writing atomic patches would have
 taken months (we don't work full-time on this).

 As to (2), AJAX Solr is currently tri-licensed as GPL v2, ASL v2, and
 MIT. We don't really care about the license. It's GPL v2 for
 drupal.org, ASL v2 for consistency with SolrJS, and MIT for
 consistency with Ruby on Rails, for which we hope to one day release a
 plugin, as we have done for Drupal (the Drupal module will be released
 this weekend). If it were up to me, I'd just make the code public
 domain. I'm not religious about licences.

 We set up the GitHub account so that people could contribute code
 there, under all three licenses, instead of contributing code at
 drupal.org (only GPL), or apache.org (only ASL?). I will not accept a
 patch unless I can release it under those three licenses. I think this
 will avoid any licensing issues and address the licensing concerns I
 read earlier in the thread.

 RE: Shalin Shekhar Mangar: "Are they exposing their Solr servers to
 the public so that it can be accessed directly through Javascript?" In
 our Drupal module, we provide the option to either expose the Solr
 server to the public (not recommended), or to proxy requests through
 Drupal (recommended) or even a custom proxy. Our Drupal proxy filters
 the request prepared by the JavaScript, returning only those fields
 that the administrator set as publicly accessible, and limiting the
 number of rows returned to the administrator-set maximum.

 Also, as to the name, a few things: one of our developers, when we
 were still thinking of patching SolrJS, created a module on drupal.org
 called solrjs. As we won't be using SolrJS, we will rename that
 appropriately. If anyone objects to the name "AJAX Solr" please let me
 know. I don't think it's a problem to have "Solr" in the name; it
 would be terribly confusing if it didn't.

 Thanks,

 James

 P.S.: Since I just joined the list, I didn't know how to reply to the
 thread with all the thread history attached. Sorry if this causes
 problems with the mailing list.



Hi James,

You almost gave me a heart attack by using the subject line "Re: 8 for 1.4".
I remember checking a few minutes ago and there were just 4 issues remaining.

I can't wait for 1.4 to be released officially.

Maybe "Re: 4 for 1.4" would be more appropriate. :)



Re: Down to 5

2009-10-04 Thread Israel Ekpo
On Sun, Oct 4, 2009 at 9:51 PM, Chris Hostetter hossman_luc...@fucit.org wrote:


 : Subject: Down to 5

 I reopened SOLR-1448 because it seems perfectly reasonable to me ..
 looking for a reply.

 I'm also just waiting on someone else to test/review SOLR-1449 ...
 otherwise it should be pushed to 1.5.



 -Hoss



Hi Grant,

I was just looking at the documentation bug SOLR-1483 and I found the
following comments in the schema file:

"Numeric field types that index each value at various levels of precision to
accelerate range queries when the number of values between the range
endpoints is large."

"Smaller precisionStep values (specified in bits) will lead to more tokens
indexed per value, slightly larger index size, and faster range queries."

It also states that for faster range queries, one should consider the
tint/tfloat/tlong/tdouble types.

Now, the tint/tfloat/tlong/tdouble types have a precisionStep of 8, while the
int/float/long/double types have a precisionStep of 0.

From these comments, it seems like the int/float/long/double types, with their
smaller precisionStep values, should lead to more tokens indexed per value,
slightly larger index size, and faster range queries.

So maybe we should recommend the int/float/long/double types over the
tint/tfloat/tlong/tdouble types for faster range queries.
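
One thing worth checking before rewriting the comments: the sketch below
estimates how many terms get indexed per value under two assumptions that
are not stated in the schema comments quoted above. The first is that a trie
field indexes one term per precision level, i.e. ceil(bits / precisionStep).
The second is that a precisionStep of 0 is a special value meaning "no extra
precision levels" rather than the smallest possible step. If the second
assumption holds, the plain int/float/long/double types would index the
fewest terms, not the most, and the recommendation above would need to be
reversed. The function name is made up for illustration.

```python
import math


def terms_per_value(bits, precision_step):
    """Estimate the number of terms a trie numeric field indexes per value.

    Assumption: one term per precision level, so ceil(bits / precision_step).
    Assumption: a step of 0 (or one >= bits) leaves only the full-precision
    term, so a single term is indexed.
    """
    if precision_step <= 0 or precision_step >= bits:
        return 1
    return math.ceil(bits / precision_step)


# 32-bit values (int/float style fields) at several precisionStep settings.
for step in (0, 4, 8, 16):
    print("precisionStep", step, "->", terms_per_value(32, step), "term(s)")
```

Under these assumptions a 32-bit field with precisionStep 8 indexes 4 terms
per value and one with precisionStep 4 indexes 8, which matches the "smaller
step, more tokens" comment for non-zero steps.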

If all we need to do is to rewrite the documentation, I can come up with a
re-write of the comments in the schema file and submit the patch so that
this issue can be closed.

So if you want to assign this one to me, that would be fine too.



Announcing the Apache Solr extension in PHP - 0.9.0

2009-10-04 Thread Israel Ekpo
Fellow Apache Solr users,

I have been working on a PHP extension for Apache Solr in C for quite
some time now.

I just finished testing it and I have completed the initial user-level
documentation of the API.

Version 0.9.0-beta has just been released.

It already has built-in readiness for Solr 1.4.

If you are using Solr 1.3 or later in PHP, I would appreciate it if you could
check it out and give me some feedback.

It is very easy to install on UNIX systems. I am still working on the build
for Windows, which should be available soon.

http://solr.israelekpo.com/manual/en/solr.installation.php

Some of the features of the API include:
- Built-in serialization of Solr Parameter objects.
- Reuse of HTTP connections across repeated requests.
- Ability to obtain input documents for possible resubmission from query
responses.
- Simplified interface to access server response data (SolrObject).
- Ability to connect to Solr server instances secured behind HTTP
Authentication and proxy servers.

The following components are also supported:
- Facets
- MoreLikeThis
- TermsComponent
- Stats
- Highlighting

Solr PECL Extension Homepage
http://pecl.php.net/package/solr

Some examples are available here
http://solr.israelekpo.com/manual/en/solr.examples.php

Interim Documentation Page until refresh of official PHP documentation
http://solr.israelekpo.com/manual/en/book.solr.php

The C source is available here
http://svn.php.net/viewvc/pecl/solr/



Re: dist download size

2009-09-19 Thread Israel Ekpo
As you suggested, I think the javadoc should be removed and provided as a
separate download.

Developers and end users can also be pointed to the online documentation for
reference, and to where the downloadable javadoc can be found.

That could help.

On Sat, Sep 19, 2009 at 10:12 AM, Yonik Seeley
yo...@lucidimagination.com wrote:

 Our download is currently a 150MB zip file, and 233MB on disk!

 $ du -s *
 96      CHANGES.txt
 80      LICENSE.txt
 12      NOTICE.txt
 8       README.txt
 36      build.xml
 1857    client
 20      common-build.xml
 40554   contrib
 48036   dist
 55620   docs
 76214   example
 3200    lib
 7554    src

 Hoss listed a bunch of the issues (jars in dist that didn't need to be
 there)
 https://issues.apache.org/jira/browse/SOLR-1433

 --- solr cell ---
 The example contains the solr cell libs in both the DIH example and
 the normal example.
 At a minimum, the DIH example shouldn't include them.
 I think it's still up in the air if the normal example should include them.

 --- javadoc ---
 55MB of Javadoc!  That's a bit heavy... esp when I imagine almost no
 one uses it.
 Should we create a compressed doc jar for this instead?
 Point to an online version instead?
 Note that dist also contains broken-out javadoc jars.
 $ du -s dist/*docs*
 60      dist/apache-solr-cell-docs-1.4-dev.jar
 68      dist/apache-solr-clustering-docs-1.4-dev.jar
 2212    dist/apache-solr-core-docs-1.4-dev.jar
 300     dist/apache-solr-dataimporthandler-docs-1.4-dev.jar
 680     dist/apache-solr-solrj-docs-1.4-dev.jar

 -Yonik
 http://www.lucidimagination.com






Spec Version vs Implementation Version

2009-09-11 Thread Israel Ekpo
What are the differences between the specification version and the
implementation version?

I downloaded the nightly build for September 05 2009 and it has a spec
version of 1.3, while the implementation version states 1.4-dev.

What does that mean?

