Lucene is basically a search library and Solr is a search web application using Lucene.

So, depending on where you want to set your "starting point", you can
definitely do this with Lucene. You might want to consider Solr, though,
which is also based on Lucene, because it provides many features out of
the box:

https://solr.apache.org/features.html
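
To make the "search library" part a bit more concrete: grep scans every file again for each query, whereas a library like Lucene builds an index once and answers queries from that index. Here is a toy sketch of that idea in Python. It is not Lucene's API and leaves out everything that makes Lucene useful in practice (analysis, scoring, on-disk segments); the names TinyIndex and tokenize and the sample documents are made up for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyIndex:
    """Toy inverted index: maps each term to the ids of documents containing it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        # Indexing happens once per document, not once per query.
        for term in tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query):
        """Return the ids of documents that contain every query term."""
        terms = tokenize(query)
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result


# Made-up sample documents standing in for files of a website.
index = TinyIndex()
index.add("index.html", "Welcome to our company website")
index.add("products.html", "Our products: search appliances")
index.add("Search.java", "public class Search { }")

print(sorted(index.search("search")))       # ['Search.java', 'products.html']
print(sorted(index.search("our company")))  # ['index.html']
```

A query only looks up the terms in the dictionary, so its cost does not depend on re-reading the documents. That is roughly the trade-off an index-based search library makes compared to grep.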

Also see

https://cwiki.apache.org/confluence/display/SOLR/FAQ

Regarding crawlers in combination with Solr, see for example

https://cwiki.apache.org/confluence/display/SOLR/SolrEcosystem#SolrEcosystem-CrawlersAndConnectors

or

https://www.octoparse.com/blog/10-best-open-source-web-scraper#
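
As a rough illustration of what such a crawler does before Lucene or Solr gets involved: it fetches a page, extracts the text and the links, and follows the links it has not seen yet. The sketch below uses only the Python standard library and is not taken from any of the tools listed above. To keep it runnable without network access, the "site" is a made-up in-memory dictionary (SITE, on the hypothetical host example.test) and the fetch function is passed in as a parameter.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageParser(HTMLParser):
    """Collects the visible text and the href targets of <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())


# Hypothetical site, held in memory instead of being fetched over HTTP.
SITE = {
    "http://example.test/": '<a href="/about.html">About</a> Home page',
    "http://example.test/about.html": '<a href="/">Home</a> About us',
}


def crawl(start_url, fetch):
    """Breadth-first crawl; returns {url: extracted text} for an indexer."""
    seen = {start_url}
    queue = deque([start_url])
    docs = {}
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # page not available, skip it
            continue
        parser = PageParser()
        parser.feed(html)
        parser.close()
        docs[url] = " ".join(parser.text)
        for href in parser.links:
            target = urljoin(url, href)  # resolve relative links
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return docs


docs = crawl("http://example.test/", SITE.get)
for url, text in docs.items():
    print(url, "->", text)
```

The resulting url-to-text mapping is what you would then hand over to the indexer. A real crawler additionally needs politeness delays, robots.txt handling, error handling and a limit on which hosts to follow.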

Cheers

Michael




On 05.04.21 at 11:59, Som Lima wrote:
Thank you for your reply.
Yes, I would like to provide a search engine for my company website and at
the same time build a web search engine as a personal project.

On Mon, 5 Apr 2021, 10:57 Michael Wechner, <michael.wech...@wyona.com>
wrote:

Hi

The following FAQ might be a bit outdated, but nevertheless you should
find some answers there as well

https://cwiki.apache.org/confluence/display/lucene/LuceneFAQ

For example, to answer your question 4), see

https://cwiki.apache.org/confluence/display/lucene/LuceneFAQ#LuceneFAQ-CanIuseLucenetocrawlmysiteorothersitesontheInternet?

If I understand your questions correctly, your objective is to provide a
search engine for your company website?

HTH

Michael



On 05.04.21 at 11:34, Som Lima wrote:
Hi,

Before doing a deep dive into Lucene I would appreciate it if you would
clarify a few things, so I know whether this is the right project to
fulfill my objective.

1. It is my understanding that Google search is a more elaborate utility,
but not unlike the *nix search utility grep, which searches for a string
pattern recursively in text files, for example .java files or .html
files. In this case the search starts from the current directory.
   grep -RiIl 'search'

Quick grep explanation:

      -R - recursive search
      -i - case-insensitive
      -I - skip binary files
      -l - print only the names of files that contain a match

2. Further to my understanding, if it is correct: is the objective of
Lucene pretty much the same, i.e. searching for string patterns
recursively?

3. If Lucene is a search engine like Google or grep, do I just point it
to my website root directory?

4. Can I use Lucene as a web search engine like Google? If so, where
would I point it to so that Lucene can recursively search the websites
of the WWW?

5. Is Lucene's use case something else entirely?


Thanks


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org



