Re: Getting started with Solr

2015-03-01 Thread Baruch Kogan
OK, got it, works now.

Maybe you can advise on something more general?

I'm trying to use Solr to analyze HTML data retrieved with Nutch. I want to
crawl a list of webpages built according to a certain template, analyze
certain fields in their HTML (each identified by a span class and consisting
of a number), and then output the results as CSV: a list with each website's
domain and the sum of the numbers in all the specified fields.

How should I set up the flow? Should I configure Nutch to pull only the
relevant fields from each page, then use Solr to add up the integers in those
fields and output them to a CSV? Or should I use Nutch to pull in everything
from the relevant pages and then use Solr to strip out the relevant fields
and process them as above? Can I do the processing strictly in Solr, using
the stuff found here
,
or should I use PHP through Solarium or something along those lines?
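
For illustration only, the extract-and-sum step described above could be sketched in Python, independent of both Nutch and Solr. Everything named here is an assumption: the span class `qty`, the `(url, html)` input pairs, and the helper names are hypothetical, and the sketch only counts spans whose text is a plain non-negative integer.

```python
import csv
import io
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urlparse


class SpanNumberParser(HTMLParser):
    """Collects integers found inside <span> elements carrying a given class."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.depth = 0      # > 0 while inside a matching span (or a span nested in one)
        self.numbers = []

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth or self.target_class in classes:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "span" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self.depth and text.isdigit():
            self.numbers.append(int(text))


def sum_by_domain(pages, target_class):
    """pages: iterable of (url, html) pairs. Returns {domain: total}."""
    totals = defaultdict(int)
    for url, html in pages:
        parser = SpanNumberParser(target_class)
        parser.feed(html)
        parser.close()
        totals[urlparse(url).netloc] += sum(parser.numbers)
    return dict(totals)


def to_csv(totals):
    """Render the totals as CSV text with a header row, sorted by domain."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["domain", "total"])
    for domain in sorted(totals):
        writer.writerow([domain, totals[domain]])
    return out.getvalue()
```

Whether this logic belongs in a Nutch plugin, in a script that reads from Solr, or in PHP via Solarium is exactly the open question above; the sketch only shows that the per-domain sum itself is a small amount of code.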

Your advice would be appreciated; I don't want to reinvent the bicycle.

Sincerely,

Baruch Kogan
Marketing Manager
Seller Panda 
+972(58)441-3829
baruch.kogan at Skype

On Sun, Mar 1, 2015 at 9:17 AM, Baruch Kogan  wrote:

> Thanks for bearing with me.
>
> I start Solr with `bin/solr start -e cloud` with 2 nodes. Then I get this:
>
> *Welcome to the SolrCloud example!*
>
>
> *This interactive session will help you launch a SolrCloud cluster on your
> local workstation.*
>
> *To begin, how many Solr nodes would you like to run in your local
> cluster? (specify 1-4 nodes) [2] *
> *Ok, let's start up 2 Solr nodes for your example SolrCloud cluster.*
>
> *Please enter the port for node1 [8983] *
> *8983*
> *Please enter the port for node2 [7574] *
> *7574*
> *Cloning Solr home directory /home/ubuntu/crawler/solr/example/cloud/node1
> into /home/ubuntu/crawler/solr/example/cloud/node2*
>
> *Starting up SolrCloud node1 on port 8983 using command:*
>
> *solr start -cloud -s example/cloud/node1/solr -p 8983   *
>
> I then go to http://localhost:8983/solr/admin/cores and get the following:
>
>
> *This XML file does not appear to have any style information associated
> with it. The document tree is shown below.*
>
> status: 0, QTime: 2
>
> Each core has instanceDir
> /home/ubuntu/crawler/solr/example/cloud/node1/solr/<core name>/ and dataDir
> <instanceDir>data/:
>
> core name                        startTime                  uptime
> testCollection_shard1_replica1   2015-03-01T06:59:12.296Z   46380
> testCollection_shard1_replica2   2015-03-01T06:59:12.751Z   45926
> testCollection_shard2_replica1   2015-03-01T06:59:12.596Z   46081
> testCollection_shard2_replica2
>
> The first three cores each report config solrconfig.xml, schema schema.xml,
> maxDoc 0, indexHeapUsageBytes 0, segmentCount 0, hasDeletions false,
> sizeInBytes 71 (71 bytes), and directory
> org.apache.lucene.store.NRTCachingDirectory:NRTCachingDirectory(MMapDirectory@<dataDir>index
> lockFactory=org.apache.lucene.store.NativeFSLockFactory@2a4f8f8b;
> maxCacheMB=48.0 maxMergeSizeMB=4.0). For testCollection_shard2_replica2:
> config solrconfig.x

Re: Getting started with Solr

2015-02-28 Thread Baruch Kogan
Thanks for bearing with me.

I start Solr with `bin/solr start -e cloud` with 2 nodes. Then I get this:

*Welcome to the SolrCloud example!*


*This interactive session will help you launch a SolrCloud cluster on your
local workstation.*

*To begin, how many Solr nodes would you like to run in your local cluster?
(specify 1-4 nodes) [2] *
*Ok, let's start up 2 Solr nodes for your example SolrCloud cluster.*

*Please enter the port for node1 [8983] *
*8983*
*Please enter the port for node2 [7574] *
*7574*
*Cloning Solr home directory /home/ubuntu/crawler/solr/example/cloud/node1
into /home/ubuntu/crawler/solr/example/cloud/node2*

*Starting up SolrCloud node1 on port 8983 using command:*

*solr start -cloud -s example/cloud/node1/solr -p 8983   *

I then go to http://localhost:8983/solr/admin/cores and get the following:


*This XML file does not appear to have any style information associated
with it. The document tree is shown below.*

status: 0, QTime: 2

Each core has instanceDir
/home/ubuntu/crawler/solr/example/cloud/node1/solr/<core name>/ and dataDir
<instanceDir>data/:

core name                        startTime                  uptime
testCollection_shard1_replica1   2015-03-01T06:59:12.296Z   46380
testCollection_shard1_replica2   2015-03-01T06:59:12.751Z   45926
testCollection_shard2_replica1   2015-03-01T06:59:12.596Z   46081
testCollection_shard2_replica2   2015-03-01T06:59:12.718Z   45959

Every core reports config solrconfig.xml, schema schema.xml, maxDoc 0,
indexHeapUsageBytes 0, segmentCount 0, hasDeletions false, sizeInBytes 71
(71 bytes), and directory
org.apache.lucene.store.NRTCachingDirectory:NRTCachingDirectory(MMapDirectory@<dataDir>index
lockFactory=org.apache.lucene.store.NativeFSLockFactory@2a4f8f8b;
maxCacheMB=48.0 maxMergeSizeMB=4.0).

I do not seem to have a gettingstarted collection.
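
One way to check this without reading the raw XML is to ask the CoreAdmin STATUS endpoint for JSON and list the core names. The following is a minimal Python sketch, assuming Solr is listening on localhost:8983 as above; the helper names are made up for illustration, and the `<collection>_shardN_replicaM` naming is inferred from the core names in the output above rather than from any documented guarantee.

```python
import json
from urllib.request import urlopen


def core_names(status_response):
    """Return the sorted core names from a parsed CoreAdmin STATUS response."""
    return sorted(status_response.get("status", {}))


def has_collection(status_response, collection):
    """True if a core is named `collection` or `<collection>_shard...`."""
    prefix = collection + "_shard"
    return any(name == collection or name.startswith(prefix)
               for name in core_names(status_response))


def fetch_core_status(base_url="http://localhost:8983/solr"):
    """Fetch the STATUS response as JSON. Needs a running Solr instance."""
    with urlopen(base_url + "/admin/cores?action=STATUS&wt=json") as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the cores listed above, `has_collection(fetch_core_status(), "testCollection")` would be expected to return True and the same call with "gettingstarted" False, which matches the 404 from bin/post.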

Sincerely,

Baruch Kogan
Marketing Manager
Seller Panda 
+972(58)441-3829
baruch.kogan at Skype

On Fri, Feb 27, 2015 at 12:00 AM, Erik Hatcher 
wrote:

> I’m sorry, I’m not following exactly.
>
> Somehow you no longer have a gettingstarted collection, but it is not
> clear how that happened.
>
> Could you post the exact script steps you used that got you this error?
>
> What collections/cores does the Solr admin show you have? What are the
> results of http://localhost:8983/solr/admin/cores ?
>
> —
> Erik Hatcher, Senior Solutions Architect
> http://www.lucidworks.com 
>
>
>
>
> > On Feb 26, 2015, at 9:58 AM, Baruch Kogan 
> wrote:
> >
> > Oh, I see. I used the start -e cloud command, then ran through a setup
> with
> > one core and default options for the rest, then tried to post the json
> > example again, and got another error:
> > ubuntu@ubuntu-VirtualBox:~/crawler/solr$ bin/post -c gettingstarted
> > example/exampledocs/*.json
> > /usr/lib/jvm/java-7-oracle/bin/java -classpath
> > /home/ubuntu/crawler/solr/dist/solr-core-5.0.0.jar -Dauto=yes
> > -Dc=gettingstarted -Ddata=files org.apache.solr.util.SimplePostTool
> > example/exampledocs/books.json
> > SimplePostTool version 5.0.0
> > Posting files to [base] url
> > http://localhost:8983/solr/gettingstarted/update...
> > Entering auto mode. File endings considered are
> >
> xml,json,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log
> > POSTing file books.json (application/json) to [base]
> > SimplePostTool: WARNING: Solr returned an error #404 (Not Found) for url:
> > http://localhost:8983/solr/gettingstarted/update
> > SimplePostTool: WARNING: Response: 
> > 
> > 
> > Error 404 Not F

Re: Getting started with Solr

2015-02-26 Thread Erik Hatcher
I’m sorry, I’m not following exactly.   

Somehow you no longer have a gettingstarted collection, but it is not clear how 
that happened.  

Could you post the exact script steps you used that got you this error?

What collections/cores does the Solr admin show you have? What are the 
results of http://localhost:8983/solr/admin/cores ?

—
Erik Hatcher, Senior Solutions Architect
http://www.lucidworks.com 




> On Feb 26, 2015, at 9:58 AM, Baruch Kogan  wrote:
> 
> Oh, I see. I used the start -e cloud command, then ran through a setup with
> one core and default options for the rest, then tried to post the json
> example again, and got another error:
> ubuntu@ubuntu-VirtualBox:~/crawler/solr$ bin/post -c gettingstarted
> example/exampledocs/*.json
> /usr/lib/jvm/java-7-oracle/bin/java -classpath
> /home/ubuntu/crawler/solr/dist/solr-core-5.0.0.jar -Dauto=yes
> -Dc=gettingstarted -Ddata=files org.apache.solr.util.SimplePostTool
> example/exampledocs/books.json
> SimplePostTool version 5.0.0
> Posting files to [base] url
> http://localhost:8983/solr/gettingstarted/update...
> Entering auto mode. File endings considered are
> xml,json,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log
> POSTing file books.json (application/json) to [base]
> SimplePostTool: WARNING: Solr returned an error #404 (Not Found) for url:
> http://localhost:8983/solr/gettingstarted/update
> SimplePostTool: WARNING: Response: 
> 
> 
> Error 404 Not Found
> 
> HTTP ERROR 404
> Problem accessing /solr/gettingstarted/update. Reason:
> Not Found
> Powered by Jetty://
> 
> Sincerely,
> 
> Baruch Kogan
> Marketing Manager
> Seller Panda 
> +972(58)441-3829
> baruch.kogan at Skype
> 
> On Thu, Feb 26, 2015 at 4:07 PM, Erik Hatcher 
> wrote:
> 
>> How did you start Solr?   If you started with `bin/solr start -e cloud`
>> you’ll have a gettingstarted collection created automatically, otherwise
>> you’ll need to create it yourself with `bin/solr create -c gettingstarted`
>> 
>> 
>> —
>> Erik Hatcher, Senior Solutions Architect
>> http://www.lucidworks.com 
>> 
>> 
>> 
>> 
>>> On Feb 26, 2015, at 4:53 AM, Baruch Kogan 
>> wrote:
>>> 
>>> Hi, I've just installed Solr (will be controlling with Solarium and using
>>> to search Nutch queries.)  I'm working through the starting tutorials
>>> described here:
>>> https://cwiki.apache.org/confluence/display/solr/Running+Solr
>>> 
>>> When I try to run $ bin/post -c gettingstarted example/exampledocs/*.json,
>>> I get a bunch of errors having to do
>>> with there not being a gettingstarted folder in /solr/. Is this normal?
>>> Should I create one?
>>> 
>>> Sincerely,
>>> 
>>> Baruch Kogan
>>> Marketing Manager
>>> Seller Panda 
>>> +972(58)441-3829
>>> baruch.kogan at Skype
>> 
>> 



Re: Getting started with Solr

2015-02-26 Thread Baruch Kogan
Oh, I see. I used the start -e cloud command, then ran through a setup with
one core and default options for the rest, then tried to post the JSON
example again, and got another error:

ubuntu@ubuntu-VirtualBox:~/crawler/solr$ bin/post -c gettingstarted example/exampledocs/*.json
/usr/lib/jvm/java-7-oracle/bin/java -classpath /home/ubuntu/crawler/solr/dist/solr-core-5.0.0.jar -Dauto=yes -Dc=gettingstarted -Ddata=files org.apache.solr.util.SimplePostTool example/exampledocs/books.json
SimplePostTool version 5.0.0
Posting files to [base] url http://localhost:8983/solr/gettingstarted/update...
Entering auto mode. File endings considered are xml,json,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log
POSTing file books.json (application/json) to [base]
SimplePostTool: WARNING: Solr returned an error #404 (Not Found) for url: http://localhost:8983/solr/gettingstarted/update
SimplePostTool: WARNING: Response:

Error 404 Not Found

HTTP ERROR 404
Problem accessing /solr/gettingstarted/update. Reason:
Not Found
Powered by Jetty://

Sincerely,

Baruch Kogan
Marketing Manager
Seller Panda 
+972(58)441-3829
baruch.kogan at Skype

On Thu, Feb 26, 2015 at 4:07 PM, Erik Hatcher 
wrote:

> How did you start Solr?   If you started with `bin/solr start -e cloud`
> you’ll have a gettingstarted collection created automatically, otherwise
> you’ll need to create it yourself with `bin/solr create -c gettingstarted`
>
>
> —
> Erik Hatcher, Senior Solutions Architect
> http://www.lucidworks.com 
>
>
>
>
> > On Feb 26, 2015, at 4:53 AM, Baruch Kogan 
> wrote:
> >
> > Hi, I've just installed Solr (will be controlling with Solarium and using
> > to search Nutch queries.)  I'm working through the starting tutorials
> > described here:
> > https://cwiki.apache.org/confluence/display/solr/Running+Solr
> >
> > When I try to run $ bin/post -c gettingstarted example/exampledocs/*.json,
> > I get a bunch of errors having to do
> > with there not being a gettingstarted folder in /solr/. Is this normal?
> > Should I create one?
> >
> > Sincerely,
> >
> > Baruch Kogan
> > Marketing Manager
> > Seller Panda 
> > +972(58)441-3829
> > baruch.kogan at Skype
>
>


Re: Getting started with Solr

2015-02-26 Thread Erik Hatcher
How did you start Solr?   If you started with `bin/solr start -e cloud` you’ll 
have a gettingstarted collection created automatically, otherwise you’ll need 
to create it yourself with `bin/solr create -c gettingstarted`
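
For scripted setups, the create step could be wrapped roughly as follows. This is only a sketch: it assumes a Solr 5.x install directory containing the `bin/solr` script, and the function names and the whitespace check are hypothetical conveniences, not part of Solr.

```python
import subprocess


def create_collection_command(name, solr_script="bin/solr"):
    """Build the `bin/solr create -c <name>` command as an argument list."""
    if not name or any(ch.isspace() for ch in name):
        raise ValueError("collection name must be non-empty and contain no whitespace")
    return [solr_script, "create", "-c", name]


def create_collection(name, solr_dir):
    """Run the command from the Solr install directory; raises on a non-zero exit."""
    subprocess.run(create_collection_command(name), cwd=solr_dir, check=True)
```

Calling `create_collection("gettingstarted", "/path/to/solr")` is then equivalent to running the command by hand from that directory.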


—
Erik Hatcher, Senior Solutions Architect
http://www.lucidworks.com 




> On Feb 26, 2015, at 4:53 AM, Baruch Kogan  wrote:
> 
> Hi, I've just installed Solr (will be controlling with Solarium and using
> to search Nutch queries.)  I'm working through the starting tutorials
> described here:
> https://cwiki.apache.org/confluence/display/solr/Running+Solr
> 
> When I try to run $ bin/post -c gettingstarted example/exampledocs/*.json,
> I get a bunch of errors having to do
> with there not being a gettingstarted folder in /solr/. Is this normal?
> Should I create one?
> 
> Sincerely,
> 
> Baruch Kogan
> Marketing Manager
> Seller Panda 
> +972(58)441-3829
> baruch.kogan at Skype



Getting started with Solr

2015-02-26 Thread Baruch Kogan
Hi, I've just installed Solr (I'll be controlling it with Solarium and using
it to search Nutch queries). I'm working through the starting tutorials
described here:
https://cwiki.apache.org/confluence/display/solr/Running+Solr

When I try to run $ bin/post -c gettingstarted example/exampledocs/*.json,
I get a bunch of errors having to do with there not being a gettingstarted
folder in /solr/. Is this normal? Should I create one?

Sincerely,

Baruch Kogan
Marketing Manager
Seller Panda 
+972(58)441-3829
baruch.kogan at Skype


Re: newbie getting started with solr

2013-11-07 Thread Alexandre Rafalovitch
Tried my book? It should explain that. You can see the collections with
examples in GitHub:
https://github.com/arafalov/solr-indexing-book/tree/master/published

Start from collection1.

Regards,
   Alex.
Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)


On Thu, Nov 7, 2013 at 4:50 PM, Palmer, Eric  wrote:

> Sorry if this is obvious (because it isn't for me)
>
> I want to build a solr (4.5.1) + nutch (1.7.1) environment.  I'm doing
> this on amazon linux (I may put nutch on a separate server eventually).
>
> Please let me know if my thinking is sound or off base
>
> in the example folder are a lot of files and folders including the war
> file and start.jar
>
> drwxr-xr-x   cloud-scripts
> drwxr-xr-x   contexts
> drwxr-xr-x   etc
> drwxr-xr-x   example-DIH
> drwxr-xr-x   exampledocs
> drwxr-xr-x   example-schemaless
> drwxr-xr-x   lib
> drwxr-xr-x   logs
> drwxr-xr-x   multicore
> -rw-r--r--   README.txt
> drwxr-xr-x   resources
> drwxr-xr-x   solr
> drwxr-xr-x   solr-webapp
> -rw-r--r--   start.jar
> drwxr-xr-x   webapps
>
>
> I am creating a separate folder for the conf and data folders (on another
> disk) and placing these files in the conf file
>
> schema-solr.xml (from nutch) renamed to schema.solr
> solrconfig.xml
>
> I will use the example folder and start.jar from that location. (is this
> okay)
>
> Where do I set the collection name?
>
> What else do I need to do to get a basic web page indexer built. (I'll
> work out the crawling later, I just want to be able to manually add some
> documents and query).  I'm trying to understand solr first and then will
> use nutch.
>
> I have several books and have looked at the tutorial and other web sites.
> It seems they assume that I know where to begin when creating a new
> collection and customizing it.
>
> Thanks in advance for your help.
>
> --
> Eric Palmer
> Web Services
> U of Richmond
>
> To report technical issues, obtain technical support or make requests for
> enhancements please visit
> http://web.richmond.edu/contact/technical-support.html
>


Re: newbie getting started with solr

2013-11-07 Thread Tom Mortimer
Hi Eric,

Solr configuration can certainly be confusing at first. And for some time
after. :P

If you're running start.jar from the example folder (which is fine for
testing, and I've known some people to use it for production systems) then
the default solr home is example/solr.  This contains solr.xml, which
specifies where to find per-core configuration and data. (A core is
equivalent to a collection in a simple non-sharded setup).

For now, the easiest thing would be to use the default core in
example/solr/collection1. Copy your solrconfig.xml and schema.xml over the
ones in collection1/conf (backing up the originals for reference). Create
your data directory wherever you like and symlink it into collection1.

Now when you run $ java -jar start.jar in example/, you should be able to
access Solr at http://localhost:8983/solr/ , and add and search for
documents.
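
The copy-and-symlink steps above could be scripted roughly like this. It is a sketch under assumptions: all paths are supplied by the caller, the function name and the `.orig` backup suffix are made up for illustration, and it refuses to touch an existing real `data` directory.

```python
import shutil
from pathlib import Path


def install_core_config(conf_src, core_dir, data_dir):
    """Copy solrconfig.xml and schema.xml into core_dir/conf and symlink data."""
    conf_src, core_dir, data_dir = Path(conf_src), Path(core_dir), Path(data_dir)
    conf_dst = core_dir / "conf"
    conf_dst.mkdir(parents=True, exist_ok=True)
    for name in ("solrconfig.xml", "schema.xml"):
        target = conf_dst / name
        if target.exists():
            # Keep the shipped file for reference before overwriting it.
            shutil.copy2(target, conf_dst / (name + ".orig"))
        shutil.copy2(conf_src / name, target)
    data_dir.mkdir(parents=True, exist_ok=True)
    link = core_dir / "data"
    if link.is_symlink():
        link.unlink()
    elif link.exists():
        raise FileExistsError(f"{link} already exists and is not a symlink")
    # Point collection1/data at the directory on the other disk.
    link.symlink_to(data_dir.resolve(), target_is_directory=True)
    return link
```

Note that a second run would overwrite the `.orig` backups with the previously installed files, so the originals are worth copying elsewhere first.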

Hope that helps a bit!

Tom



On 7 November 2013 14:50, Palmer, Eric  wrote:

> Sorry if this is obvious (because it isn't for me)
>
> I want to build a solr (4.5.1) + nutch (1.7.1) environment.  I'm doing
> this on amazon linux (I may put nutch on a separate server eventually).
>
> Please let me know if my thinking is sound or off base
>
> in the example folder are a lot of files and folders including the war
> file and start.jar
>
> drwxr-xr-x   cloud-scripts
> drwxr-xr-x   contexts
> drwxr-xr-x   etc
> drwxr-xr-x   example-DIH
> drwxr-xr-x   exampledocs
> drwxr-xr-x   example-schemaless
> drwxr-xr-x   lib
> drwxr-xr-x   logs
> drwxr-xr-x   multicore
> -rw-r--r--   README.txt
> drwxr-xr-x   resources
> drwxr-xr-x   solr
> drwxr-xr-x   solr-webapp
> -rw-r--r--   start.jar
> drwxr-xr-x   webapps
>
>
> I am creating a separate folder for the conf and data folders (on another
> disk) and placing these files in the conf file
>
> schema-solr.xml (from nutch) renamed to schema.solr
> solrconfig.xml
>
> I will use the example folder and start.jar from that location. (is this
> okay)
>
> Where do I set the collection name?
>
> What else do I need to do to get a basic web page indexer built. (I'll
> work out the crawling later, I just want to be able to manually add some
> documents and query).  I'm trying to understand solr first and then will
> use nutch.
>
> I have several books and have looked at the tutorial and other web sites.
> It seems they assume that I know where to begin when creating a new
> collection and customizing it.
>
> Thanks in advance for your help.
>
> --
> Eric Palmer
> Web Services
> U of Richmond
>
> To report technical issues, obtain technical support or make requests for
> enhancements please visit
> http://web.richmond.edu/contact/technical-support.html
>


newbie getting started with solr

2013-11-07 Thread Palmer, Eric
Sorry if this is obvious (because it isn't for me).

I want to build a Solr (4.5.1) + Nutch (1.7.1) environment. I'm doing this on 
Amazon Linux (I may put Nutch on a separate server eventually).

Please let me know if my thinking is sound or off base.

In the example folder are a lot of files and folders, including the war file and 
start.jar:

drwxr-xr-x   cloud-scripts
drwxr-xr-x   contexts
drwxr-xr-x   etc
drwxr-xr-x   example-DIH
drwxr-xr-x   exampledocs
drwxr-xr-x   example-schemaless
drwxr-xr-x   lib
drwxr-xr-x   logs
drwxr-xr-x   multicore
-rw-r--r--   README.txt
drwxr-xr-x   resources
drwxr-xr-x   solr
drwxr-xr-x   solr-webapp
-rw-r--r--   start.jar
drwxr-xr-x   webapps


I am creating a separate folder for the conf and data folders (on another disk) 
and placing these files in the conf folder:

schema-solr.xml (from Nutch) renamed to schema.solr
solrconfig.xml

I will use the example folder and start.jar from that location. (Is this okay?)

Where do I set the collection name?

What else do I need to do to get a basic web page indexer built? (I'll work out 
the crawling later; I just want to be able to manually add some documents and 
query.) I'm trying to understand Solr first and then will use Nutch.

I have several books and have looked at the tutorial and other web sites. It 
seems they assume that I know where to begin when creating a new collection and 
customizing it.

Thanks in advance for your help.

--
Eric Palmer
Web Services
U of Richmond

To report technical issues, obtain technical support or make requests for 
enhancements please visit http://web.richmond.edu/contact/technical-support.html


Re: Re: Unable to getting started with SOLR

2013-09-18 Thread Furkan KAMACI
I suggest you start from here:
http://wiki.apache.org/solr/HowToCompileSolr

On Sunday, 15 September 2013, Erick Erickson wrote:
> If you're using the default jetty container, there's no log unless
> you set it up, the content is echoed to the screen.
>
> About a zillion people have downloaded this and started it
> running without issue, so you need to give us the exact
> steps you followed.
>
> If you checked the code out from SVN, you need to build it,
> go into /solr and execute
>
> ant example dist
>
> the "dist" bit isn't strictly necessary, but it builds the jars
> that you link to if you try to develop custom plugins etc.
>
> Best,
> Erick
>
>
> On Fri, Sep 13, 2013 at 3:56 AM, Rah1x  wrote:
>
>> I have the same issue can anyone tell me if they found a solution?
>>
>>
>>
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/Unable-to-getting-started-with-SOLR-tp3497276p4089761.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>


Re: Re: Unable to getting started with SOLR

2013-09-14 Thread Erick Erickson
If you're using the default Jetty container, there's no log unless
you set it up; the content is echoed to the screen.

About a zillion people have downloaded this and started it
running without issue, so you need to give us the exact
steps you followed.

If you checked the code out from SVN, you need to build it:
go into /solr and execute

ant example dist

The "dist" bit isn't strictly necessary, but it builds the jars
that you link to if you try to develop custom plugins etc.

Best,
Erick


On Fri, Sep 13, 2013 at 3:56 AM, Rah1x  wrote:

> I have the same issue can anyone tell me if they found a solution?
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Unable-to-getting-started-with-SOLR-tp3497276p4089761.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Re: Unable to getting started with SOLR

2013-09-13 Thread Rah1x
I have the same issue. Can anyone tell me if they found a solution?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Unable-to-getting-started-with-SOLR-tp3497276p4089761.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Getting started with solr 4.2 and cassandra

2013-04-01 Thread Otis Gospodnetic
Hi,

Solr doesn't have anything like ES River.  DIH (DataImportHandler)
feels like the closest thing in Solr, though it's not quite the same
thing.  DIH pulls in data like a typical River does, but most people
have external indexers that push data into Solr using one of its
client libraries to talk to Solr, such as SolrJ.
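
As a rough sketch of the "external indexer" approach in Python: read documents from wherever they live, batch them, and POST each batch as a JSON array to the core's /update handler. The core name `collection1`, the batch size of 500, and the helper names are assumptions for illustration; a client library such as SolrJ would normally handle the HTTP details.

```python
import json
from urllib.request import Request, urlopen


def batches(docs, size):
    """Yield lists of at most `size` documents from any iterable."""
    batch = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def build_update_request(base_url, core, docs, commit=False):
    """Build a POST of a JSON array of documents to the core's /update handler."""
    url = f"{base_url}/{core}/update"
    if commit:
        url += "?commit=true"
    body = json.dumps(docs).encode("utf-8")
    return Request(url, data=body, method="POST",
                   headers={"Content-Type": "application/json"})


def push(docs, base_url="http://localhost:8983/solr", core="collection1", size=500):
    """Send all documents in batches, committing only with the last batch."""
    pending = None
    for batch in batches(docs, size):
        if pending is not None:
            urlopen(build_update_request(base_url, core, pending)).close()
        pending = batch
    if pending is not None:
        urlopen(build_update_request(base_url, core, pending, commit=True)).close()
```

Here `docs` can be any iterable of dictionaries, for example a generator that pages through rows from the source database, so the whole data set never has to sit in memory.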

Otis
--
Solr & ElasticSearch Support
http://sematext.com/





On Mon, Apr 1, 2013 at 6:34 PM, Utkarsh Sengar  wrote:
> Hello,
>
> I am evaluating Solr 4.2 and ElasticSearch (I am new to both) for a search
> API, where data sits in Cassandra.
>
> Getting started with ElasticSearch is pretty straightforward, and I was
> able to write, within a day, an ES "river"
> (http://www.elasticsearch.org/guide/reference/river/) which pulls data
> from Cassandra and indexes it in ES.
>
> Now I am trying to implement something similar with Solr and compare both of
> them.
>
> Getting started with solr/example
> (http://lucene.apache.org/solr/4_2_0/tutorial.html) was pretty easy and an
> example Solr instance works. But the example folder contains a whole bunch
> of stuff which I am not sure I need: http://pastebin.com/Gv660mRT . I am
> sure I don't need 53 directories and 527 files.
>
> So my questions are:
> 1. How can I get a bare-bones Solr app up and running with a minimum set of
> configuration? (I will build on it when needed, taking reference from
> /example.)
> 2. What is the best practice for running Solr in production? Is an approach
> like this jetty+nginx one recommended:
> http://sacharya.com/nginx-proxy-to-jetty-for-java-apps/ ?
>
> Once I am done setting up a simple Solr instance:
> 3. What is the general practice for importing data into Solr? For now, I am
> writing a Python script which will read data in bulk from Cassandra and
> throw it to Solr.
>
> --
> Thanks,
> -Utkarsh


Re: Getting started with solr 4.2 and cassandra

2013-04-01 Thread Jack Krupansky
The Solr example really is rather simple. Download, unzip, run, add data, 
query. It's really that simple. Make sure you are looking at the Solr 
tutorial:


http://lucene.apache.org/solr/4_2_0/tutorial.html

Download from here:
http://lucene.apache.org/solr/tutorial.html

-- Jack Krupansky

-Original Message- 
From: Utkarsh Sengar

Sent: Monday, April 01, 2013 7:41 PM
To: solr-user@lucene.apache.org
Subject: Re: Getting started with solr 4.2 and cassandra

Thanks for the reply. DSE is one of the options, and I am looking into
that too. Before diving into Solr+Cassandra integration (which comes out of
the box with DSE), though, I am just trying to set up a Solr instance on my
local machine without the bloat the "example" Solr instance has to offer.
Any suggestions about that?

Thanks,
-Utkarsh


On Mon, Apr 1, 2013 at 4:00 PM, Jack Krupansky 
wrote:



You might want to check out DataStax Enterprise, which actually integrates
Cassandra and Solr. You keep the data in Cassandra, but as data is added
and updated and deleted, the Solr index is automatically updated in
parallel. You can add and update data and query using either the Cassandra
API or the Solr API.

See:
http://www.datastax.com/what-we-offer/products-services/datastax-enterprise

-- Jack Krupansky

-Original Message- From: Utkarsh Sengar
Sent: Monday, April 01, 2013 6:34 PM
To: solr-user@lucene.apache.org
Subject: Getting started with solr 4.2 and cassandra


Hello,

I am evaluating Solr 4.2 and ElasticSearch (I am new to both) for a search
API, where data sits in Cassandra.

Getting started with ElasticSearch is pretty straightforward, and I was
able to write, within a day, an ES "river"
(http://www.elasticsearch.org/guide/reference/river/) which pulls data
from Cassandra and indexes it in ES.

Now I am trying to implement something similar with Solr and compare both of
them.

Getting started with solr/example
(http://lucene.apache.org/solr/4_2_0/tutorial.html) was pretty easy and an
example Solr instance works. But the example folder contains a whole bunch
of stuff which I am not sure I need: http://pastebin.com/Gv660mRT . I am
sure I don't need 53 directories and 527 files.

So my questions are:
1. How can I get a bare-bones Solr app up and running with a minimum set of
configuration? (I will build on it when needed, taking reference from
/example.)
2. What is the best practice for running Solr in production? Is an approach
like this jetty+nginx one recommended:
http://sacharya.com/nginx-proxy-to-jetty-for-java-apps/ ?

Once I am done setting up a simple Solr instance:
3. What is the general practice for importing data into Solr? For now, I am
writing a Python script which will read data in bulk from Cassandra and
throw it to Solr.

--
Thanks,
-Utkarsh





--
Thanks,
-Utkarsh 



Re: Getting started with solr 4.2 and cassandra

2013-04-01 Thread Utkarsh Sengar
Thanks for the reply. DSE is one of the options, and I am looking into
that too. Before diving into Solr+Cassandra integration (which comes out of
the box with DSE), though, I am just trying to set up a Solr instance on my
local machine without the bloat the "example" Solr instance has to offer.
Any suggestions about that?

Thanks,
-Utkarsh


On Mon, Apr 1, 2013 at 4:00 PM, Jack Krupansky wrote:

> You might want to check out DataStax Enterprise, which actually integrates
> Cassandra and Solr. You keep the data in Cassandra, but as data is added
> and updated and deleted, the Solr index is automatically updated in
> parallel. You can add and update data and query using either the Cassandra
> API or the Solr API.
>
> See:
> http://www.datastax.com/what-we-offer/products-services/datastax-enterprise
>
> -- Jack Krupansky
>
> -Original Message- From: Utkarsh Sengar
> Sent: Monday, April 01, 2013 6:34 PM
> To: solr-user@lucene.apache.org
> Subject: Getting started with solr 4.2 and cassandra
>
>
> Hello,
>
> I am evaluating solr 4.2 and ElasticSearch (I am new to both) for a search
> API, where data sits in cassandra.
>
> Getting started with elasticsearch is pretty straightforward and I was
> able to write an ES
> "river" <http://www.elasticsearch.org/guide/reference/river/>
> which pulls data from cassandra and indexes it in ES within a day.
>
> Now I am trying to implement something similar with solr and compare both of
> them.
>
> Getting started with
> solr/example <http://lucene.apache.org/solr/4_2_0/tutorial.html> was
> pretty easy and an example solr instance works. But the example folder
> contains a whole bunch of stuff which I am not sure I need:
> http://pastebin.com/Gv660mRT . I am sure I don't need 53 directories and
> 527 files
>
> So my questions are:
> 1. How can I get a bare-bones solr app up and running with a minimum set of
> configuration? (I will build over it when needed by taking reference from
> /example)
> 2. What is a best practice to run solr in production? Is an approach like this
> jetty+nginx one recommended:
> http://sacharya.com/nginx-proxy-to-jetty-for-java-apps/ ?
>
> Once I am done setting up a simple solr instance:
> 3. What is the general practice to import data to solr? For now, I am
> writing a python script which will read data in bulk from cassandra and
> throw it to solr.
>
> --
> Thanks,
> -Utkarsh
>



-- 
Thanks,
-Utkarsh


Re: Getting started with solr 4.2 and cassandra

2013-04-01 Thread Jack Krupansky
You might want to check out DataStax Enterprise, which actually integrates 
Cassandra and Solr. You keep the data in Cassandra, but as data is added and 
updated and deleted, the Solr index is automatically updated in parallel. 
You can add and update data and query using either the Cassandra API or the 
Solr API.


See:
http://www.datastax.com/what-we-offer/products-services/datastax-enterprise

-- Jack Krupansky

-Original Message- 
From: Utkarsh Sengar

Sent: Monday, April 01, 2013 6:34 PM
To: solr-user@lucene.apache.org
Subject: Getting started with solr 4.2 and cassandra

Hello,

I am evaluating solr 4.2 and ElasticSearch (I am new to both) for a search
API, where data sits in cassandra.

Getting started with elasticsearch is pretty straightforward and I was
able to write an ES
"river" <http://www.elasticsearch.org/guide/reference/river/>
which pulls data from cassandra and indexes it in ES within a day.

Now I am trying to implement something similar with solr and compare both of
them.

Getting started with
solr/example <http://lucene.apache.org/solr/4_2_0/tutorial.html> was
pretty easy and an example solr instance works. But the example folder
contains a whole bunch of stuff which I am not sure I need:
http://pastebin.com/Gv660mRT . I am sure I don't need 53 directories and
527 files

So my questions are:
1. How can I get a bare-bones solr app up and running with a minimum set of
configuration? (I will build over it when needed by taking reference from
/example)
2. What is a best practice to run solr in production? Is an approach like this
jetty+nginx one recommended:
http://sacharya.com/nginx-proxy-to-jetty-for-java-apps/ ?

Once I am done setting up a simple solr instance:
3. What is the general practice to import data to solr? For now, I am
writing a python script which will read data in bulk from cassandra and
throw it to solr.

--
Thanks,
-Utkarsh 



Getting started with solr 4.2 and cassandra

2013-04-01 Thread Utkarsh Sengar
Hello,

I am evaluating solr 4.2 and ElasticSearch (I am new to both) for a search
API, where data sits in cassandra.

Getting started with elasticsearch is pretty straightforward and I was
able to write an ES
"river" <http://www.elasticsearch.org/guide/reference/river/>
which pulls data from cassandra and indexes it in ES within a day.

Now I am trying to implement something similar with solr and compare both of
them.

Getting started with
solr/example <http://lucene.apache.org/solr/4_2_0/tutorial.html> was
pretty easy and an example solr instance works. But the example folder
contains a whole bunch of stuff which I am not sure I need:
http://pastebin.com/Gv660mRT . I am sure I don't need 53 directories and
527 files

So my questions are:
1. How can I get a bare-bones solr app up and running with a minimum set of
configuration? (I will build over it when needed by taking reference from
/example)
2. What is a best practice to run solr in production? Is an approach like this
jetty+nginx one recommended:
http://sacharya.com/nginx-proxy-to-jetty-for-java-apps/ ?

Once I am done setting up a simple solr instance:
3. What is the general practice to import data to solr? For now, I am
writing a python script which will read data in bulk from cassandra and
throw it to solr.
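
As a rough illustration of the bulk-import script described in question 3, the sketch below batches rows and posts them to Solr as JSON. It makes several assumptions that are not in this thread: the rows are already available as an iterable of dicts (the Cassandra read is left out), the update URL is something like http://localhost:8983/solr/update?commit=true, and the Solr version in use accepts a JSON array of documents on that handler. The helper names are hypothetical.

```python
import json
import urllib.request
from itertools import islice
from typing import Dict, Iterable, Iterator, List


def batched(rows: Iterable[Dict], size: int) -> Iterator[List[Dict]]:
    """Yield consecutive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def build_update_request(solr_update_url: str, docs: List[Dict]) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying docs as a JSON array."""
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        solr_update_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def index_rows(rows: Iterable[Dict], solr_update_url: str, batch_size: int = 500) -> int:
    """Send rows to Solr in batches; return the number of rows sent."""
    sent = 0
    for chunk in batched(rows, batch_size):
        # Assumption: the server at solr_update_url accepts JSON documents.
        with urllib.request.urlopen(build_update_request(solr_update_url, chunk)) as resp:
            resp.read()
        sent += len(chunk)
    return sent
```

Batching keeps each request small enough to retry on failure; the batch size of 500 is an arbitrary starting point, not a recommendation from the Solr documentation.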

-- 
Thanks,
-Utkarsh


Re: Re: Unable to getting started with SOLR

2011-11-10 Thread kingkong
Try replacing "localhost" with your domain or IP address and make sure the
port is open. Use the ps command to see if Java is running.
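
One way to check whether the port is open from a script is sketched below. It only tests whether a TCP connection can be made; it does not confirm that the listener is actually Solr. The function name is hypothetical, and 8983 is the port from the tutorial URL in this thread.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name-resolution failures.
        return False


# Example: is_port_open("localhost", 8983)
```

If this returns False on the machine where Solr was started, the server most likely never came up, which points back at the startup command and its logs rather than at the browser.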

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Unable-to-getting-started-with-SOLR-tp3497276p3497583.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Unable to getting started with SOLR

2011-11-10 Thread Per Newgro

Sounds strange. Did you run >>>java -jar start.jar<<< on the console?

On 10.11.2011 at 18:19, dsy99 wrote:

Yes, I executed the server "start.jar" embedded in the example folder, but I am
not getting any message after that. I checked the logs also; they are empty.

On Thu, 10 Nov 2011 22:34:57 +0530  wrote

Did you start the server (

*java -jar start.jar*

)? Was it successful? Have you checked the logs?

On 10.11.2011 at 17:54, dsy99 wrote:

Hi all,
  Sorry for any inconvenience caused to anyone, but I need a reply to the
following.

I want to work with Solr, so I downloaded it and started to
follow the instructions provided in the tutorial available at
"http://lucene.apache.org/solr/tutorial.html" to execute some examples
first.
But when I tried to check whether Solr is running by using
"http://localhost:8983/solr/admin/" in the web browser, I found the following
message.
I will be thankful if someone can suggest a solution for it.

  Message:

  Unable to connect

  Firefox can't establish a connection to the server at localhost:8983.

  The site could be temporarily unavailable or too busy. Try again in a few
moments.
  If you are unable to load any pages, check your computer's network
connection.
  If your computer or network is protected by a firewall or proxy, make sure
that Firefox is permitted to access the Web.
_

With Regds:
Divakar

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Unable-to-getting-started-with-SOLR-tp3497276p3497276.html
Sent from the Solr - User mailing list archive at Nabble.com.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Unable-to-getting-started-with-SOLR-tp3497276p3497364.html
Sent from the Solr - User mailing list archive at Nabble.com.




Re: Unable to getting started with SOLR

2011-11-10 Thread Per Newgro

Did you start the server (

*java -jar start.jar*

)? Was it successful? Have you checked the logs?

On 10.11.2011 at 17:54, dsy99 wrote:

Hi all,
  Sorry for any inconvenience caused to anyone, but I need a reply to the
following.

I want to work with Solr, so I downloaded it and started to
follow the instructions provided in the tutorial available at
"http://lucene.apache.org/solr/tutorial.html" to execute some examples
first.
But when I tried to check whether Solr is running by using
"http://localhost:8983/solr/admin/" in the web browser, I found the following
message.
   I will be thankful if someone can suggest a solution for it.

  Message:


 Unable to connect

   Firefox can't establish a connection to the server at localhost:8983.

  The site could be temporarily unavailable or too busy. Try again in a few
moments.
   If you are unable to load any pages, check your computer's network
connection.
   If your computer or network is protected by a firewall or proxy, make sure
that Firefox is permitted to access the Web.
_

With Regds:
Divakar

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Unable-to-getting-started-with-SOLR-tp3497276p3497276.html
Sent from the Solr - User mailing list archive at Nabble.com.





Unable to getting started with SOLR

2011-11-10 Thread dsy99

Hi all,
 Sorry for any inconvenience caused to anyone, but I need a reply to the
following.

I want to work with Solr, so I downloaded it and started to
follow the instructions provided in the tutorial available at
"http://lucene.apache.org/solr/tutorial.html" to execute some examples
first.
But when I tried to check whether Solr is running by using
"http://localhost:8983/solr/admin/" in the web browser, I found the following
message.
  I will be thankful if someone can suggest a solution for it.
 
 Message:


Unable to connect

  Firefox can't establish a connection to the server at localhost:8983.

 The site could be temporarily unavailable or too busy. Try again in a few 
moments.
  If you are unable to load any pages, check your computer's network
connection.
  If your computer or network is protected by a firewall or proxy, make sure
that Firefox is permitted to access the Web.
_

With Regds:
Divakar

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Unable-to-getting-started-with-SOLR-tp3497276p3497276.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: getting started with Solr Flare

2011-10-12 Thread Chris Hostetter

: but run into a problem at step 4
: 
: Launch Solr:
: cd ; java -Dsolr.solr.home= -jar start.jar
: 
: where Solr complains that it can't find solrconfig.xml in either the
: classpath or the solr-ruby home dir. Can anyone help me disentangle this?

what exactly was the command line you executed? what was the real path?

The solrconfig.xml file shouldn't exist *in* the directory you specify 
with -Dsolr.solr.home, it should be at ./conf/solrconfig.xml relative to that 
path (or there should be a "./solr.xml" relative to that path in a more 
modern/multi-core setup)
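
The layout described above can be checked before launching Solr. The following is only a sketch of that check; the function name and the returned labels are made up for illustration, and it looks at nothing beyond the two files mentioned.

```python
from pathlib import Path


def check_solr_home(solr_home: str) -> str:
    """Classify a candidate solr.solr.home directory by the files it contains."""
    home = Path(solr_home)
    if (home / "solr.xml").is_file():
        # solr.xml directly under the home dir: the multi-core style layout.
        return "multi-core"
    if (home / "conf" / "solrconfig.xml").is_file():
        # conf/solrconfig.xml under the home dir: the single-core layout.
        return "single-core"
    return "missing"
```

A result of "missing" for the directory passed to -Dsolr.solr.home would be consistent with the "can't find solrconfig.xml" complaint, for example when the path points at the conf directory itself instead of its parent.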



-Hoss


getting started with Solr Flare

2011-10-05 Thread Fred Zimmerman
Hi,

I followed the very simple instructions found at

http://wiki.apache.org/solr/Flare/HowTo

but run into a problem at step 4

Launch Solr:
cd ; java -Dsolr.solr.home= -jar start.jar

where Solr complains that it can't find solrconfig.xml in either the
classpath or the solr-ruby home dir. Can anyone help me disentangle this?

FredZ

-
Subscribe to the Nimble Books Mailing List  http://eepurl.com/czS- for
monthly updates


Re: Getting started with Solr

2008-09-24 Thread Mark Miller




> How can I set up Solr to run as a service, so I don't need to have an
> SSH connection open?
> Sorry for being stupid here btw.

This is kind of independent of Solr. You have to look up how to do it for 
the OS you are running on. With Ubuntu, you could just launch Solr with 
nohup to keep it from stopping when you log off, or look into writing an 
init.d/rc startup script that launches Solr (just google).
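
For what it's worth, here is a small sketch of roughly what the nohup approach achieves, written as a Python launcher: the child is started in its own session with its output sent to a log file, so it is not tied to the terminal that started it. It assumes a POSIX system such as Ubuntu; the function name and the log path are hypothetical, and a proper init script is still the more robust choice for a real service.

```python
import subprocess
from typing import List, Optional


def start_detached(command: List[str], log_path: str, cwd: Optional[str] = None) -> subprocess.Popen:
    """Start command in a new session, appending stdout/stderr to log_path."""
    with open(log_path, "ab") as log:
        proc = subprocess.Popen(
            command,
            cwd=cwd,
            stdin=subprocess.DEVNULL,
            stdout=log,
            stderr=subprocess.STDOUT,
            start_new_session=True,  # detach from the controlling terminal (POSIX only)
        )
    # The child holds its own copy of the log file descriptor.
    return proc


# Example, run from the Solr example directory mentioned in this thread:
# start_detached(["java", "-jar", "start.jar"], "solr.log", cwd="example")
```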


> I'm working on a multilingual search. So a company (doc) exists
> in say Poland, what schema design should I read up on/work on to be able
> to write Poland/Polen/Polska (Poland in different languages) and still
> hit the same results. I have the data from geonames.org for this, but
> I can't really grasp how I should set up schema.xml. The
> easiest solution would be to populate each document with each possible
> hit word, but this would give me a bunch of duplicates.

Not sure I get you completely, but one option might be to index each 
language to a separate field, and search over those fields separately/together 
as needed.


Another option, if there is a lot of overlap, might be to use something 
like a synonym-type analyzer: put tokens that differ in each language at 
the same position in the index. Of course this immediately gets 
difficult if one language has two tokens for a word and another has one. 
This could get tricky quickly depending on what queries you need to 
support, how they should work, etc.
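
To make the synonym idea concrete, here is a toy sketch in plain Python (not Solr's own analyzer code) that maps each language variant of a name to one canonical token, applied the same way to documents and to queries. The name table is a hand-written stand-in for the geonames.org data, and the function names are hypothetical. It handles single-token names only, which is exactly the limitation mentioned above.

```python
from typing import Dict, List


def build_synonym_map(groups: Dict[str, List[str]]) -> Dict[str, str]:
    """Map every variant (and the canonical name itself) to the canonical name."""
    mapping: Dict[str, str] = {}
    for canonical, variants in groups.items():
        for name in [canonical, *variants]:
            mapping[name.lower()] = canonical.lower()
    return mapping


def normalize_tokens(text: str, synonyms: Dict[str, str]) -> List[str]:
    """Lowercase, split on whitespace, and replace known variants."""
    return [synonyms.get(tok, tok) for tok in text.lower().split()]


# Stand-in for data that would come from geonames.org.
COUNTRY_NAMES = {"poland": ["polen", "polska"]}
SYNONYMS = build_synonym_map(COUNTRY_NAMES)
```

With this table, "Acme Polska", "acme polen" and "ACME Poland" all normalize to the same token list, so a query in any of the three languages matches the same document without storing every variant in it.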


- Mark



Getting started with Solr

2008-09-24 Thread Martin Iwanowski

Hi,

I'm very new to search engines in general.
I've used the Zend_Search_Lucene PHP class before to try Lucene in
general, and though it surely works, it's not what I'm looking for
performance-wise.

I recently installed Solr on a newly installed Ubuntu (Hardy Heron)
machine.

I have about 207k docs (currently, and I'm getting about 100k each
month from now on) and that's why I decided to throw myself into
something real for once.

As I'm learning from today, I was wondering about two main things.
I'm using Jetty as the Java container, and PHP5 to handle the search
requests from an agent.

If I start Solr using "java -jar start.jar" in the example directory,
everything works fine. I even managed to populate the index with the
example data as documented in the tutorials.

How can I set up Solr to run as a service, so I don't need to have an
SSH connection open?
Sorry for being stupid here btw.

I'm working on a multilingual search. So a company (doc) exists
in say Poland, what schema design should I read up on/work on to be able
to write Poland/Polen/Polska (Poland in different languages) and still
hit the same results. I have the data from geonames.org for this, but
I can't really grasp how I should set up schema.xml. The
easiest solution would be to populate each document with each possible
hit word, but this would give me a bunch of duplicates.

Yours,
Martin Iwanowski