Re: Newbie to SOLR with ridiculously simple questions

2013-12-10 Thread smetzger
Alex,
Yeah, I'm getting too complicated...

I have a VMware instance with Solr running.
I have an Amazon instance with Solr running.
I have a Rails Installer with Solr and Blacklight running on Windows
(localhost:8983).
I have a Bitnami Windows Solr installed (localhost:8984).

I just downloaded the Windows .zip file from Apache Solr and can see it
matches your book.

I unpacked the zip file in my user directory. I believe this runs its own
Jetty and JRE.
The Solr instructions say to move solr.war into my servlet container
(where is that?). I am already running Java on my machine; it executes from
Windows/system32/java.exe.
Am I supposed to run all the commands from C:/ instead of $?
Also, there is no start.bat file as indicated in the Solr install
instructions.

I'll try to move your files over to one of the Bitnami installs.

steve 




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Newbie-to-SOLR-with-ridiculously-simple-questions-tp4105788p4105964.html
Sent from the Solr - User mailing list archive at Nabble.com.


Newbie to SOLR with ridiculously simple questions

2013-12-09 Thread smetzger
OK...
I'm a Windows guy who is being forced to learn Solr on Ubuntu for the whole
organization. I fancy myself somewhat capable of following directions, but
this Solr concept is puzzling.

Here is what I think I know.

Solr houses indexes. Each index record (usually based on a document) needs to
be added to the Solr collection. This seems fairly simple, and I can run
post.jar with various XML and JSON files FROM THE UBUNTU TERMINAL. I doubt
you have to use the terminal every time you want to add to the index.

My guess is that you have to feed Solr from third-party systems by sending
HTTP requests to the update URL on the Solr server. Is this correct? Let's say
I have a (god forbid) SharePoint site and I want to move all the document text
and document metadata into Solr. Do I simply run a script (say in .NET or
ColdFusion) that loops through the SharePoint doc records and sends an HTTP
update request to Solr for each doc?
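
A minimal sketch of the loop described above, written in Python rather than .NET or ColdFusion. It assumes a Solr 4.x server at localhost:8983 with a core named collection1 whose schema has id, title and text fields. The field names, the sample record and the helper names are illustrative assumptions, not anything taken from SharePoint or from Solr itself.

```python
import json
import urllib.request

# Assumed location of the Solr core; adjust host, port and core name.
SOLR_UPDATE_URL = "http://localhost:8983/solr/collection1/update"


def build_update_request(docs, update_url=SOLR_UPDATE_URL, commit=True):
    """Build (but do not send) an HTTP POST that adds docs to Solr as JSON."""
    url = update_url + ("?commit=true" if commit else "")
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_to_solr(docs, update_url=SOLR_UPDATE_URL):
    """Send one batch of documents to Solr; returns the HTTP status code."""
    request = build_update_request(docs, update_url)
    with urllib.request.urlopen(request) as response:
        return response.status


# Usage (requires a running Solr; the record is a hypothetical stand-in
# for a row pulled from SharePoint):
#   send_to_solr([{"id": "doc-1", "title": "Budget 2013", "text": "..."}])
```

Sending documents in batches rather than one request per document is usually kinder to the server, which is why the helper takes a list.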

How does Tika fit in?

thanks
steve









Re: Newbie to SOLR with ridiculously simple questions

2013-12-09 Thread Alexandre Rafalovitch
Hi Steve,

Good luck. I would start by doing the online tutorial if you haven't already
(do it on Windows) and then reading a book. There are several on the
market, including my own for beginners (
http://blog.outerthoughts.com/2013/06/my-book-on-solr-is-now-published/ ).

For SharePoint, I would look at http://manifoldcf.apache.org/en_US/ , they
seem to be covering that use case specifically and sending information to
Solr.

For the more general case, I would look at SolrNet (
https://github.com/mausch/SolrNet/blob/master/Documentation/README.md ). To
use Solr 4 with SolrNet, you would need to get the latest build or build it
yourself from source; it is not terribly complicated.

Tika is a separate Apache project that is bundled with Solr. It is used to
parse binary files (e.g. PDFs, MS Word documents) and extract whatever is
possible, usually structured metadata and some sort of internal text.
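
To make the Tika part concrete: Solr exposes Tika through its ExtractingRequestHandler (Solr Cell), which the example configuration conventionally maps to /update/extract. Below is a hedged Python sketch that assumes that handler is enabled on a core named collection1 at localhost:8983. The file name, document id and helper name are made up for illustration.

```python
import urllib.parse
import urllib.request

# Assumed handler location; /update/extract must be enabled in solrconfig.xml.
EXTRACT_URL = "http://localhost:8983/solr/collection1/update/extract"


def build_extract_request(file_bytes, doc_id, content_type="application/pdf",
                          extract_url=EXTRACT_URL):
    """Build a POST that sends a binary file to Solr for Tika extraction."""
    # literal.id supplies the unique key for the extracted document.
    params = urllib.parse.urlencode({"literal.id": doc_id, "commit": "true"})
    return urllib.request.Request(
        extract_url + "?" + params,
        data=file_bytes,
        headers={"Content-Type": content_type},
        method="POST",
    )


# Usage (requires a running Solr; "report.pdf" is a hypothetical file):
#   with open("report.pdf", "rb") as handle:
#       request = build_extract_request(handle.read(), doc_id="report-1")
#   urllib.request.urlopen(request)
```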

For the interface, there are a couple of options, though most people
roll their own. The main reason is that you should NOT expose Solr
directly to the web (it is not secure), so there is a need for Solr middleware.
Solr middleware is usually custom, with project-specific enhancements, etc.
But you could have a look at Hue for internal/intermediate usage. Hue is
for the Hadoop ecosystem, but it does include Solr support too:
http://gethue.tumblr.com/tagged/search
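
As an illustration of what such middleware typically does, here is a small Python sketch that turns untrusted user input into a tightly controlled Solr select URL, so the browser never talks to Solr directly. The core name, the field list, the row cap and the function name are all assumptions for the example. A real middleware would also restrict or escape the query syntax itself.

```python
import urllib.parse

# Assumed internal address of Solr; it should not be reachable from the web.
SOLR_SELECT_URL = "http://localhost:8983/solr/collection1/select"
MAX_ROWS = 50  # hypothetical cap so users cannot request huge result sets


def build_select_url(user_query, rows=10, select_url=SOLR_SELECT_URL):
    """Build a select URL from user input, fixing every other parameter."""
    rows = max(1, min(int(rows), MAX_ROWS))
    params = {
        "q": user_query.strip() or "*:*",  # empty input falls back to match-all
        "rows": rows,
        "fl": "id,title",  # only return fields chosen by the middleware
        "wt": "json",
    }
    return select_url + "?" + urllib.parse.urlencode(params)
```

Because the caller only ever supplies the query text and a row count, request parameters that the middleware did not choose never reach Solr.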

The most important point to remember when you are learning Solr is
that it is there for _search_. You shape your data to match that purpose.
If that breaks relationships and duplicates data in Solr, that's fine. You
still have your primary data safe in relational/document storage.
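
A small Python sketch of what shaping data for search can look like: a relational-style author/book pair is flattened into one self-contained Solr document per book, with the author name duplicated on each. All table and field names here are invented for illustration.

```python
def to_solr_docs(authors, books):
    """Flatten books plus their author rows into one flat dict per book."""
    authors_by_id = {author["id"]: author for author in authors}
    docs = []
    for book in books:
        author = authors_by_id[book["author_id"]]
        docs.append({
            "id": "book-%d" % book["id"],
            "title": book["title"],
            # The author name is copied into every book document on purpose.
            "author_name": author["name"],
        })
    return docs
```

The relational tables remain the source of truth; the flattened documents can be rebuilt from them at any time.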

Regards,
   Alex.


Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)





Re: Newbie to SOLR with ridiculously simple questions

2013-12-09 Thread smetzger
Thanks for the reply, Alex...
In fact, I am using your book!

The book seems like a good tutorial...

My Bitnami Solr instance, however, already includes Solr (running in the
background) and a directory structure:

root
--opt
----bitnami
------apache-solr
--------solr
----------collection1


I assume that the apache-solr directory is the same as the universal
example directory mentioned in many tutorials. If I follow your book I
create a new directory under apache-solr called SOLR-INDEXING with the
collection1/conf/ and .xml files per your instruction. 

But now I have two instances running, and somehow I need to point Solr from
the solr/collection1 core to the SOLR-INDEXING/collection1 core.

I would think this could be done on the Solr Admin page, but I can't see how.
If I try to restart Jetty with
java -Dsolr.solr.home=SOLR-INDEXING -jar start.jar
it runs and does some install, but I think it does not shut down the prior
one first. In fact, once I run that I lose all my Solr and have to reinstall
the VMware snapshot.
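
One way to check the suspicion that the earlier instance is still running is to test whether something is already listening on the Jetty port before starting a second copy. A minimal Python sketch, assuming the default example port 8983 on the local machine (the Bitnami stack may well use a different port); the function name is hypothetical.

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return probe.connect_ex((host, port)) == 0


# Usage: 8983 is the default port of the Solr example Jetty.
#   if port_in_use(8983):
#       print("A server is already listening on 8983; stop it first.")
```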

Any guidance would be useful so I can continue with your book. 
Thanks
steve








Re: Newbie to SOLR with ridiculously simple questions

2013-12-09 Thread Alexandre Rafalovitch
I think you might be complicating your life by using the BitNami stack while
learning. I would just download the latest Solr to your Windows desktop and go
through the examples there.

Still, you can try moving the collection1 directory under 'solr' and putting
my examples there instead. Then you don't need to change any scripts. Or
rename collection1 to another name and add it to solr.xml as per the
instructions in the book to have it as a second core. Basically, change the
content of the 'solr' directory rather than the scripts that make it work. But
then you still need to know where the libraries are, as I bet the file
path would be different from my book's instructions. Use the 'locate' command
on Unix to find where the jars might be.
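
If 'locate' is not available (on Windows, say), a rough Python equivalent is to walk the install directory looking for jar files. The starting directory and the name fragment in the usage comment are placeholders, not paths taken from the book or confirmed for BitNami.

```python
import os


def find_jars(root_dir, name_fragment=""):
    """Return sorted paths of .jar files under root_dir whose file name
    contains name_fragment (an empty fragment matches every jar)."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for filename in filenames:
            if filename.endswith(".jar") and name_fragment in filename:
                matches.append(os.path.join(dirpath, filename))
    return sorted(matches)


# Usage (hypothetical install root and jar name):
#   for path in find_jars("/opt/bitnami", "solr-cell"):
#       print(path)
```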

Just make sure the BitNami stack's Solr is at least 4.3 (4.3.1?) as per the
book's minimum requirements. Otherwise, more advanced examples will fail in
strange ways.

Regards,
   Alex.



