Re: Newbie question - Error loading an existing config file

2019-02-21 Thread Erick Erickson
I have absolutely no idea when it comes to Drupal; the Drupal folks would be
much better equipped to answer.

Best,
Erick

> On Feb 21, 2019, at 8:16 AM, Greg Robinson  wrote:
> 
> Thanks for the feedback.
> 
> So here is where I'm at.
> 
> I first went ahead and deleted the existing core that was returning the
> error using the following command: bin/solr delete -c new_solr_core
> 
> Now when I access the admin panel, there are no errors.
> 
> I then referred to the large "warning" box on the CREATE action
> documentation:
> 
> "While it’s possible to create a core for a non-existent collection, this
> approach is not supported and not recommended. Always create a collection
> using the *Collections API* before creating a core directly for it."
> 
> I then tried to create a collection using the following "Collections API"
> command:
> 
> http://localhost:8983/solr/admin/collections?action=CREATE&name=newCollection&numShards=2&replicationFactor=1&wt=xml
> 
> This was the response:
> 
> 
> 
> <response>
> <lst name="responseHeader">
> <int name="status">400</int>
> <int name="QTime">2</int>
> </lst>
> <lst name="error">
> <lst name="metadata">
> <str name="error-class">org.apache.solr.common.SolrException</str>
> <str name="root-error-class">org.apache.solr.common.SolrException</str>
> </lst>
> <str name="msg">Solr instance is not running in SolrCloud mode.</str>
> <int name="code">400</int>
> </lst>
> </response>
> 
> 
> 
> I guess my main question is, do I need to be running in "SolrCloud mode" if
> my intention is to use Solr Server to index a Drupal 7 website? We're
> currently using opensolr.com which is working fine but we're trying to
> avoid the monthly costs associated with their "Shared Solr Cloud" plan.
> 
> Thanks!
> 
> 
> 
> 
> 
> On Wed, Feb 20, 2019 at 8:34 PM Shawn Heisey  wrote:
> 
>> On 2/20/2019 11:07 AM, Greg Robinson wrote:
>>> Lets try this: https://imgur.com/a/z5OzbLW
>>> 
>>> What I'm trying to do seems pretty straightforward:
>>> 
>>> 1. Install Solr Server 7.4 on Linux (Completed)
>>> 2. Connect my Drupal 7 site to the Solr Server and use it for indexing
>>> content
>>> 
>>> My understanding is that I must first create a core in order to connect
>> my
>>> drupal site to Solr Server. This is where I'm currently stuck.
>> 
>> The assertion in your screenshot that the dataDir must exist is
>> incorrect.  If current versions of Solr say this also, that is something
>> we will need to change.  This is what actually happens:  If all the
>> other requirements are met and the dataDir does not exist, it will be
>> created automatically when the core starts, if the process has
>> sufficient permissions.
>> 
>> See the large "warning" box on the CREATE action documentation for
>> details on what you need:
>> 
>> 
>> https://lucene.apache.org/solr/guide/7_4/coreadmin-api.html#coreadmin-create
>> 
>> The warning box is the one that has a red triangle to the left of it.
>> The red triangle contains an exclamation point.
>> 
>> The essence of what it says there is that the core's instance directory
>> must exist, that directory must contain a "conf" directory, and all
>> required config files must be in the conf directory.
>> 
>> If you're running in SolrCloud mode, then you're using the wrong API.
>> 
>> Thanks,
>> Shawn
>> 
> 
> 
> -- 
> Greg Robinson
> CEO - Mobile*Enhanced*
> www.mobileenhanced.com
> g...@mobileenhanced.com
> 303-598-1865



Re: Newbie question - Error loading an existing config file

2019-02-21 Thread Greg Robinson
Thanks for the feedback.

So here is where I'm at.

I first went ahead and deleted the existing core that was returning the
error using the following command: bin/solr delete -c new_solr_core

Now when I access the admin panel, there are no errors.

I then referred to the large "warning" box on the CREATE action
documentation:

"While it’s possible to create a core for a non-existent collection, this
approach is not supported and not recommended. Always create a collection
using the *Collections API* before creating a core directly for it."

I then tried to create a collection using the following "Collections API"
command:

http://localhost:8983/solr/admin/collections?action=CREATE&name=newCollection&numShards=2&replicationFactor=1&wt=xml

This was the response:



<response>
<lst name="responseHeader">
<int name="status">400</int>
<int name="QTime">2</int>
</lst>
<lst name="error">
<lst name="metadata">
<str name="error-class">org.apache.solr.common.SolrException</str>
<str name="root-error-class">org.apache.solr.common.SolrException</str>
</lst>
<str name="msg">Solr instance is not running in SolrCloud mode.</str>
<int name="code">400</int>
</lst>
</response>



I guess my main question is, do I need to be running in "SolrCloud mode" if
my intention is to use Solr Server to index a Drupal 7 website? We're
currently using opensolr.com which is working fine but we're trying to
avoid the monthly costs associated with their "Shared Solr Cloud" plan.

Thanks!





On Wed, Feb 20, 2019 at 8:34 PM Shawn Heisey  wrote:

> On 2/20/2019 11:07 AM, Greg Robinson wrote:
> > Lets try this: https://imgur.com/a/z5OzbLW
> >
> > What I'm trying to do seems pretty straightforward:
> >
> > 1. Install Solr Server 7.4 on Linux (Completed)
> > 2. Connect my Drupal 7 site to the Solr Server and use it for indexing
> > content
> >
> > My understanding is that I must first create a core in order to connect
> my
> > drupal site to Solr Server. This is where I'm currently stuck.
>
> The assertion in your screenshot that the dataDir must exist is
> incorrect.  If current versions of Solr say this also, that is something
> we will need to change.  This is what actually happens:  If all the
> other requirements are met and the dataDir does not exist, it will be
> created automatically when the core starts, if the process has
> sufficient permissions.
>
> See the large "warning" box on the CREATE action documentation for
> details on what you need:
>
>
> https://lucene.apache.org/solr/guide/7_4/coreadmin-api.html#coreadmin-create
>
> The warning box is the one that has a red triangle to the left of it.
> The red triangle contains an exclamation point.
>
> The essence of what it says there is that the core's instance directory
> must exist, that directory must contain a "conf" directory, and all
> required config files must be in the conf directory.
>
> If you're running in SolrCloud mode, then you're using the wrong API.
>
> Thanks,
> Shawn
>


-- 
Greg Robinson
CEO - Mobile*Enhanced*
www.mobileenhanced.com
g...@mobileenhanced.com
303-598-1865


Re: Newbie question - Error loading an existing config file

2019-02-20 Thread Shawn Heisey

On 2/20/2019 11:07 AM, Greg Robinson wrote:

Lets try this: https://imgur.com/a/z5OzbLW

What I'm trying to do seems pretty straightforward:

1. Install Solr Server 7.4 on Linux (Completed)
2. Connect my Drupal 7 site to the Solr Server and use it for indexing
content

My understanding is that I must first create a core in order to connect my
drupal site to Solr Server. This is where I'm currently stuck.


The assertion in your screenshot that the dataDir must exist is 
incorrect.  If current versions of Solr say this also, that is something 
we will need to change.  This is what actually happens:  If all the 
other requirements are met and the dataDir does not exist, it will be 
created automatically when the core starts, if the process has 
sufficient permissions.


See the large "warning" box on the CREATE action documentation for 
details on what you need:


https://lucene.apache.org/solr/guide/7_4/coreadmin-api.html#coreadmin-create

The warning box is the one that has a red triangle to the left of it. 
The red triangle contains an exclamation point.


The essence of what it says there is that the core's instance directory 
must exist, that directory must contain a "conf" directory, and all 
required config files must be in the conf directory.
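
In standalone (non-cloud) mode, the easiest way to satisfy those
requirements is to let the bin/solr script set the core up for you. A
minimal sketch, assuming a hypothetical core name "drupal_core" and the
_default configset that ships with Solr 7.x:

  # run as the solr user from the Solr install directory
  bin/solr create -c drupal_core -d _default
  # this copies the configset into the core's instance directory under
  # SOLR_HOME (server/solr by default) and then calls CoreAdmin CREATE

After that you can adjust the schema and solrconfig in the new conf
directory to whatever the Drupal module expects.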


If you're running in SolrCloud mode, then you're using the wrong API.

Thanks,
Shawn


Re: Newbie question - Error loading an existing config file

2019-02-20 Thread Greg Robinson
Gotcha.

Lets try this: https://imgur.com/a/z5OzbLW

What I'm trying to do seems pretty straightforward:

1. Install Solr Server 7.4 on Linux (Completed)
2. Connect my Drupal 7 site to the Solr Server and use it for indexing
content

My understanding is that I must first create a core in order to connect my
drupal site to Solr Server. This is where I'm currently stuck.

Thanks for your help!

On Wed, Feb 20, 2019 at 10:43 AM Erick Erickson 
wrote:

> Attachments generally are stripped by the mail server.
>
> Are you trying to create a core as part of a SolrCloud _collection_? If
> so, this
> is an anti-pattern, use the collection API commands. Shot in the dark.
>
> Best,
> Erick
>
> > On Feb 19, 2019, at 3:05 PM, Greg Robinson 
> wrote:
> >
> > I used the front end admin (see attached)
> >
> > thanks
> >
> > On Tue, Feb 19, 2019 at 3:54 PM Erick Erickson 
> wrote:
> > Hmmm, that’s not very helpful…..
> >
> > Don’t quite know what to say. There should be something more helpful
> > in the logs.
> >
> > Hmmm, How did you create the core?
> >
> > Best,
> > Erick
> >
> >
> > > On Feb 19, 2019, at 1:29 PM, Greg Robinson 
> wrote:
> > >
> > > Thanks for your direction regarding the log.
> > >
> > > I was able to locate it and these two lines stood out:
> > >
> > > Caused by: org.apache.solr.common.SolrException: Could not load conf
> for
> > > core new_solr_core: Error loading solr config from
> > > /home/solr/server/solr/new_solr_core/conf/solrconfig.xml
> > >
> > > Caused by: org.apache.solr.common.SolrException: Error loading solr
> config
> > > from /home/solr/server/solr/new_solr_core/conf/solrconfig.xml
> > >
> > > which seems to point to the same issue.
> > >
> > > I also went ahead and updated permissions/owner to "solr" on all
> > > directories and files within "/home/solr/server/solr/new_solr_core".
> > >
> > > Still no luck. This is currently the same message that I'm getting on
> the
> > > admin front end:
> > >
> > > new_solr_core:
> > >
> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> > > Could not load conf for core new_solr_core: Error loading solr config
> from
> > > /home/solr/server/solr/new_solr_core/conf/solrconfig.xml.
> > >
> > > thanks!
> > >
> > >
> > >
> > > On Tue, Feb 19, 2019 at 1:55 PM Erick Erickson <
> erickerick...@gmail.com>
> > > wrote:
> > >
> > >> do a recursive seach for “solr.log" under SOLR_HOME…….
> > >>
> > >> Best,
> > >> ERick
> > >>
> > >>> On Feb 19, 2019, at 8:08 AM, Greg Robinson 
> > >> wrote:
> > >>>
> > >>> Hi Erick,
> > >>>
> > >>> Thanks for the quick response.
> > >>>
> > >>> Here is what is currently contained within  the conf dir:
> > >>>
> > >>> drwxr-xr-x 2 root root  4096 Feb 18 17:51 lang
> > >>> -rw-r--r-- 1 root root 54513 Feb 18 17:51 managed-schema
> > >>> -rw-r--r-- 1 root root   329 Feb 18 17:51 params.json
> > >>> -rw-r--r-- 1 root root   894 Feb 18 17:51 protwords.txt
> > >>> -rwxrwxrwx 1 root root 55323 Feb 18 17:51 solrconfig.xml
> > >>> -rw-r--r-- 1 root root   795 Feb 18 17:51 stopwords.txt
> > >>> -rw-r--r-- 1 root root  1153 Feb 18 17:51 synonyms.txt
> > >>>
> > >>> As far as the log, where exactly might I find the specific log that
> would
> > >>> give more info in regards to this error?
> > >>>
> > >>> thanks again!
> > >>>
> > >>> On Tue, Feb 19, 2019 at 9:06 AM Erick Erickson <
> erickerick...@gmail.com>
> > >>> wrote:
> > >>>
> >  Are all the other files there in your conf dir? Solrconfig.xml
> > >> references
> >  things like nanaged-schema etc.
> > 
> >  Also, your log file might contain more clues...
> > 
> >  On Tue, Feb 19, 2019, 08:03 Greg Robinson  > >> wrote:
> > 
> > > Hello,
> > >
> > > We have Solr 7.4 up and running on a Linux machine.
> > >
> > > I'm just trying to add a new core so that I can eventually point a
> > >> Drupal
> > > site to the Solr Server for indexing.
> > >
> > > When attempting to add a core, I'm getting the following error:
> > >
> > > new_solr_core:
> > >
> > 
> > >>
> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> > > Could not load conf for core new_solr_core: Error loading solr
> config
> >  from
> > > /home/solr/server/solr/new_solr_core/conf/solrconfig.xml
> > >
> > > I've confirmed that
> > > /home/solr/server/solr/new_solr_core/conf/solrconfig.xml exists
> but I'm
> > > still getting the error.
> > >
> > > Any direction is appreciated.
> > >
> > > Thanks!
> > >
> > 
> > >>>
> > >>>
> > >>> --
> > >>> Greg Robinson
> > >>> CEO - Mobile*Enhanced*
> > >>> www.mobileenhanced.com
> > >>> g...@mobileenhanced.com
> > >>> 303-598-1865
> > >>
> > >>
> > >
> > > --
> > > Greg Robinson
> > > CEO - Mobile*Enhanced*
> > > www.mobileenhanced.com
> > > g...@mobileenhanced.com
> > > 303-598-1865
> >
> >
> >
> > --
> > Greg Robinson
> > CEO - MobileEnhanced
> > www.mobileenhanced.com
> > g...@mobileenhanced.com
> > 

Re: Newbie question - Error loading an existing config file

2019-02-20 Thread Erick Erickson
Attachments generally are stripped by the mail server.

Are you trying to create a core as part of a SolrCloud _collection_? If so, this
is an anti-pattern, use the collection API commands. Shot in the dark.

Best,
Erick

> On Feb 19, 2019, at 3:05 PM, Greg Robinson  wrote:
> 
> I used the front end admin (see attached)
> 
> thanks
> 
> On Tue, Feb 19, 2019 at 3:54 PM Erick Erickson  
> wrote:
> Hmmm, that’s not very helpful…..
> 
> Don’t quite know what to say. There should be something more helpful
> in the logs.
> 
> Hmmm, How did you create the core?
> 
> Best,
> Erick
> 
> 
> > On Feb 19, 2019, at 1:29 PM, Greg Robinson  wrote:
> > 
> > Thanks for your direction regarding the log.
> > 
> > I was able to locate it and these two lines stood out:
> > 
> > Caused by: org.apache.solr.common.SolrException: Could not load conf for
> > core new_solr_core: Error loading solr config from
> > /home/solr/server/solr/new_solr_core/conf/solrconfig.xml
> > 
> > Caused by: org.apache.solr.common.SolrException: Error loading solr config
> > from /home/solr/server/solr/new_solr_core/conf/solrconfig.xml
> > 
> > which seems to point to the same issue.
> > 
> > I also went ahead and updated permissions/owner to "solr" on all
> > directories and files within "/home/solr/server/solr/new_solr_core".
> > 
> > Still no luck. This is currently the same message that I'm getting on the
> > admin front end:
> > 
> > new_solr_core:
> > org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> > Could not load conf for core new_solr_core: Error loading solr config from
> > /home/solr/server/solr/new_solr_core/conf/solrconfig.xml.
> > 
> > thanks!
> > 
> > 
> > 
> > On Tue, Feb 19, 2019 at 1:55 PM Erick Erickson 
> > wrote:
> > 
> >> do a recursive seach for “solr.log" under SOLR_HOME…….
> >> 
> >> Best,
> >> ERick
> >> 
> >>> On Feb 19, 2019, at 8:08 AM, Greg Robinson 
> >> wrote:
> >>> 
> >>> Hi Erick,
> >>> 
> >>> Thanks for the quick response.
> >>> 
> >>> Here is what is currently contained within  the conf dir:
> >>> 
> >>> drwxr-xr-x 2 root root  4096 Feb 18 17:51 lang
> >>> -rw-r--r-- 1 root root 54513 Feb 18 17:51 managed-schema
> >>> -rw-r--r-- 1 root root   329 Feb 18 17:51 params.json
> >>> -rw-r--r-- 1 root root   894 Feb 18 17:51 protwords.txt
> >>> -rwxrwxrwx 1 root root 55323 Feb 18 17:51 solrconfig.xml
> >>> -rw-r--r-- 1 root root   795 Feb 18 17:51 stopwords.txt
> >>> -rw-r--r-- 1 root root  1153 Feb 18 17:51 synonyms.txt
> >>> 
> >>> As far as the log, where exactly might I find the specific log that would
> >>> give more info in regards to this error?
> >>> 
> >>> thanks again!
> >>> 
> >>> On Tue, Feb 19, 2019 at 9:06 AM Erick Erickson 
> >>> wrote:
> >>> 
>  Are all the other files there in your conf dir? Solrconfig.xml
> >> references
>  things like nanaged-schema etc.
>  
>  Also, your log file might contain more clues...
>  
>  On Tue, Feb 19, 2019, 08:03 Greg Robinson  >> wrote:
>  
> > Hello,
> > 
> > We have Solr 7.4 up and running on a Linux machine.
> > 
> > I'm just trying to add a new core so that I can eventually point a
> >> Drupal
> > site to the Solr Server for indexing.
> > 
> > When attempting to add a core, I'm getting the following error:
> > 
> > new_solr_core:
> > 
>  
> >> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> > Could not load conf for core new_solr_core: Error loading solr config
>  from
> > /home/solr/server/solr/new_solr_core/conf/solrconfig.xml
> > 
> > I've confirmed that
> > /home/solr/server/solr/new_solr_core/conf/solrconfig.xml exists but I'm
> > still getting the error.
> > 
> > Any direction is appreciated.
> > 
> > Thanks!
> > 
>  
> >>> 
> >>> 
> >>> --
> >>> Greg Robinson
> >>> CEO - Mobile*Enhanced*
> >>> www.mobileenhanced.com
> >>> g...@mobileenhanced.com
> >>> 303-598-1865
> >> 
> >> 
> > 
> > -- 
> > Greg Robinson
> > CEO - Mobile*Enhanced*
> > www.mobileenhanced.com
> > g...@mobileenhanced.com
> > 303-598-1865
> 
> 
> 
> -- 
> Greg Robinson
> CEO - MobileEnhanced
> www.mobileenhanced.com
> g...@mobileenhanced.com
> 303-598-1865



Re: Newbie question - Error loading an existing config file

2019-02-19 Thread Greg Robinson
I used the front end admin (see attached)

thanks

On Tue, Feb 19, 2019 at 3:54 PM Erick Erickson 
wrote:

> Hmmm, that’s not very helpful…..
>
> Don’t quite know what to say. There should be something more helpful
> in the logs.
>
> Hmmm, How did you create the core?
>
> Best,
> Erick
>
>
> > On Feb 19, 2019, at 1:29 PM, Greg Robinson 
> wrote:
> >
> > Thanks for your direction regarding the log.
> >
> > I was able to locate it and these two lines stood out:
> >
> > Caused by: org.apache.solr.common.SolrException: Could not load conf for
> > core new_solr_core: Error loading solr config from
> > /home/solr/server/solr/new_solr_core/conf/solrconfig.xml
> >
> > Caused by: org.apache.solr.common.SolrException: Error loading solr
> config
> > from /home/solr/server/solr/new_solr_core/conf/solrconfig.xml
> >
> > which seems to point to the same issue.
> >
> > I also went ahead and updated permissions/owner to "solr" on all
> > directories and files within "/home/solr/server/solr/new_solr_core".
> >
> > Still no luck. This is currently the same message that I'm getting on the
> > admin front end:
> >
> > new_solr_core:
> >
> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> > Could not load conf for core new_solr_core: Error loading solr config
> from
> > /home/solr/server/solr/new_solr_core/conf/solrconfig.xml.
> >
> > thanks!
> >
> >
> >
> > On Tue, Feb 19, 2019 at 1:55 PM Erick Erickson 
> > wrote:
> >
> >> do a recursive seach for “solr.log" under SOLR_HOME…….
> >>
> >> Best,
> >> ERick
> >>
> >>> On Feb 19, 2019, at 8:08 AM, Greg Robinson 
> >> wrote:
> >>>
> >>> Hi Erick,
> >>>
> >>> Thanks for the quick response.
> >>>
> >>> Here is what is currently contained within  the conf dir:
> >>>
> >>> drwxr-xr-x 2 root root  4096 Feb 18 17:51 lang
> >>> -rw-r--r-- 1 root root 54513 Feb 18 17:51 managed-schema
> >>> -rw-r--r-- 1 root root   329 Feb 18 17:51 params.json
> >>> -rw-r--r-- 1 root root   894 Feb 18 17:51 protwords.txt
> >>> -rwxrwxrwx 1 root root 55323 Feb 18 17:51 solrconfig.xml
> >>> -rw-r--r-- 1 root root   795 Feb 18 17:51 stopwords.txt
> >>> -rw-r--r-- 1 root root  1153 Feb 18 17:51 synonyms.txt
> >>>
> >>> As far as the log, where exactly might I find the specific log that
> would
> >>> give more info in regards to this error?
> >>>
> >>> thanks again!
> >>>
> >>> On Tue, Feb 19, 2019 at 9:06 AM Erick Erickson <
> erickerick...@gmail.com>
> >>> wrote:
> >>>
>  Are all the other files there in your conf dir? Solrconfig.xml
> >> references
>  things like nanaged-schema etc.
> 
>  Also, your log file might contain more clues...
> 
>  On Tue, Feb 19, 2019, 08:03 Greg Robinson  >> wrote:
> 
> > Hello,
> >
> > We have Solr 7.4 up and running on a Linux machine.
> >
> > I'm just trying to add a new core so that I can eventually point a
> >> Drupal
> > site to the Solr Server for indexing.
> >
> > When attempting to add a core, I'm getting the following error:
> >
> > new_solr_core:
> >
> 
> >>
> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> > Could not load conf for core new_solr_core: Error loading solr config
>  from
> > /home/solr/server/solr/new_solr_core/conf/solrconfig.xml
> >
> > I've confirmed that
> > /home/solr/server/solr/new_solr_core/conf/solrconfig.xml exists but
> I'm
> > still getting the error.
> >
> > Any direction is appreciated.
> >
> > Thanks!
> >
> 
> >>>
> >>>
> >>> --
> >>> Greg Robinson
> >>> CEO - Mobile*Enhanced*
> >>> www.mobileenhanced.com
> >>> g...@mobileenhanced.com
> >>> 303-598-1865
> >>
> >>
> >
> > --
> > Greg Robinson
> > CEO - Mobile*Enhanced*
> > www.mobileenhanced.com
> > g...@mobileenhanced.com
> > 303-598-1865
>
>

-- 
Greg Robinson
CEO - Mobile*Enhanced*
www.mobileenhanced.com
g...@mobileenhanced.com
303-598-1865


Re: Newbie question - Error loading an existing config file

2019-02-19 Thread Erick Erickson
Hmmm, that’s not very helpful…..

Don’t quite know what to say. There should be something more helpful
in the logs.

Hmmm, How did you create the core?

Best,
Erick


> On Feb 19, 2019, at 1:29 PM, Greg Robinson  wrote:
> 
> Thanks for your direction regarding the log.
> 
> I was able to locate it and these two lines stood out:
> 
> Caused by: org.apache.solr.common.SolrException: Could not load conf for
> core new_solr_core: Error loading solr config from
> /home/solr/server/solr/new_solr_core/conf/solrconfig.xml
> 
> Caused by: org.apache.solr.common.SolrException: Error loading solr config
> from /home/solr/server/solr/new_solr_core/conf/solrconfig.xml
> 
> which seems to point to the same issue.
> 
> I also went ahead and updated permissions/owner to "solr" on all
> directories and files within "/home/solr/server/solr/new_solr_core".
> 
> Still no luck. This is currently the same message that I'm getting on the
> admin front end:
> 
> new_solr_core:
> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> Could not load conf for core new_solr_core: Error loading solr config from
> /home/solr/server/solr/new_solr_core/conf/solrconfig.xml.
> 
> thanks!
> 
> 
> 
> On Tue, Feb 19, 2019 at 1:55 PM Erick Erickson 
> wrote:
> 
>> do a recursive seach for “solr.log" under SOLR_HOME…….
>> 
>> Best,
>> ERick
>> 
>>> On Feb 19, 2019, at 8:08 AM, Greg Robinson 
>> wrote:
>>> 
>>> Hi Erick,
>>> 
>>> Thanks for the quick response.
>>> 
>>> Here is what is currently contained within  the conf dir:
>>> 
>>> drwxr-xr-x 2 root root  4096 Feb 18 17:51 lang
>>> -rw-r--r-- 1 root root 54513 Feb 18 17:51 managed-schema
>>> -rw-r--r-- 1 root root   329 Feb 18 17:51 params.json
>>> -rw-r--r-- 1 root root   894 Feb 18 17:51 protwords.txt
>>> -rwxrwxrwx 1 root root 55323 Feb 18 17:51 solrconfig.xml
>>> -rw-r--r-- 1 root root   795 Feb 18 17:51 stopwords.txt
>>> -rw-r--r-- 1 root root  1153 Feb 18 17:51 synonyms.txt
>>> 
>>> As far as the log, where exactly might I find the specific log that would
>>> give more info in regards to this error?
>>> 
>>> thanks again!
>>> 
>>> On Tue, Feb 19, 2019 at 9:06 AM Erick Erickson 
>>> wrote:
>>> 
 Are all the other files there in your conf dir? Solrconfig.xml
>> references
 things like nanaged-schema etc.
 
 Also, your log file might contain more clues...
 
 On Tue, Feb 19, 2019, 08:03 Greg Robinson > wrote:
 
> Hello,
> 
> We have Solr 7.4 up and running on a Linux machine.
> 
> I'm just trying to add a new core so that I can eventually point a
>> Drupal
> site to the Solr Server for indexing.
> 
> When attempting to add a core, I'm getting the following error:
> 
> new_solr_core:
> 
 
>> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> Could not load conf for core new_solr_core: Error loading solr config
 from
> /home/solr/server/solr/new_solr_core/conf/solrconfig.xml
> 
> I've confirmed that
> /home/solr/server/solr/new_solr_core/conf/solrconfig.xml exists but I'm
> still getting the error.
> 
> Any direction is appreciated.
> 
> Thanks!
> 
 
>>> 
>>> 
>>> --
>>> Greg Robinson
>>> CEO - Mobile*Enhanced*
>>> www.mobileenhanced.com
>>> g...@mobileenhanced.com
>>> 303-598-1865
>> 
>> 
> 
> -- 
> Greg Robinson
> CEO - Mobile*Enhanced*
> www.mobileenhanced.com
> g...@mobileenhanced.com
> 303-598-1865



Re: Newbie question - Error loading an existing config file

2019-02-19 Thread Greg Robinson
Thanks for your direction regarding the log.

I was able to locate it and these two lines stood out:

Caused by: org.apache.solr.common.SolrException: Could not load conf for
core new_solr_core: Error loading solr config from
/home/solr/server/solr/new_solr_core/conf/solrconfig.xml

Caused by: org.apache.solr.common.SolrException: Error loading solr config
from /home/solr/server/solr/new_solr_core/conf/solrconfig.xml

which seems to point to the same issue.

I also went ahead and updated permissions/owner to "solr" on all
directories and files within "/home/solr/server/solr/new_solr_core".

Still no luck. This is currently the same message that I'm getting on the
admin front end:

new_solr_core:
org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
Could not load conf for core new_solr_core: Error loading solr config from
/home/solr/server/solr/new_solr_core/conf/solrconfig.xml.

thanks!



On Tue, Feb 19, 2019 at 1:55 PM Erick Erickson 
wrote:

> do a recursive seach for “solr.log" under SOLR_HOME…….
>
> Best,
> ERick
>
> > On Feb 19, 2019, at 8:08 AM, Greg Robinson 
> wrote:
> >
> > Hi Erick,
> >
> > Thanks for the quick response.
> >
> > Here is what is currently contained within  the conf dir:
> >
> > drwxr-xr-x 2 root root  4096 Feb 18 17:51 lang
> > -rw-r--r-- 1 root root 54513 Feb 18 17:51 managed-schema
> > -rw-r--r-- 1 root root   329 Feb 18 17:51 params.json
> > -rw-r--r-- 1 root root   894 Feb 18 17:51 protwords.txt
> > -rwxrwxrwx 1 root root 55323 Feb 18 17:51 solrconfig.xml
> > -rw-r--r-- 1 root root   795 Feb 18 17:51 stopwords.txt
> > -rw-r--r-- 1 root root  1153 Feb 18 17:51 synonyms.txt
> >
> > As far as the log, where exactly might I find the specific log that would
> > give more info in regards to this error?
> >
> > thanks again!
> >
> > On Tue, Feb 19, 2019 at 9:06 AM Erick Erickson 
> > wrote:
> >
> >> Are all the other files there in your conf dir? Solrconfig.xml
> references
> >> things like nanaged-schema etc.
> >>
> >> Also, your log file might contain more clues...
> >>
> >> On Tue, Feb 19, 2019, 08:03 Greg Robinson  wrote:
> >>
> >>> Hello,
> >>>
> >>> We have Solr 7.4 up and running on a Linux machine.
> >>>
> >>> I'm just trying to add a new core so that I can eventually point a
> Drupal
> >>> site to the Solr Server for indexing.
> >>>
> >>> When attempting to add a core, I'm getting the following error:
> >>>
> >>> new_solr_core:
> >>>
> >>
> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> >>> Could not load conf for core new_solr_core: Error loading solr config
> >> from
> >>> /home/solr/server/solr/new_solr_core/conf/solrconfig.xml
> >>>
> >>> I've confirmed that
> >>> /home/solr/server/solr/new_solr_core/conf/solrconfig.xml exists but I'm
> >>> still getting the error.
> >>>
> >>> Any direction is appreciated.
> >>>
> >>> Thanks!
> >>>
> >>
> >
> >
> > --
> > Greg Robinson
> > CEO - Mobile*Enhanced*
> > www.mobileenhanced.com
> > g...@mobileenhanced.com
> > 303-598-1865
>
>

-- 
Greg Robinson
CEO - Mobile*Enhanced*
www.mobileenhanced.com
g...@mobileenhanced.com
303-598-1865


Re: Newbie question - Error loading an existing config file

2019-02-19 Thread Erick Erickson
do a recursive search for "solr.log" under SOLR_HOME…….
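
For example (the path is an assumption based on the install described
earlier in the thread; a default service install keeps its logs under
/var/solr/logs instead):

  find /home/solr -name solr.log 2>/dev/null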

Best,
ERick

> On Feb 19, 2019, at 8:08 AM, Greg Robinson  wrote:
> 
> Hi Erick,
> 
> Thanks for the quick response.
> 
> Here is what is currently contained within  the conf dir:
> 
> drwxr-xr-x 2 root root  4096 Feb 18 17:51 lang
> -rw-r--r-- 1 root root 54513 Feb 18 17:51 managed-schema
> -rw-r--r-- 1 root root   329 Feb 18 17:51 params.json
> -rw-r--r-- 1 root root   894 Feb 18 17:51 protwords.txt
> -rwxrwxrwx 1 root root 55323 Feb 18 17:51 solrconfig.xml
> -rw-r--r-- 1 root root   795 Feb 18 17:51 stopwords.txt
> -rw-r--r-- 1 root root  1153 Feb 18 17:51 synonyms.txt
> 
> As far as the log, where exactly might I find the specific log that would
> give more info in regards to this error?
> 
> thanks again!
> 
> On Tue, Feb 19, 2019 at 9:06 AM Erick Erickson 
> wrote:
> 
>> Are all the other files there in your conf dir? Solrconfig.xml references
>> things like nanaged-schema etc.
>> 
>> Also, your log file might contain more clues...
>> 
>> On Tue, Feb 19, 2019, 08:03 Greg Robinson > 
>>> Hello,
>>> 
>>> We have Solr 7.4 up and running on a Linux machine.
>>> 
>>> I'm just trying to add a new core so that I can eventually point a Drupal
>>> site to the Solr Server for indexing.
>>> 
>>> When attempting to add a core, I'm getting the following error:
>>> 
>>> new_solr_core:
>>> 
>> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
>>> Could not load conf for core new_solr_core: Error loading solr config
>> from
>>> /home/solr/server/solr/new_solr_core/conf/solrconfig.xml
>>> 
>>> I've confirmed that
>>> /home/solr/server/solr/new_solr_core/conf/solrconfig.xml exists but I'm
>>> still getting the error.
>>> 
>>> Any direction is appreciated.
>>> 
>>> Thanks!
>>> 
>> 
> 
> 
> -- 
> Greg Robinson
> CEO - Mobile*Enhanced*
> www.mobileenhanced.com
> g...@mobileenhanced.com
> 303-598-1865



Re: Newbie question - Error loading an existing config file

2019-02-19 Thread Greg Robinson
Hi Erick,

Thanks for the quick response.

Here is what is currently contained within  the conf dir:

drwxr-xr-x 2 root root  4096 Feb 18 17:51 lang
-rw-r--r-- 1 root root 54513 Feb 18 17:51 managed-schema
-rw-r--r-- 1 root root   329 Feb 18 17:51 params.json
-rw-r--r-- 1 root root   894 Feb 18 17:51 protwords.txt
-rwxrwxrwx 1 root root 55323 Feb 18 17:51 solrconfig.xml
-rw-r--r-- 1 root root   795 Feb 18 17:51 stopwords.txt
-rw-r--r-- 1 root root  1153 Feb 18 17:51 synonyms.txt

As far as the log, where exactly might I find the specific log that would
give more info in regards to this error?

thanks again!

On Tue, Feb 19, 2019 at 9:06 AM Erick Erickson 
wrote:

> Are all the other files there in your conf dir? Solrconfig.xml references
> things like nanaged-schema etc.
>
> Also, your log file might contain more clues...
>
> On Tue, Feb 19, 2019, 08:03 Greg Robinson 
> > Hello,
> >
> > We have Solr 7.4 up and running on a Linux machine.
> >
> > I'm just trying to add a new core so that I can eventually point a Drupal
> > site to the Solr Server for indexing.
> >
> > When attempting to add a core, I'm getting the following error:
> >
> > new_solr_core:
> >
> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> > Could not load conf for core new_solr_core: Error loading solr config
> from
> > /home/solr/server/solr/new_solr_core/conf/solrconfig.xml
> >
> > I've confirmed that
> > /home/solr/server/solr/new_solr_core/conf/solrconfig.xml exists but I'm
> > still getting the error.
> >
> > Any direction is appreciated.
> >
> > Thanks!
> >
>


-- 
Greg Robinson
CEO - Mobile*Enhanced*
www.mobileenhanced.com
g...@mobileenhanced.com
303-598-1865


Re: Newbie question - Error loading an existing config file

2019-02-19 Thread Erick Erickson
Are all the other files there in your conf dir? Solrconfig.xml references
things like managed-schema etc.

Also, your log file might contain more clues...

On Tue, Feb 19, 2019, 08:03 Greg Robinson  wrote:

> Hello,
>
> We have Solr 7.4 up and running on a Linux machine.
>
> I'm just trying to add a new core so that I can eventually point a Drupal
> site to the Solr Server for indexing.
>
> When attempting to add a core, I'm getting the following error:
>
> new_solr_core:
> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> Could not load conf for core new_solr_core: Error loading solr config from
> /home/solr/server/solr/new_solr_core/conf/solrconfig.xml
>
> I've confirmed that
> /home/solr/server/solr/new_solr_core/conf/solrconfig.xml exists but I'm
> still getting the error.
>
> Any direction is appreciated.
>
> Thanks!
>


7.4.0 Newbie Question about bin/post and parsing/extracting html document parts into solr dynamic field.

2018-07-17 Thread Bell, Bob
Hi, 

New to solr, so forgive any missing info on my part.   

1. I am trying to figure out how to get an html document html element
parsed into a solr dynamic field. Is it possible? So let's say I have some
specific html tag or xml tags within the html
document that I created a dynamic field for; how do I specify that kind of
parsing/extraction? I can successfully
index the documents, and search finds the documents based on the indexing, but
I would like to extract a specific set
of data from the page, which could be an embedded xml tag, structured data, hidden fields,
or a specific html tag. I am just not
sure how to accomplish this. Thank you for any ideas!

Thanks,
Bob
Austin Public Library


Re: Newbie Question

2018-01-09 Thread Deepak Goel
*Hello*

*The code which worked for me:*

SolrClient client = new HttpSolrClient.Builder("
http://localhost:8983/solr/shakespeare").build();

SolrQuery query = new SolrQuery();
query.setRequestHandler("/select");
query.setQuery("text_entry:henry");
query.setFields("text_entry");

QueryResponse queryResponse = null;
try
{
queryResponse = client.query(query);
}
catch (Exception e)
{

}

System.out.println("Query Response: " +queryResponse.toString());

if (queryResponse!=null && queryResponse.getResponse().size()>0)
{
SolrDocumentList results = queryResponse.getResults();
for (int i = 0; i < results.size(); ++i) {
SolrDocument document = results.get(i);
System.out.println("The result is: " +results.get(i));
System.out.println("The Document field names are: "
+document.getFieldNames());
}
}

*The data:*

{"index":{"_index":"shakespeare","_id":0}}
{"type":"act","line_id":1,"play_name":"Henry IV",
"speech_number":"","line_number":"","speaker":"","text_entry":"ACT I"}
{"index":{"_index":"shakespeare","_id":1}}
{"type":"scene","line_id":2,"play_name":"Henry
IV","speech_number":"","line_number":"","speaker":"","text_entry":"SCENE I.
London. The palace."}

*Deepak*






Deepak
"Please stop cruelty to Animals, help by becoming a Vegan"
+91 73500 12833
deic...@gmail.com

Facebook: https://www.facebook.com/deicool
LinkedIn: www.linkedin.com/in/deicool

"Plant a Tree, Go Green"

On Tue, Jan 9, 2018 at 8:09 PM, Shawn Heisey  wrote:

> On 1/8/2018 10:23 AM, Deepak Goel wrote:
> > *I am trying to search for documents in my collection (Shakespeare). The
> > code is as follows:*
> >
> > SolrClient client = new HttpSolrClient.Builder("
> > http://localhost:8983/solr/shakespeare").build();
> >
> > SolrDocument doc = client.getById("2");
> > *However this does not return any document. What mistake am I making?*
>
> The getById method accesses the handler named "/get", normally defined
> with the RealTimeGetHandler class.  In recent Solr versions, the /get
> handler is defined implicitly and does not have to be configured, but in
> older versions (not sure which ones) you do need to have it in
> solrconfig.xml.
>
> I didn't expect your code to work because getById method returns a
> SolrDocumentList and you have SolrDocument, but apparently this actually
> does work.  I have tried code very similar to yours against the
> techproducts example in version 7.1, and it works perfectly.  I will
> share the exact code I tried and what results I got below.
>
> What code have you tried after the code you've shared?  How are you
> determining that no document is returned?  Are there any error messages
> logged by the client code or Solr?  If there are, can you share them?
>
> Do you have a document in the shakespeare index that has the value "2"
> in whatever field is the uniqueKey?  Does the schema have a uniqueKey
> defined?
>
> Can you find the entry in solr.log that logs the query and share that
> entire log entry?
>
> Code:
>
> public static void main(String[] args) throws SolrServerException,
> IOException
> {
>   String baseUrl = "http://localhost:8983/solr/techproducts";
>   SolrClient client = new HttpSolrClient.Builder(baseUrl).build();
>   SolrDocument doc = client.getById("SP2514N");
>   System.out.println(doc.getFieldValue("name"));
> }
>
> Console log from that code:
>
> SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
> SLF4J: Defaulting to no-operation (NOP) logger implementation
> SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for
> further details.
> Samsung SpinPoint P120 SP2514N - hard drive - 250 GB - ATA-133
>
>
> Including the collection/core name in the URL is an older way of writing
> SolrJ code.  It works well, but multiple collections can be accessed
> through one client object if you change it and your SolrJ version is new
> enough.
>
> Thanks,
> Shawn
>
>


Re: Newbie Question

2018-01-09 Thread Shawn Heisey
On 1/8/2018 10:23 AM, Deepak Goel wrote:
> *I am trying to search for documents in my collection (Shakespeare). The
> code is as follows:*
>
> SolrClient client = new HttpSolrClient.Builder("
> http://localhost:8983/solr/shakespeare").build();
>
> SolrDocument doc = client.getById("2");
> *However this does not return any document. What mistake am I making?*

The getById method accesses the handler named "/get", normally defined
with the RealTimeGetHandler class.  In recent Solr versions, the /get
handler is defined implicitly and does not have to be configured, but in
older versions (not sure which ones) you do need to have it in
solrconfig.xml.
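
For one of those older versions, an explicit definition in solrconfig.xml
would look roughly like the stock example config (shown here as a sketch,
not something your shakespeare core necessarily needs):

  <requestHandler name="/get" class="solr.RealTimeGetHandler">
    <lst name="defaults">
      <str name="omitHeader">true</str>
    </lst>
  </requestHandler>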

I didn't expect your code to work because getById method returns a
SolrDocumentList and you have SolrDocument, but apparently this actually
does work.  I have tried code very similar to yours against the
techproducts example in version 7.1, and it works perfectly.  I will
share the exact code I tried and what results I got below.

What code have you tried after the code you've shared?  How are you
determining that no document is returned?  Are there any error messages
logged by the client code or Solr?  If there are, can you share them?

Do you have a document in the shakespeare index that has the value "2"
in whatever field is the uniqueKey?  Does the schema have a uniqueKey
defined?

Can you find the entry in solr.log that logs the query and share that
entire log entry?

Code:

public static void main(String[] args) throws SolrServerException,
IOException
{
  String baseUrl = "http://localhost:8983/solr/techproducts";
  SolrClient client = new HttpSolrClient.Builder(baseUrl).build();
  SolrDocument doc = client.getById("SP2514N");
  System.out.println(doc.getFieldValue("name"));
}

Console log from that code:

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for
further details.
Samsung SpinPoint P120 SP2514N - hard drive - 250 GB - ATA-133


Including the collection/core name in the URL is an older way of writing
SolrJ code.  It works well, but multiple collections can be accessed
through one client object if you change it and your SolrJ version is new
enough.

Thanks,
Shawn



Re: Newbie Question

2018-01-08 Thread Deepak Goel
Got it . Thank You for your help



Deepak
"Please stop cruelty to Animals, help by becoming a Vegan"
+91 73500 12833
deic...@gmail.com

Facebook: https://www.facebook.com/deicool
LinkedIn: www.linkedin.com/in/deicool

"Plant a Tree, Go Green"

On Mon, Jan 8, 2018 at 11:48 PM, Deepak Goel  wrote:

> *Is this right?*
>
> SolrClient client = new HttpSolrClient.Builder("http:/
> /localhost:8983/solr/shakespeare/select").build();
>
> SolrQuery query = new SolrQuery();
> query.setQuery("henry");
> query.setFields("text_entry");
> query.setStart(0);
>
> queryResponse = client.query(query);
>
> *This is still returning NULL*
>
>
>
> 
>
>
>
> Deepak
> "Please stop cruelty to Animals, help by becoming a Vegan"
> +91 73500 12833
> deic...@gmail.com
>
> Facebook: https://www.facebook.com/deicool
> LinkedIn: www.linkedin.com/in/deicool
>
> "Plant a Tree, Go Green"
>
> On Mon, Jan 8, 2018 at 10:55 PM, Alexandre Rafalovitch  > wrote:
>
>> I think you are missing /query handler endpoint in the URL. Plus actual
>> search parameters.
>>
>> You may try using the admin UI to build your queries first.
>>
>> Regards,
>> Alex
>>
>> On Jan 8, 2018 12:23 PM, "Deepak Goel"  wrote:
>>
>> > Hello
>> >
>> > *I am trying to search for documents in my collection (Shakespeare). The
>> > code is as follows:*
>> >
>> > SolrClient client = new HttpSolrClient.Builder("
> >> > http://localhost:8983/solr/shakespeare").build();
>> >
>> > SolrDocument doc = client.getById("2");
>> > *However this does not return any document. What mistake am I making?*
>> >
>> > Thank You
>> > Deepak
>> >
>> > Deepak
>> > "Please stop cruelty to Animals, help by becoming a Vegan"
>> > +91 73500 12833
>> > deic...@gmail.com
>> >
>> > Facebook: https://www.facebook.com/deicool
>> > LinkedIn: www.linkedin.com/in/deicool
>> >
>> > "Plant a Tree, Go Green"
>> >
>> >
>>
>
>


Re: Newbie Question

2018-01-08 Thread Deepak Goel
*Is this right?*

SolrClient client = new HttpSolrClient.Builder("
http://localhost:8983/solr/shakespeare/select").build();

SolrQuery query = new SolrQuery();
query.setQuery("henry");
query.setFields("text_entry");
query.setStart(0);

queryResponse = client.query(query);

*This is still returning NULL*






Deepak
"Please stop cruelty to Animals, help by becoming a Vegan"
+91 73500 12833
deic...@gmail.com

Facebook: https://www.facebook.com/deicool
LinkedIn: www.linkedin.com/in/deicool

"Plant a Tree, Go Green"

On Mon, Jan 8, 2018 at 10:55 PM, Alexandre Rafalovitch 
wrote:

> I think you are missing /query handler endpoint in the URL. Plus actual
> search parameters.
>
> You may try using the admin UI to build your queries first.
>
> Regards,
> Alex
>
> On Jan 8, 2018 12:23 PM, "Deepak Goel"  wrote:
>
> > Hello
> >
> > *I am trying to search for documents in my collection (Shakespeare). The
> > code is as follows:*
> >
> > SolrClient client = new HttpSolrClient.Builder("
> > > http://localhost:8983/solr/shakespeare").build();
> >
> > SolrDocument doc = client.getById("2");
> > *However this does not return any document. What mistake am I making?*
> >
> > Thank You
> > Deepak
> >
> > Deepak
> > "Please stop cruelty to Animals, help by becoming a Vegan"
> > +91 73500 12833
> > deic...@gmail.com
> >
> > Facebook: https://www.facebook.com/deicool
> > LinkedIn: www.linkedin.com/in/deicool
> >
> > "Plant a Tree, Go Green"
> >
> >
>


Re: Newbie Question

2018-01-08 Thread Alexandre Rafalovitch
I think you are missing /query handler endpoint in the URL. Plus actual
search parameters.

You may try using the admin UI to build your queries first.
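
A minimal SolrJ sketch of that idea (the field and query values are just
assumptions based on the shakespeare example being discussed):

  import org.apache.solr.client.solrj.SolrClient;
  import org.apache.solr.client.solrj.SolrQuery;
  import org.apache.solr.client.solrj.impl.HttpSolrClient;
  import org.apache.solr.client.solrj.response.QueryResponse;

  public class QueryExample {
    public static void main(String[] args) throws Exception {
      // base URL points at the core; the handler is set on the query itself
      SolrClient client = new HttpSolrClient.Builder(
          "http://localhost:8983/solr/shakespeare").build();
      SolrQuery query = new SolrQuery();
      query.setRequestHandler("/select");    // search handler, not part of the URL
      query.setQuery("text_entry:henry");    // an actual query parameter
      QueryResponse rsp = client.query(query);
      System.out.println(rsp.getResults().getNumFound() + " docs found");
      client.close();
    }
  }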

Regards,
Alex

On Jan 8, 2018 12:23 PM, "Deepak Goel"  wrote:

> Hello
>
> *I am trying to search for documents in my collection (Shakespeare). The
> code is as follows:*
>
> SolrClient client = new HttpSolrClient.Builder("
> > http://localhost:8983/solr/shakespeare").build();
>
> SolrDocument doc = client.getById("2");
> *However this does not return any document. What mistake am I making?*
>
> Thank You
> Deepak
>
> Deepak
> "Please stop cruelty to Animals, help by becoming a Vegan"
> +91 73500 12833
> deic...@gmail.com
>
> Facebook: https://www.facebook.com/deicool
> LinkedIn: www.linkedin.com/in/deicool
>
> "Plant a Tree, Go Green"
>
>


Re: Newbie question about why represent timestamps as "float" values

2017-10-10 Thread Erick Erickson
Hold it. "date", "tdate", "pdate" _are_ primitive types. Under the
covers date/tdate are just a tlong type, newer Solrs have a "pdate"
which is a point numeric type. All that these types do is some parsing
up front so you can send human-readable data (and get it back). But
under the covers it's still a primitive.

And the idea of making it a float is _certainly_ worse than a long.
Last time I checked, floats were more expensive to work with than
longs. If this  was done for "efficiency" it wasn't done correctly.

It's vaguely possible that if this was done for efficiency, it was
done long ago when dates could be strings. Certainly there's a
performance argument there, but that hasn't been the case for a very
long time.
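
For reference, a plain date field in a 7.x schema would look something
like this (the field name just mirrors the "unixdate" field being
discussed; it is not from the original schema):

  <fieldType name="pdate" class="solr.DatePointField" docValues="true"/>
  <field name="unixdate" type="pdate" indexed="true" stored="true"/>

Values are then sent in the standard YYYY-MM-DDTHH:MM:SSZ form; if the
field really is just a technical epoch-millis sort key, solr.LongPointField
is the other sensible choice.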


Erick

On Tue, Oct 10, 2017 at 2:24 AM, Michael Kuhlmann  wrote:
> While you're generally right, in this case it might make sense to stick
> to a primitive type.
>
> I see "unixtime" as a technical information, probably from
> System.currentTimeMillis(). As long as it's not used as a "real world"
> date but only for sorting based on latest updates, or chosing which
> document is more recent, it's totally okay to index it as a long value.
>
> But definitely not as a float.
>
> -Michael
>
> Am 10.10.2017 um 10:55 schrieb alessandro.benedetti:
>> There was time ago a Solr installation which had the same problem, and the
>> author explained me that the choice was made for performance reasons.
>> Apparently he was sure that handling everything as primitive types would
>> give a boost to the Solr searching/faceting performance.
>> I never agreed ( and one of the reasons is that you need to transform back
>> from float to dates to actually render them in a readable format).
>>
>> Furthermore I tend to rely on standing on the shoulders of giants, so if a
>> community ( not just a single developer) spent time implementing a date type
>> ( with the different available implementations) to manage specifically date
>> information, I tend to thrust them and believe that the best approach to
>> manage dates is to use that ad hoc date type ( in its variants, depending on
>> the use cases).
>>
>> As a plus, using the right data type gives you immense power in debugging
>> and understanding better your data.
>> For proper maintenance , it is another good reason to stick with standards.
>>
>>
>>
>> -
>> ---
>> Alessandro Benedetti
>> Search Consultant, R Software Engineer, Director
>> Sease Ltd. - www.sease.io
>> --
>> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>>
>


Re: Newbie question about why represent timestamps as "float" values

2017-10-10 Thread Michael Kuhlmann
While you're generally right, in this case it might make sense to stick
to a primitive type.

I see "unixtime" as a technical information, probably from
System.currentTimeMillis(). As long as it's not used as a "real world"
date but only for sorting based on latest updates, or chosing which
document is more recent, it's totally okay to index it as a long value.

But definitely not as a float.

-Michael

Am 10.10.2017 um 10:55 schrieb alessandro.benedetti:
> There was time ago a Solr installation which had the same problem, and the
> author explained me that the choice was made for performance reasons.
> Apparently he was sure that handling everything as primitive types would
> give a boost to the Solr searching/faceting performance.
> I never agreed ( and one of the reasons is that you need to transform back
> from float to dates to actually render them in a readable format).
> 
> Furthermore I tend to rely on standing on the shoulders of giants, so if a
> community ( not just a single developer) spent time implementing a date type
> ( with the different available implementations) to manage specifically date
> information, I tend to thrust them and believe that the best approach to
> manage dates is to use that ad hoc date type ( in its variants, depending on
> the use cases).
> 
> As a plus, using the right data type gives you immense power in debugging
> and understanding better your data.
> For proper maintenance , it is another good reason to stick with standards.
> 
> 
> 
> -
> ---
> Alessandro Benedetti
> Search Consultant, R Software Engineer, Director
> Sease Ltd. - www.sease.io
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
> 



Re: Newbie question about why represent timestamps as "float" values

2017-10-10 Thread alessandro.benedetti
There was, some time ago, a Solr installation which had the same problem, and the
author explained me that the choice was made for performance reasons.
Apparently he was sure that handling everything as primitive types would
give a boost to the Solr searching/faceting performance.
I never agreed ( and one of the reasons is that you need to transform back
from float to dates to actually render them in a readable format).

Furthermore I tend to rely on standing on the shoulders of giants, so if a
community ( not just a single developer) spent time implementing a date type
( with the different available implementations) to manage specifically date
information, I tend to trust them and believe that the best approach to
manage dates is to use that ad hoc date type ( in its variants, depending on
the use cases).

As a plus, using the right data type gives you immense power in debugging
and understanding better your data.
For proper maintenance , it is another good reason to stick with standards.



-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Newbie question about why represent timestamps as "float" values

2017-10-09 Thread Erick Erickson
What Hoss said, and in addition somewhere some
custom code has to be translating things back and
forth. For dates, Solr wants YYYY-MM-DDTHH:MM:SSZ
as a date string it knows how to deal with. That simply
couldn't parse as a float type so there's some custom
code that transforms dates into a float at ingest
time and converts from float to something recognizable
as a date on output.



On Mon, Oct 9, 2017 at 2:06 PM, Chris Hostetter
 wrote:
>
> : Here is my question.  In schema.xml, there is this field:
> :
> : 
> :
> : Question:  why is this declared as a float datatype?  I'm just looking
> : for an explanation of what is there – any changes come later, after I
> : understand things better.
>
> You would hvae to ask the creator of that schema.xml file why they made
> that choice ... to the best of my knowledge, no sample/example schema that
> has ever shipped with any version of solr has ever included a "unixdate"
> field -- let alone one that suggested "float" would be a logically correct
> data type for storing that type of information.
>
>
> -Hoss
> http://www.lucidworks.com/


Re: Newbie question about why represent timestamps as "float" values

2017-10-09 Thread Chris Hostetter

: Here is my question.  In schema.xml, there is this field:
: 
: 
: 
: Question:  why is this declared as a float datatype?  I'm just looking 
: for an explanation of what is there – any changes come later, after I 
: understand things better.

You would have to ask the creator of that schema.xml file why they made
that choice ... to the best of my knowledge, no sample/example schema that 
has ever shipped with any version of solr has ever included a "unixdate" 
field -- let alone one that suggested "float" would be a logically correct 
data type for storing that type of information.


-Hoss
http://www.lucidworks.com/

Re: newbie question re solr.PatternReplaceFilterFactory

2017-05-10 Thread Erick Erickson
First use PatternReplaceCharFilterFactory. The difference is that
PatternReplaceCharFilterFactory works on the entire input whereas
PatternReplaceFilterFactory works only on the tokens emitted by the
tokenizer. A concrete example using WhitespaceTokenizerFactory would be
this [is some ] text
PatternReplaceFilterFactory would see 5 tokens, "this", "[is", "some",
"]", and "text". So it would be very hard to do what you want.

patternReplaceCharFilterFactory will see the entire input as one
string and operate on it, _then_ send it through the tokenizer.

And also don't be fooled by the fact that the _stored_ data will still
contain the removed words. So when you get the doc back from solr
you'll see the original input, brackets and all. In the above example,
if you returned the field you'd still see

this [is some ] text

when the doc matched. This doc would be found when searching for
"this" or "text", but _not_ when searching for "is" or "some".

You want some pattern like
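(the exact charFilter definition was stripped by the mail archiver; here is
a sketch of one possibility, where the regex is an assumption rather than
the original):

  <charFilter class="solr.PatternReplaceCharFilterFactory"
              pattern="\[[^\]]*\]"
              replacement=""/>

placed in the field type's index-time analyzer chain, ahead of the tokenizer.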
  

Best,
Erick

On Wed, May 10, 2017 at 6:08 PM, Michael Tobias  wrote:
> I am sure this is very simple but I cannot get the pattern right.
>
> How can I use solr.PatternReplaceFilterFactory to remove all words in 
> brackets from being indexed?
>
> eg [ignore this]
>
> thanks
>
> Michael
>


Re: newbie question

2016-09-07 Thread John Bickerstaff
the /solr is a "chroot" -- if used, everything for solr goes into
zookeeper's /solr "directory"

It isn't required, but is very useful for keeping things separated.  I use
it to handle different Solr versions for upgrading (/solr5_4 or /solr6_2)

If not used, everything you put into Zookeeper (some microservices, Kafka,
etc) will all end up at the "root" and it'll be a cluster...

On Wed, Sep 7, 2016 at 1:26 PM, Erick Erickson <erickerick...@gmail.com>
wrote:

> Well, first off the ZK ensemble string is usually specified as
> dayrhegapd016.enterprisenet.org:2181,host2:2181,host3:2181/solr
> (note that the /solr is only at the end, not every node).
>
> Second, I always get confused whether the /solr is necessary or not.
>
> Again, though, the Cloudera user's list is probably a place to get better
> answers.
>
> Best,
> Erick
>
> On Wed, Sep 7, 2016 at 12:15 PM, Darshan Pandya <darshanpan...@gmail.com>
> wrote:
> > Thank Erik,
> > So seems like the problem is that when I upload the configs to zookeeper
> > and then inspect zookeeper-client and ls /solr/configs it is showing to
> be
> > empty.
> >
> > I executed the following command to upload the config
> >
> > solrctl --zk
> > dayrhegapd016.enterprisenet.org:2181/solr,host2:2181/solr,
> host3:2181/solr
> > --solr host4:8983/solr/ instancedir --create config1 $HOME/config1/
> >
> >
> > Zookeeper-client ls /solr/configs/
> >
> > does not show this configuration present there.
> >
> >
> >
> > On Wed, Sep 7, 2016 at 2:02 PM, Erick Erickson <erickerick...@gmail.com>
> > wrote:
> >
> >> I'm a bit rusty on solrctl (and you might get faster/more up-to-date
> >> responses on the Cloudera lists). But to create a collection, you
> >> first need to have uploaded the configs to Zookeeper, things like
> >> schema.xml, solrconfig.xml etc. I forget
> >> what the solrctl command is, but something like "upconfig" IIRC.
> >>
> >> Once that's done, you either
> >> 1> specify the collection name exactly the same as the config name
> >> you uploaded
> >> or
> >> 2> use one of the other parameters to tell collectionX to use configsetY
> >> with the collection create command. solrctl help should show you all
> these
> >> options...
> >>
> >> Best,
> >> Erick
> >>
> >> On Wed, Sep 7, 2016 at 11:43 AM, Darshan Pandya <
> darshanpan...@gmail.com>
> >> wrote:
> >> > Gonzalo,
> >> > Thanks for responding,
> >> > executed the parameters you suggested, it still shows me the same
> error.
> >> > Sincerely,
> >> > darshan
> >> >
> >> > On Wed, Sep 7, 2016 at 1:13 PM, Gonzalo Rodriguez <
> >> > grodrig...@searchtechnologies.com> wrote:
> >> >
> >> >> Hi Darshan,
> >> >>
> >> >> It looks like you are listing the instanceDir's name twice in the
> create
> >> >> collection command, it should be
> >> >>
> >> >> $ solrctl --zk  host:2181/solr --solr host:8983/solr/ collection
> >> --create
> >> >> Catalog_search_index -s 10 -c Catalog_search_index
> >> >>
> >> >> Without the extra ". Catalog_search_index" at the end. Also, because
> >> your
> >> >> new collection's name is the same as the instanceDir's, you could
> just
> >> omit
> >> >> that parameter and it should work ok.
> >> >>
> >> >> Try that and see if it works.
> >> >>
> >> >> Good luck,
> >> >> Gonzalo
> >> >>
> >> >> -Original Message-
> >> >> From: Darshan Pandya [mailto:darshanpan...@gmail.com]
> >> >> Sent: Wednesday, September 7, 2016 12:02 PM
> >> >> To: solr-user@lucene.apache.org
> >> >> Subject: newbie question
> >> >>
> >> >> hello,
> >> >>
> >> >> I am using solr cloud with cloudera. When I try to create a
> collection,
> >> it
> >> >> fails with the following error.
> >> >> Any hints / answers will be helpful.
> >> >>
> >> >>
> >> >> $ solrctl --zk  host:2181/solr instancedir --list
> >> >>
> >> >> Catalog_search_index
> >> >>
> >> >> $ solrctl --zk  shot:2181/solr --solr host:8983/solr/ collection
> >> --create
> >> >> Catalog_search_index -s 10 -c Catalog_search_index.Catalog_
> search_index
> >> >>
> >> >> Error: A call to SolrCloud WEB APIs failed: HTTP/1.1 200 OK
> >> >>
> >> >> Server: Apache-Coyote/1.1
> >> >>
> >> >> Content-Type: application/xml;charset=UTF-8
> >> >>
> >> >> Transfer-Encoding: chunked
> >> >>
> >> >> Date: Wed, 07 Sep 2016 17:58:13 GMT
> >> >>
> >> >>
> >> >> 
> >> >>
> >> >>
> >> >> 
> >> >>
> >> >>
> >> >> 
> >> >>
> >> >> 
> >> >>
> >> >> 0
> >> >>
> >> >> 
> >> >>
> >> >> 1165
> >> >>
> >> >> 
> >> >>
> >> >> 
> >> >>
> >> >> 
> >> >>
> >> >> org.apache.solr.client.solrj.impl.HttpSolrServer$
> >> RemoteSolrException:Error
> >> >> CREATEing SolrCore 'Catalog_search_index_shard1_replica1': Unable to
> >> >> create core [Catalog_search_index_shard1_replica1] Caused by:
> Specified
> >> >> config does not exist in ZooKeeper:Catalog_search_
> >> >> index.dataCatalog_search_index
> >> >>
> >> >>
> >> >> --
> >> >> Sincerely,
> >> >> Darshan
> >> >>
> >> >
> >> >
> >> >
> >> > --
> >> > Sincerely,
> >> > Darshan
> >>
> >
> >
> >
> > --
> > Sincerely,
> > Darshan
>


Re: newbie question

2016-09-07 Thread Darshan Pandya
Many thanks! I will move this to a Cloudera list.

On Wed, Sep 7, 2016 at 2:26 PM, Erick Erickson <erickerick...@gmail.com>
wrote:

> Well, first off the ZK ensemble string is usually specified as
> dayrhegapd016.enterprisenet.org:2181,host2:2181,host3:2181/solr
> (note that the /solr is only at the end, not every node).
>
> Second, I always get confused whether the /solr is necessary or not.
>
> Again, though, the Cloudera user's list is probably a place to get better
> answers.
>
> Best,
> Erick
>
> On Wed, Sep 7, 2016 at 12:15 PM, Darshan Pandya <darshanpan...@gmail.com>
> wrote:
> > Thank Erik,
> > So seems like the problem is that when I upload the configs to zookeeper
> > and then inspect zookeeper-client and ls /solr/configs it is showing to
> be
> > empty.
> >
> > I executed the following command to upload the config
> >
> > solrctl --zk
> > dayrhegapd016.enterprisenet.org:2181/solr,host2:2181/solr,
> host3:2181/solr
> > --solr host4:8983/solr/ instancedir --create config1 $HOME/config1/
> >
> >
> > Zookeeper-client ls /solr/configs/
> >
> > does not show this configuration present there.
> >
> >
> >
> > On Wed, Sep 7, 2016 at 2:02 PM, Erick Erickson <erickerick...@gmail.com>
> > wrote:
> >
> >> I'm a bit rusty on solrctl (and you might get faster/more up-to-date
> >> responses on the Cloudera lists). But to create a collection, you
> >> first need to have uploaded the configs to Zookeeper, things like
> >> schema.xml, solrconfig.xml etc. I forget
> >> what the solrctl command is, but something like "upconfig" IIRC.
> >>
> >> Once that's done, you either
> >> 1> specify the collection name exactly the same as the config name
> >> you uploaded
> >> or
> >> 2> use one of the other parameters to tell collectionX to use configsetY
> >> with the collection create command. solrctl help should show you all
> these
> >> options...
> >>
> >> Best,
> >> Erick
> >>
> >> On Wed, Sep 7, 2016 at 11:43 AM, Darshan Pandya <
> darshanpan...@gmail.com>
> >> wrote:
> >> > Gonzalo,
> >> > Thanks for responding,
> >> > executed the parameters you suggested, it still shows me the same
> error.
> >> > Sincerely,
> >> > darshan
> >> >
> >> > On Wed, Sep 7, 2016 at 1:13 PM, Gonzalo Rodriguez <
> >> > grodrig...@searchtechnologies.com> wrote:
> >> >
> >> >> Hi Darshan,
> >> >>
> >> >> It looks like you are listing the instanceDir's name twice in the
> create
> >> >> collection command, it should be
> >> >>
> >> >> $ solrctl --zk  host:2181/solr --solr host:8983/solr/ collection
> >> --create
> >> >> Catalog_search_index -s 10 -c Catalog_search_index
> >> >>
> >> >> Without the extra ". Catalog_search_index" at the end. Also, because
> >> your
> >> >> new collection's name is the same as the instanceDir's, you could
> just
> >> omit
> >> >> that parameter and it should work ok.
> >> >>
> >> >> Try that and see if it works.
> >> >>
> >> >> Good luck,
> >> >> Gonzalo
> >> >>
> >> >> -Original Message-
> >> >> From: Darshan Pandya [mailto:darshanpan...@gmail.com]
> >> >> Sent: Wednesday, September 7, 2016 12:02 PM
> >> >> To: solr-user@lucene.apache.org
> >> >> Subject: newbie question
> >> >>
> >> >> hello,
> >> >>
> >> >> I am using solr cloud with cloudera. When I try to create a
> collection,
> >> it
> >> >> fails with the following error.
> >> >> Any hints / answers will be helpful.
> >> >>
> >> >>
> >> >> $ solrctl --zk  host:2181/solr instancedir --list
> >> >>
> >> >> Catalog_search_index
> >> >>
> >> >> $ solrctl --zk  shot:2181/solr --solr host:8983/solr/ collection
> >> --create
> >> >> Catalog_search_index -s 10 -c Catalog_search_index.Catalog_
> search_index
> >> >>
> >> >> Error: A call to SolrCloud WEB APIs failed: HTTP/1.1 200 OK
> >> >>
> >> >> Server: Apache-Coyote/1.1
> >> >>
> >> >> Content-Type: application/xml;charset=UTF-8
> >> >>
> >> >> Transfer-Encoding: chunked
> >> >>
> >> >> Date: Wed, 07 Sep 2016 17:58:13 GMT
> >> >>
> >> >>
> >> >> 
> >> >>
> >> >>
> >> >> 
> >> >>
> >> >>
> >> >> 
> >> >>
> >> >> 
> >> >>
> >> >> 0
> >> >>
> >> >> 
> >> >>
> >> >> 1165
> >> >>
> >> >> 
> >> >>
> >> >> 
> >> >>
> >> >> 
> >> >>
> >> >> org.apache.solr.client.solrj.impl.HttpSolrServer$
> >> RemoteSolrException:Error
> >> >> CREATEing SolrCore 'Catalog_search_index_shard1_replica1': Unable to
> >> >> create core [Catalog_search_index_shard1_replica1] Caused by:
> Specified
> >> >> config does not exist in ZooKeeper:Catalog_search_
> >> >> index.dataCatalog_search_index
> >> >>
> >> >>
> >> >> --
> >> >> Sincerely,
> >> >> Darshan
> >> >>
> >> >
> >> >
> >> >
> >> > --
> >> > Sincerely,
> >> > Darshan
> >>
> >
> >
> >
> > --
> > Sincerely,
> > Darshan
>



-- 
Sincerely,
Darshan


Re: newbie question

2016-09-07 Thread Erick Erickson
Well, first off the ZK ensemble string is usually specified as
dayrhegapd016.enterprisenet.org:2181,host2:2181,host3:2181/solr
(note that the /solr is only at the end, not every node).
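So, as an untested sketch (and see the next point about whether the chroot is needed at
all), the upload command from earlier in this thread would look something like:

  solrctl --zk dayrhegapd016.enterprisenet.org:2181,host2:2181,host3:2181/solr \
          --solr host4:8983/solr/ instancedir --create config1 $HOME/config1/

with /solr appearing once, after the last host:port pair.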

Second, I always get confused whether the /solr is necessary or not.

Again, though, the Cloudera user's list is probably a place to get better
answers.

Best,
Erick

On Wed, Sep 7, 2016 at 12:15 PM, Darshan Pandya <darshanpan...@gmail.com> wrote:
> Thank Erik,
> So seems like the problem is that when I upload the configs to zookeeper
> and then inspect zookeeper-client and ls /solr/configs it is showing to be
> empty.
>
> I executed the following command to upload the config
>
> solrctl --zk
> dayrhegapd016.enterprisenet.org:2181/solr,host2:2181/solr,host3:2181/solr
> --solr host4:8983/solr/ instancedir --create config1 $HOME/config1/
>
>
> Zookeeper-client ls /solr/configs/
>
> does not show this configuration present there.
>
>
>
> On Wed, Sep 7, 2016 at 2:02 PM, Erick Erickson <erickerick...@gmail.com>
> wrote:
>
>> I'm a bit rusty on solrctl (and you might get faster/more up-to-date
>> responses on the Cloudera lists). But to create a collection, you
>> first need to have uploaded the configs to Zookeeper, things like
>> schema.xml, solrconfig.xml etc. I forget
>> what the solrctl command is, but something like "upconfig" IIRC.
>>
>> Once that's done, you either
>> 1> specify the collection name exactly the same as the config name
>> you uploaded
>> or
>> 2> use one of the other parameters to tell collectionX to use configsetY
>> with the collection create command. solrctl help should show you all these
>> options...
>>
>> Best,
>> Erick
>>
>> On Wed, Sep 7, 2016 at 11:43 AM, Darshan Pandya <darshanpan...@gmail.com>
>> wrote:
>> > Gonzalo,
>> > Thanks for responding,
>> > executed the parameters you suggested, it still shows me the same error.
>> > Sincerely,
>> > darshan
>> >
>> > On Wed, Sep 7, 2016 at 1:13 PM, Gonzalo Rodriguez <
>> > grodrig...@searchtechnologies.com> wrote:
>> >
>> >> Hi Darshan,
>> >>
>> >> It looks like you are listing the instanceDir's name twice in the create
>> >> collection command, it should be
>> >>
>> >> $ solrctl --zk  host:2181/solr --solr host:8983/solr/ collection
>> --create
>> >> Catalog_search_index -s 10 -c Catalog_search_index
>> >>
>> >> Without the extra ". Catalog_search_index" at the end. Also, because
>> your
>> >> new collection's name is the same as the instanceDir's, you could just
>> omit
>> >> that parameter and it should work ok.
>> >>
>> >> Try that and see if it works.
>> >>
>> >> Good luck,
>> >> Gonzalo
>> >>
>> >> -Original Message-
>> >> From: Darshan Pandya [mailto:darshanpan...@gmail.com]
>> >> Sent: Wednesday, September 7, 2016 12:02 PM
>> >> To: solr-user@lucene.apache.org
>> >> Subject: newbie question
>> >>
>> >> hello,
>> >>
>> >> I am using solr cloud with cloudera. When I try to create a collection,
>> it
>> >> fails with the following error.
>> >> Any hints / answers will be helpful.
>> >>
>> >>
>> >> $ solrctl --zk  host:2181/solr instancedir --list
>> >>
>> >> Catalog_search_index
>> >>
>> >> $ solrctl --zk  shot:2181/solr --solr host:8983/solr/ collection
>> --create
>> >> Catalog_search_index -s 10 -c Catalog_search_index.Catalog_search_index
>> >>
>> >> Error: A call to SolrCloud WEB APIs failed: HTTP/1.1 200 OK
>> >>
>> >> Server: Apache-Coyote/1.1
>> >>
>> >> Content-Type: application/xml;charset=UTF-8
>> >>
>> >> Transfer-Encoding: chunked
>> >>
>> >> Date: Wed, 07 Sep 2016 17:58:13 GMT
>> >>
>> >>
>> >> 
>> >>
>> >>
>> >> 
>> >>
>> >>
>> >> 
>> >>
>> >> 
>> >>
>> >> 0
>> >>
>> >> 
>> >>
>> >> 1165
>> >>
>> >> 
>> >>
>> >> 
>> >>
>> >> 
>> >>
>> >> org.apache.solr.client.solrj.impl.HttpSolrServer$
>> RemoteSolrException:Error
>> >> CREATEing SolrCore 'Catalog_search_index_shard1_replica1': Unable to
>> >> create core [Catalog_search_index_shard1_replica1] Caused by: Specified
>> >> config does not exist in ZooKeeper:Catalog_search_
>> >> index.dataCatalog_search_index
>> >>
>> >>
>> >> --
>> >> Sincerely,
>> >> Darshan
>> >>
>> >
>> >
>> >
>> > --
>> > Sincerely,
>> > Darshan
>>
>
>
>
> --
> Sincerely,
> Darshan


Re: newbie question

2016-09-07 Thread Darshan Pandya
Thanks Erick,
So it seems the problem is that when I upload the configs to ZooKeeper and
then inspect them with zookeeper-client (ls /solr/configs), the path shows
up empty.

I executed the following command to upload the config

solrctl --zk
dayrhegapd016.enterprisenet.org:2181/solr,host2:2181/solr,host3:2181/solr
--solr host4:8983/solr/ instancedir --create config1 $HOME/config1/


Zookeeper-client ls /solr/configs/

does not show this configuration present there.



On Wed, Sep 7, 2016 at 2:02 PM, Erick Erickson <erickerick...@gmail.com>
wrote:

> I'm a bit rusty on solrctl (and you might get faster/more up-to-date
> responses on the Cloudera lists). But to create a collection, you
> first need to have uploaded the configs to Zookeeper, things like
> schema.xml, solrconfig.xml etc. I forget
> what the solrctl command is, but something like "upconfig" IIRC.
>
> Once that's done, you either
> 1> specify the collection name exactly the same as the config name
> you uploaded
> or
> 2> use one of the other parameters to tell collectionX to use configsetY
> with the collection create command. solrctl help should show you all these
> options...
>
> Best,
> Erick
>
> On Wed, Sep 7, 2016 at 11:43 AM, Darshan Pandya <darshanpan...@gmail.com>
> wrote:
> > Gonzalo,
> > Thanks for responding,
> > executed the parameters you suggested, it still shows me the same error.
> > Sincerely,
> > darshan
> >
> > On Wed, Sep 7, 2016 at 1:13 PM, Gonzalo Rodriguez <
> > grodrig...@searchtechnologies.com> wrote:
> >
> >> Hi Darshan,
> >>
> >> It looks like you are listing the instanceDir's name twice in the create
> >> collection command, it should be
> >>
> >> $ solrctl --zk  host:2181/solr --solr host:8983/solr/ collection
> --create
> >> Catalog_search_index -s 10 -c Catalog_search_index
> >>
> >> Without the extra ". Catalog_search_index" at the end. Also, because
> your
> >> new collection's name is the same as the instanceDir's, you could just
> omit
> >> that parameter and it should work ok.
> >>
> >> Try that and see if it works.
> >>
> >> Good luck,
> >> Gonzalo
> >>
> >> -Original Message-
> >> From: Darshan Pandya [mailto:darshanpan...@gmail.com]
> >> Sent: Wednesday, September 7, 2016 12:02 PM
> >> To: solr-user@lucene.apache.org
> >> Subject: newbie question
> >>
> >> hello,
> >>
> >> I am using solr cloud with cloudera. When I try to create a collection,
> it
> >> fails with the following error.
> >> Any hints / answers will be helpful.
> >>
> >>
> >> $ solrctl --zk  host:2181/solr instancedir --list
> >>
> >> Catalog_search_index
> >>
> >> $ solrctl --zk  shot:2181/solr --solr host:8983/solr/ collection
> --create
> >> Catalog_search_index -s 10 -c Catalog_search_index.Catalog_search_index
> >>
> >> Error: A call to SolrCloud WEB APIs failed: HTTP/1.1 200 OK
> >>
> >> Server: Apache-Coyote/1.1
> >>
> >> Content-Type: application/xml;charset=UTF-8
> >>
> >> Transfer-Encoding: chunked
> >>
> >> Date: Wed, 07 Sep 2016 17:58:13 GMT
> >>
> >>
> >> 
> >>
> >>
> >> 
> >>
> >>
> >> 
> >>
> >> 
> >>
> >> 0
> >>
> >> 
> >>
> >> 1165
> >>
> >> 
> >>
> >> 
> >>
> >> 
> >>
> >> org.apache.solr.client.solrj.impl.HttpSolrServer$
> RemoteSolrException:Error
> >> CREATEing SolrCore 'Catalog_search_index_shard1_replica1': Unable to
> >> create core [Catalog_search_index_shard1_replica1] Caused by: Specified
> >> config does not exist in ZooKeeper:Catalog_search_
> >> index.dataCatalog_search_index
> >>
> >>
> >> --
> >> Sincerely,
> >> Darshan
> >>
> >
> >
> >
> > --
> > Sincerely,
> > Darshan
>



-- 
Sincerely,
Darshan


Re: newbie question

2016-09-07 Thread Erick Erickson
I'm a bit rusty on solrctl (and you might get faster/more up-to-date
responses on the Cloudera lists). But to create a collection, you
first need to have uploaded the configs to Zookeeper, things like
schema.xml, solrconfig.xml etc. I forget
what the solrctl command is, but something like "upconfig" IIRC.

Once that's done, you either
1> specify the collection name exactly the same as the config name
you uploaded
or
2> use one of the other parameters to tell collectionX to use configsetY
with the collection create command. solrctl help should show you all these
options...
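Roughly, and from memory so double-check against "solrctl help" (the config and
collection names below are just the ones from this thread):

  solrctl --zk <zk ensemble>/solr instancedir --create config1 $HOME/config1/          # pushes the conf files up to ZK
  solrctl --zk <zk ensemble>/solr collection --create Catalog_search_index -s 10 -c config1   # option 2> above

If the config name matches the collection name you should be able to drop the -c,
per option 1>.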

Best,
Erick

On Wed, Sep 7, 2016 at 11:43 AM, Darshan Pandya <darshanpan...@gmail.com> wrote:
> Gonzalo,
> Thanks for responding,
> executed the parameters you suggested, it still shows me the same error.
> Sincerely,
> darshan
>
> On Wed, Sep 7, 2016 at 1:13 PM, Gonzalo Rodriguez <
> grodrig...@searchtechnologies.com> wrote:
>
>> Hi Darshan,
>>
>> It looks like you are listing the instanceDir's name twice in the create
>> collection command, it should be
>>
>> $ solrctl --zk  host:2181/solr --solr host:8983/solr/ collection --create
>> Catalog_search_index -s 10 -c Catalog_search_index
>>
>> Without the extra ". Catalog_search_index" at the end. Also, because your
>> new collection's name is the same as the instanceDir's, you could just omit
>> that parameter and it should work ok.
>>
>> Try that and see if it works.
>>
>> Good luck,
>> Gonzalo
>>
>> -Original Message-
>> From: Darshan Pandya [mailto:darshanpan...@gmail.com]
>> Sent: Wednesday, September 7, 2016 12:02 PM
>> To: solr-user@lucene.apache.org
>> Subject: newbie question
>>
>> hello,
>>
>> I am using solr cloud with cloudera. When I try to create a collection, it
>> fails with the following error.
>> Any hints / answers will be helpful.
>>
>>
>> $ solrctl --zk  host:2181/solr instancedir --list
>>
>> Catalog_search_index
>>
>> $ solrctl --zk  shot:2181/solr --solr host:8983/solr/ collection --create
>> Catalog_search_index -s 10 -c Catalog_search_index.Catalog_search_index
>>
>> Error: A call to SolrCloud WEB APIs failed: HTTP/1.1 200 OK
>>
>> Server: Apache-Coyote/1.1
>>
>> Content-Type: application/xml;charset=UTF-8
>>
>> Transfer-Encoding: chunked
>>
>> Date: Wed, 07 Sep 2016 17:58:13 GMT
>>
>>
>> 
>>
>>
>> 
>>
>>
>> 
>>
>> 
>>
>> 0
>>
>> 
>>
>> 1165
>>
>> 
>>
>> 
>>
>> 
>>
>> org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error
>> CREATEing SolrCore 'Catalog_search_index_shard1_replica1': Unable to
>> create core [Catalog_search_index_shard1_replica1] Caused by: Specified
>> config does not exist in ZooKeeper:Catalog_search_
>> index.dataCatalog_search_index
>>
>>
>> --
>> Sincerely,
>> Darshan
>>
>
>
>
> --
> Sincerely,
> Darshan


Re: newbie question

2016-09-07 Thread Darshan Pandya
Gonzalo,
Thanks for responding.
I ran the command with the parameters you suggested, but it still shows the same error.
Sincerely,
darshan

On Wed, Sep 7, 2016 at 1:13 PM, Gonzalo Rodriguez <
grodrig...@searchtechnologies.com> wrote:

> Hi Darshan,
>
> It looks like you are listing the instanceDir's name twice in the create
> collection command, it should be
>
> $ solrctl --zk  host:2181/solr --solr host:8983/solr/ collection --create
> Catalog_search_index -s 10 -c Catalog_search_index
>
> Without the extra ". Catalog_search_index" at the end. Also, because your
> new collection's name is the same as the instanceDir's, you could just omit
> that parameter and it should work ok.
>
> Try that and see if it works.
>
> Good luck,
> Gonzalo
>
> -Original Message-
> From: Darshan Pandya [mailto:darshanpan...@gmail.com]
> Sent: Wednesday, September 7, 2016 12:02 PM
> To: solr-user@lucene.apache.org
> Subject: newbie question
>
> hello,
>
> I am using solr cloud with cloudera. When I try to create a collection, it
> fails with the following error.
> Any hints / answers will be helpful.
>
>
> $ solrctl --zk  host:2181/solr instancedir --list
>
> Catalog_search_index
>
> $ solrctl --zk  shot:2181/solr --solr host:8983/solr/ collection --create
> Catalog_search_index -s 10 -c Catalog_search_index.Catalog_search_index
>
> Error: A call to SolrCloud WEB APIs failed: HTTP/1.1 200 OK
>
> Server: Apache-Coyote/1.1
>
> Content-Type: application/xml;charset=UTF-8
>
> Transfer-Encoding: chunked
>
> Date: Wed, 07 Sep 2016 17:58:13 GMT
>
>
> 
>
>
> 
>
>
> 
>
> 
>
> 0
>
> 
>
> 1165
>
> 
>
> 
>
> 
>
> org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error
> CREATEing SolrCore 'Catalog_search_index_shard1_replica1': Unable to
> create core [Catalog_search_index_shard1_replica1] Caused by: Specified
> config does not exist in ZooKeeper:Catalog_search_
> index.dataCatalog_search_index
>
>
> --
> Sincerely,
> Darshan
>



-- 
Sincerely,
Darshan


RE: newbie question

2016-09-07 Thread Gonzalo Rodriguez
Hi Darshan,

It looks like you are listing the instanceDir's name twice in the create 
collection command, it should be

$ solrctl --zk  host:2181/solr --solr host:8983/solr/ collection --create 
Catalog_search_index -s 10 -c Catalog_search_index

Without the extra ". Catalog_search_index" at the end. Also, because your new 
collection's name is the same as the instanceDir's, you could just omit that 
parameter and it should work ok.

Try that and see if it works.

Good luck,
Gonzalo

-Original Message-
From: Darshan Pandya [mailto:darshanpan...@gmail.com] 
Sent: Wednesday, September 7, 2016 12:02 PM
To: solr-user@lucene.apache.org
Subject: newbie question

hello,

I am using solr cloud with cloudera. When I try to create a collection, it 
fails with the following error.
Any hints / answers will be helpful.


$ solrctl --zk  host:2181/solr instancedir --list

Catalog_search_index

$ solrctl --zk  shot:2181/solr --solr host:8983/solr/ collection --create 
Catalog_search_index -s 10 -c Catalog_search_index.Catalog_search_index

Error: A call to SolrCloud WEB APIs failed: HTTP/1.1 200 OK

Server: Apache-Coyote/1.1
Content-Type: application/xml;charset=UTF-8
Transfer-Encoding: chunked
Date: Wed, 07 Sep 2016 17:58:13 GMT

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1165</int>
  </lst>
  <lst name="failure">
    <str>org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error
CREATEing SolrCore 'Catalog_search_index_shard1_replica1': Unable to create
core [Catalog_search_index_shard1_replica1] Caused by: Specified config does
not exist in ZooKeeper:Catalog_search_index.dataCatalog_search_index</str>
  </lst>
</response>


--
Sincerely,
Darshan


Re: [Newbie question] in SOLR 5, would I have a "master-to-slave" relationship for two servers?

2015-11-05 Thread Erick Erickson
To pile on to Chris' comment. In the M/S situation
you describe, all the query traffic goes to the slave.

True, this relieves the slave from doing the work of
indexing, but it _also_ prevents the master from
answering queries. So going to SolrCloud trades
off indexing on _both_ machines to also querying on
_both_ machines.

And this doesn't even take into account the issues
involved in recovering if one or the other (especially
the master) goes down, which is automatically
handled in SolrCloud.

Add to that the fact that memory management is
_very_ significantly improved starting with Solr
4x (see: 
https://lucidworks.com/blog/2012/04/06/memory-comparisons-between-solr-3x-and-trunk/)
and my claim is that you are _far_ better off
using SolrCloud than M/S in 5x.

As always, YMMV of course.

Best,
Erick


On Thu, Nov 5, 2015 at 1:12 PM, Chris Hostetter
 wrote:
>
> : The database of server 2 is considered the "master" and it is replicated
> : regularly to server 1, the "slave".
> :
> : The advantage is the responsiveness of server 1 is not impacted with server
> : 2 gets busy with lots of indexing.
> :
> : QUESTION: When deploying a SOLR 5 setup, do I set things up the same way?
> : Or do I cluster bother servers together into one "cloud"?   That is, in
> : SOLR 5, how do I ensure the indexing process will not impact the
> : performance of the web app?
>
> There is nothing preventing you from using a master slave setup with Solr
> 5...
>
> https://cwiki.apache.org/confluence/display/solr/Index+Replication
>
> ...however if you do so you have to take responsibility for the same
> risks/tradeoffs that existed with this type of setup in Solr 3...
>
> 1) if the "query slave" goes down, you can't serve quiers w/o manually
> redirecting traffic to your "indexing master"
>
> 2) if the "indexing master" goes down you can't accept index updates w/o
> manually redirecting update to your "query slave" -- and manually
> rectifying the descrepencies if/when your master comes back online.
>
>
> When using a cloud based setup these types of problems go away because
> there is no single "master", clients can send updates/queries to any node
> (and if you use SolrJ your clients will be "ZK aware" and know
> automatically if/when a node is down or new nodes are added) ...
> many people concerned about performance/reliability consider these
> benefits more important then the risks/tradeoffs of performance impacts of
> indexing directy to nodes that are serving queries -- especially with
> other NRT (Near Real Time) related improvements to Solr over the years
> (Soft Commits, DocValues instead of FieldCache, etc...)
>
>
> -Hoss
> http://www.lucidworks.com/


Re: [Newbie question] what is a "core" and are they different from 3.x to 5.x ?

2015-11-05 Thread Chris Hostetter

: I can see there is something called a "core" ... it appears there can be
: many cores for a single SOLR server.
: 
: Can someone "explain like I'm five" -- what is a core?

https://cwiki.apache.org/confluence/display/solr/Solr+Cores+and+solr.xml

"In Solr, the term core is used to refer to a single index and associated 
transaction log and configuration files (including schema.xml and 
solrconfig.xml, among others). Your Solr installation can have multiple 
cores if needed, which allows you to index data with different structures 
in the same server, and maintain more control over how your data is 
presented to different audiences."

: And how do "cores" differ from 3.x to 5.x.


The only fundamental differences between "cores" in Solr 3.x vs 5.x are:

1) in 3.x there was a concept known as the "default core" (if you didn't 
explicitly use multiple cores); with 5.x every request (updates or 
queries) must be to an explicit core (or collection)

2) when using SolrCloud in 5.x, you should think (logically) in terms of 
the higher level concept of "collections" which (depending on the settings 
when the collection is created) may be *implemented* by multiple cores 
that are managed under the covers for you...

https://cwiki.apache.org/confluence/display/solr/SolrCloud
https://cwiki.apache.org/confluence/display/solr/Nodes%2C+Cores%2C+Clusters+and+Leaders


-Hoss
http://www.lucidworks.com/


Re: [Newbie question] in SOLR 5, would I have a "master-to-slave" relationship for two servers?

2015-11-05 Thread Chris Hostetter

: The database of server 2 is considered the "master" and it is replicated
: regularly to server 1, the "slave".
: 
: The advantage is the responsiveness of server 1 is not impacted with server
: 2 gets busy with lots of indexing.
: 
: QUESTION: When deploying a SOLR 5 setup, do I set things up the same way?
: Or do I cluster bother servers together into one "cloud"?   That is, in
: SOLR 5, how do I ensure the indexing process will not impact the
: performance of the web app?

There is nothing preventing you from using a master slave setup with Solr 
5...

https://cwiki.apache.org/confluence/display/solr/Index+Replication

...however if you do so you have to take responsibility for the same 
risks/tradeoffs that existed with this type of setup in Solr 3...

1) if the "query slave" goes down, you can't serve quiers w/o manually 
redirecting traffic to your "indexing master"

2) if the "indexing master" goes down you can't accept index updates w/o 
manually redirecting update to your "query slave" -- and manually 
rectifying the descrepencies if/when your master comes back online.
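For reference, a minimal sketch of the master/slave wiring in each side's 
solrconfig.xml (host, core name and interval are only illustrative; the 
replication page linked above has the full set of options):

  <!-- on the master -->
  <requestHandler name="/replication" class="solr.ReplicationHandler">
    <lst name="master">
      <str name="replicateAfter">commit</str>
      <str name="confFiles">schema.xml,stopwords.txt</str>
    </lst>
  </requestHandler>

  <!-- on the slave -->
  <requestHandler name="/replication" class="solr.ReplicationHandler">
    <lst name="slave">
      <str name="masterUrl">http://master-host:8983/solr/core1/replication</str>
      <str name="pollInterval">00:00:60</str>
    </lst>
  </requestHandler>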


When using a cloud based setup these types of problems go away because 
there is no single "master", clients can send updates/queries to any node 
(and if you use SolrJ your clients will be "ZK aware" and know 
automatically if/when a node is down or new nodes are added) ... 
many people concerned about performance/reliability consider these 
benefits more important than the risks/tradeoffs of performance impacts of 
indexing directly to nodes that are serving queries -- especially with 
other NRT (Near Real Time) related improvements to Solr over the years 
(Soft Commits, DocValues instead of FieldCache, etc...)


-Hoss
http://www.lucidworks.com/


Re: Newbie Question: Master Index or 100s Small Index

2014-03-19 Thread Toke Eskildsen
On Wed, 2014-03-19 at 11:55 +0100, Colin R wrote:
 We run a central database of 14M (and growing) photos with dates, captions,
 keywords, etc. 
 
 We currently upgrading from old Lucene Servers to latest Solr running with a
 couple of dedicated  servers (6 core, 36GB, 500SSD). Planning on using Solr
 Cloud.

What hardware are your past experiences based on? If they have fewer
cores, less memory and spinning drives, I foresee that your question
can be reduced to which architecture you prefer from a logistic point of
view, rather than performance.

 We take in thousands of changes each day (big and small) so indexing may be
 a bigger problem than searching.

Thousands of updates in a day is a very low number. Do you have hard
requirements for update time, perform heavy faceting or do anything
special for this to be a cause of concern?

 Is it quicker/better to just have one big 14M index and filter the
 complexities for each website or is it better to still maintain hundreds of
 indexes so we are searching smaller one.

All else being equal, a search in a specific small index will be faster
than filtering on the large one. But as we know, all else is never
equal. A 14M document index in itself is not really a challenge for
Lucene/Solr, but this depends a lot on your specific setup. How large is
the 14M index in terms of bytes?

 Bear in mind, we get thousands of changes a day PLUS very busy search servers.

How many queries/second are we talking about here? What is a typical
query (faceting, grouping, special processing...)?

Regards,
Toke Eskildsen, State and University Library, Denmark




Re: Newbie Question: Master Index or 100s Small Index

2014-03-19 Thread Colin R
Hi Toke

Thanks for replying.

My question is really regarding index architecture: one big, or many small
(with merges into big ones)?

We probably get 5-10K photos added each day. Others are updated, some are
deleted.

Updates need to happen quite fast (e.g. within minutes of our Databases
receiving them).

In terms of bytes, each photo has up to 1.5KB of data.

Special requirements are search by date range, text, date range and text.
Plus some boolean filtering. All results can be sorted by date or filename.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Newbie-Question-Master-Index-or-100s-Small-Index-tp4125407p4125429.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Newbie Question: Master Index or 100s Small Index

2014-03-19 Thread Toke Eskildsen
On Wed, 2014-03-19 at 13:28 +0100, Colin R wrote:
 My question is really regarding index architecture. One big or many small
 (with merged big ones)

One difference is that having a single index/collection gives you better
ranked searches within each collection. If you only use date/filename
sorting, that is of course irrelevant.

 In terms of bytes, each photo has a up to 1.5KB of data.

So about 20GB for the full index?

 Special requirements are search by date range, text, date range and text.
 Plus some boolean filtering. All results can be sorted by date or filename.

With no faceting, grouping or similar aggregating processing,
(re)opening of an index searcher should be very fast. The only thing
that takes a moment is the initial date or filename sorting. Asking for
minute-level data updates is thus very modest. With the information you
have given, you could aim for a few seconds.

None of the things you have said gives any cause for concern about
performance, and even though you have an existing system running and are
upgrading to a presumably faster one, you sound concerned. Do you
currently have performance problems, and if so, what is your current
hardware?

- Toke Eskildsen, State and University Library, Denmark




Re: Newbie Question: Master Index or 100s Small Index

2014-03-19 Thread Colin R
Hi Toke

Our current configuration is Lucene 2.(something) with a RAILO/CFML app server.

10K drives, Quad Core, 16GB, Two servers. But the indexing and searching are
starting to fail and our developer is no longer with us so it is quicker to
rebuild than fix all the code.

Our existing config is lots of indexes with merges into the larger ones.

They are still running very fast but indexing is causing us issues.

Thanks



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Newbie-Question-Master-Index-or-100s-Small-Index-tp4125407p4125447.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Newbie Question: Master Index or 100s Small Index

2014-03-19 Thread Erick Erickson
Oh my. 2.(something) is ancient; I second your move
to scrap the current situation and start over. I'm
really curious what the _reasons_ for such a complex
setup are/were.

I second Toke's comments. This is actually
quite small by modern Solr/Lucene standards.

Personally I would index them all to a single index,
include something like a 'source' field that allowed
one to restrict the returned documents by a filter
query (fq) clause.
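As a trivial sketch (the field names here are invented for the example), a
per-site query then just becomes something like:

  q=caption:sunset&fq=source:site_42&sort=date desc

and that fq gets cached in the filterCache and reused across queries for that site.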

Toke makes the point that you will get subtly different
search results because the tf/idf calculations are
slightly different across your entire corpus than
within various sub-sections, but I suspect that you
won't notice it. Test and see, you can change later.

One thing to look at is the new hard/soft commit
distinction, see:
http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

The short form is you want to define your hard
autocommit to be fairly short (maybe 1 minute?)
with openSearcher=false for durability and your
soft commit whatever latency you need for being
able to search the newly-added docs.
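A sketch of that in solrconfig.xml (the times are placeholders to tune, not
recommendations):

  <autoCommit>
    <maxTime>60000</maxTime>          <!-- hard commit every minute, for durability -->
    <openSearcher>false</openSearcher>
  </autoCommit>
  <autoSoftCommit>
    <maxTime>5000</maxTime>           <!-- new docs become searchable within ~5 seconds -->
  </autoSoftCommit>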

I don't know how you're feeding docs to Solr, but
if you're using the ExtractingRequestHandler,
you are
1> transmitting the entire document over the wire,
only to throw most of it away. I'm guessing your 1.5K
of data is just a few percent of the total file size.
2> putting the extraction work on the same
box running Solr.

If that machine is overloaded, consider moving the Tika
processing over to one or more clients and only
sending the data you actually want to index over to Solr,
See:
http://searchhub.org/2012/02/14/indexing-with-solrj/

Best,
Erick

On Wed, Mar 19, 2014 at 7:02 AM, Colin R colin.russ...@dasmail.co.uk wrote:
 Hi Toke

 Our current configuration Lucene 2.(something) with RAILO/CFML app server.

 10K drives, Quad Core, 16GB, Two servers. But the indexing and searching are
 starting to fail and our developer is no longer with us so it is quicker to
 rebuild than fix all the code.

 Our existing config is lots of indexes with merges into the larger ones.

 They are still running very fast but indexing is causing us issues.

 Thanks



 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Newbie-Question-Master-Index-or-100s-Small-Index-tp4125407p4125447.html
 Sent from the Solr - User mailing list archive at Nabble.com.


Re: Newbie Question: Master Index or 100s Small Index

2014-03-19 Thread Shawn Heisey

On 3/19/2014 4:55 AM, Colin R wrote:

My question is an architecture one.

These photos are currently indexed and searched in three ways.

1: The 14M pictures from above are split into a few hundred indexes that
feed a single website. This means index sizes of between 100 and 500,000
entries each.

2: 95% of these same photos are also wanted for searching on a global site.
Index size of 12M plus.

3: 80% of these same photos are also required for smaller group sites. Index
sizes of between 400K and 4M.

We currently make changes the single indexes and then merge into groups and
global. Due to the size of the numbers, is it worth changing or not.

Is it quicker/better to just have one big 14M index and filter the
complexities for each website or is it better to still maintain hundreds of
indexes so we are searching smaller one. Bear in mind, we get thousands of
changes a day PLUS very busy search servers.


My primary use for Solr is an archive of 92 million documents, most of 
which are photos.  We have thousands of new photos every day.  I haven't 
been cleared to mention what company it's for.


This screenshot of my status servlet page answers tons of questions 
about my index, but if you have additional questions, ask:


https://www.dropbox.com/s/6p1puq1gq3j8nln/solr-status-servlet.png

Here are some details about each host that you cannot see in the 
screenshot: 6 SATA disks in RAID10 with 3TB of usable space.  64GB of 
RAM.  Dual quad-core Intel E54xx series CPUs.  Chain A is running Solr 
4.2.1 on Java 6, chain B is running Solr 4.6.1 on Java 7, with some 
additional plugin software that increases the index size.  There is one 
Solr process per host, with a 6GB heap.


As long as you index fields that can be used to filter searches 
according to what a user is allowed to see, I don't see any problem with 
putting all of your data into one index.  The main thing you'll want to be 
sure of is that you have enough RAM to effectively cache your index.  
Because you have SSD, you probably don't need to have enough RAM to 
cache ALL of the index data, but it wouldn't hurt.  With 36GB of RAM per 
machine, you will probably have enough.


Thanks,
Shawn



Re: Newbie question on Deduplication overWriteDupes flag

2014-02-06 Thread Chris Hostetter

: How do I achieve, add if not there, fail if duplicate is found. I though

You can use the optimistic concurrency features to do this, by including a 
_version_=-1 field value in the document.

this will instruct solr that the update should only be processed if the 
document does not already exist...

https://cwiki.apache.org/confluence/display/solr/Updating+Parts+of+Documents
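A quick sketch of what that looks like on the wire (core name and field values
are just examples):

  curl 'http://localhost:8983/solr/collection1/update?commit=true' \
       -H 'Content-Type: application/json' \
       -d '[{"id":"doc1","title":"first version","_version_":-1}]'

If a document with that id already exists, Solr rejects the update with a
version-conflict error (HTTP 409) instead of silently overwriting it.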




-Hoss
http://www.lucidworks.com/


Re: Newbie question on Deduplication overWriteDupes flag

2014-02-06 Thread Alexandre Rafalovitch
A follow up question on this (as it is kind of new functionality).

What happens if several documents are submitted and one of them fails
due to that? Do they all get rolled back, or only that one?

Regards,
   Alex.
Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)


On Thu, Feb 6, 2014 at 11:17 PM, Chris Hostetter
hossman_luc...@fucit.org wrote:

 : How do I achieve, add if not there, fail if duplicate is found. I though

 You can use the optimistic concurrency features to do this, by including a
 _version_=-1 field value in the document.

 this will instruct solr that the update should only be processed if the
 document does not already exist...

 https://cwiki.apache.org/confluence/display/solr/Updating+Parts+of+Documents




 -Hoss
 http://www.lucidworks.com/


Re: Newbie question on recurring theme: Dynamic Fields

2013-02-20 Thread Erik Hatcher
You need to use dynamicField not field, that's all :)
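i.e. something like the following sketch, using the names from the mail below:

  <dynamicField name="customerField_*" type="string" indexed="true" stored="true" multiValued="true"/>

  solrInputDocument.addField("customerField_" + searchFieldResultSet.getString("name"),
                             searchFieldResultSet.getString("value"));

Any field name starting with customerField_ (e.g. customerField_FirstName) then
matches the pattern and is created on the fly.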

 Erik

On Feb 20, 2013, at 4:06, Erik Dybdahl erik...@gmail.com wrote:

 Hi,
 I'm currently assessing lucene/solr as a search front end for documents
 currently stored in an rdbms.
 The data has been made searchable to clients, in a way so that each
 client/customer may define what elements of their documents are searchable,
 by defining a field name, and a reference from that name into the data
 which is subsequently used to extract the value from the client documents
 and for each document associate it with the customer chosen field name.
 The concept of dynamic fields seemed to be exactly what I was looking for.
 Under Solr Features, Detailed Features, lucene.apache.org/solr says
 Dynamic Fields enables on-the-fly addition of new fields.
 And, in more depth, at wiki.apache.org/solr/SchemaXml:
 One of the powerful features of Lucene is that you don't have to
 pre-define every field when you first create your index.
 :
 For example the following dynamic field declaration tells Solr that
 whenever it sees a field name ending in _i which is not an explicitly
 defined field, then it should dynamically create an integer field with that
 name...
dynamicField name=*_i  type=integer  indexed=true
 stored=true/
 Cool.
 However, after definining
   field name=customerField_* type=string indexed=true
 stored=true multiValued=true/
 then trying to add the dynamic fields from the values in the db using solrj
 thus
 
 solrInputDocument.addField(customerField_+searchFieldResultSet.getString(name),
 searchFieldResultSet.getString(value));
 (i.e. attempting to create a dynamic field for e.g. the customer field
 FirstName) yields
 
 org.apache.solr.common.SolrException: ERROR: [doc=62485318] unknown field
 'customerField_FirstName'
at
 org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:404)
 
 How come? The documentation states clearly that solr should dynamically
 create a field in such cases.
 I am aware that similar problems have been discussed in several threads
 before, but the appearant complexity of the issue confuses me.
 
 Is what is stated in the documentation correct?
 In that case, what am I doing wrong, and what is the correct way?
 If the Dynamic Field concept is just a way of specifying common
 characteristics of multiple defined fields, then I think the documentation
 ought to be changed.
 
 I was able to solve the problem by following a suggestion in one of the
 threads, namely to create just one generic field, and then concatenate name
 and value of the customer field into it, like this:
   field name=customerField_s type=string indexed=true stored=true
 multiValued=true/
 and
solrInputDocument.addField(customerField_s,
 searchFieldResultSet.getString(name)+$+searchFieldResultSet.getString(value));
 which enables queries like:
  customerField_s:LastName$PETTERSON
 but that's not a very elegant solution.
 Also, attempting to use customerField_* in the query does not work, and
 from reading the documentation I do not understand why.
 
 Anyway, the response time when searching for these values is superb, when
 compared to doing the search directly in the database :-)
 
 Regards
 Erik


Re: Newbie question on recurring theme: Dynamic Fields

2013-02-20 Thread Toke Eskildsen
On Wed, 2013-02-20 at 10:06 +0100, Erik Dybdahl wrote:
 However, after definining
field name=customerField_* type=string indexed=true
 stored=true multiValued=true/

Seems like a typo to me: You need to write dynamicField, not
field, when defining a dynamic field.

Regards,
Toke Eskildsen



Re: Newbie question on recurring theme: Dynamic Fields

2013-02-20 Thread Erik Dybdahl
Excellent, works like a charm!
Though embarrassing, it's still a good thing the only problem was me being
blind :-)

Thank you, Toke and Erik.


On Wed, Feb 20, 2013 at 11:47 AM, Toke Eskildsen 
t...@statsbiblioteket.dkwrote:

 On Wed, 2013-02-20 at 10:06 +0100, Erik Dybdahl wrote:
  However, after definining
 field name=customerField_* type=string indexed=true
  stored=true multiValued=true/

 Seems like a typo to me: You need to write dynamicField, not
 field, when defining a dynamic field.

 Regards,
 Toke Eskildsen




AW: newbie question

2012-07-31 Thread Markus Klose
You can add the parameter -Djetty.port=8984 to start your second Solr instance on port 
8984.

java -Djetty.port=8984  -jar start.jar


Best regards from Augsburg

Markus Klose
SHI Elektronische Medien GmbH 
 

-Original Message-
From: Kate deBethune [mailto:kdebeth...@gmail.com] 
Sent: Monday, July 30, 2012 22:43
To: solr-user@lucene.apache.org
Subject: newbie question

Hi,

I have been able to set up the SOLR demo environment as described in SOLR
3.6.1 tutorial:
http://lucene.apache.org/solr/api-3_6_1/doc-files/tutorial.html.

Actually, I set it up while it was still SOLR 3.6.0.

The developer I am working with has created a custom SOLR instance using
3.6.1 and has packaged it up in the same manner as the demo. However, when I 
run the java -jar start.jar command in the example directory of my SOLR
3.6.1 instance and I open the admin interface on my local host:
http://localhost:8983/solr/admin/, the admin webpage points to the 3.6.0 
instance.  The log says something like the JVM is already in use for port 8983.

How can I open my 3.6.1 instance?

I hope this question is not too elementary.

Many thanks in advance for any help you can provide.

Thanks,
Kate


Re: Newbie question on sorting

2012-05-02 Thread Jacek
Erick, I'll do that. Thank you very much.

Regards,
Jacek

On Tue, May 1, 2012 at 7:19 AM, Erick Erickson erickerick...@gmail.comwrote:

 The easiest way is to do that in the app. That is, return the top
 10 to the app (by score) then re-order them there. There's nothing
 in Solr that I know of that does what you want out of the box.

 Best
 Erick

 On Mon, Apr 30, 2012 at 11:10 AM, Jacek pjac...@gmail.com wrote:
  Hello all,
 
  I'm facing this simple problem, yet impossible to resolve for me (I'm a
  newbie in Solr).
  I need to sort the results by score (it is simple, of course), but then
  what I need is to take top 10 results, and re-order it (only those top 10
  results) by a date field.
  It's not the same as sort=score,creationdate
 
  Any suggestions will be greatly appreciated!



Re: Newbie question on sorting

2012-05-01 Thread Erick Erickson
The easiest way is to do that in the app. That is, return the top
10 to the app (by score) then re-order them there. There's nothing
in Solr that I know of that does what you want out of the box.
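A rough SolrJ sketch of that idea (the server URL is a placeholder, and it
assumes "creationdate" -- the field from your example -- is a stored date field):

  import java.util.*;
  import org.apache.solr.client.solrj.SolrQuery;
  import org.apache.solr.client.solrj.impl.HttpSolrServer;
  import org.apache.solr.common.SolrDocument;

  public class TopTenByDate {
    public static void main(String[] args) throws Exception {
      HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr");
      SolrQuery q = new SolrQuery("your query here");
      q.setRows(10);                                    // top 10 by score (default sort)
      List<SolrDocument> top10 =
          new ArrayList<SolrDocument>(server.query(q).getResults());
      // re-order just those 10 by the date field, newest first
      Collections.sort(top10, new Comparator<SolrDocument>() {
        public int compare(SolrDocument a, SolrDocument b) {
          return ((Date) b.getFieldValue("creationdate"))
              .compareTo((Date) a.getFieldValue("creationdate"));
        }
      });
    }
  }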

Best
Erick

On Mon, Apr 30, 2012 at 11:10 AM, Jacek pjac...@gmail.com wrote:
 Hello all,

 I'm facing this simple problem, yet impossible to resolve for me (I'm a
 newbie in Solr).
 I need to sort the results by score (it is simple, of course), but then
 what I need is to take top 10 results, and re-order it (only those top 10
 results) by a date field.
 It's not the same as sort=score,creationdate

 Any suggestions will be greatly appreciated!


how to transform a URL (newbie question)

2011-11-20 Thread Bent Jensen
I am a beginner to Solr and need to ask the following:
Using the apache-solr example, how can I display a URL in the XML document
as an active link in the HTML output? Do I need to add some special transform in
the example.xslt file? 

thanks
Ben



Re: how to transform a URL (newbie question)

2011-11-20 Thread Erik Hatcher
Ben, 

Not quite sure how to interpret what you're asking here.  Are you speaking of 
the /browse view?  If so, you can tweak the templates under conf/velocity to 
make links out of things.

But generally, it's the end application that would take the results from Solr 
and render links as appropriate.
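For example -- just a sketch, and it assumes a stored field called "url", which may
not match your schema -- one of the conf/velocity templates such as hit.vm could emit:

  #if($doc.getFieldValue('url'))
    <a href="$doc.getFieldValue('url')">$doc.getFieldValue('url')</a>
  #end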

Erik

On Nov 20, 2011, at 11:53 , Bent Jensen wrote:

 I am a beginner to solr and need to ask the following:
 Using the apache-solr example, how can I display an url in the xml document
 as an active link/url in http? Do i need to add some special transform in
 the example.xslt file? 
 
 thanks
 Ben
 



RE: how to transform a URL (newbie question)

2011-11-20 Thread Bent Jensen
Erik,
OK, I will look at that. Basically, what I am trying to do is to index a
document with lots of URLs. I also index the URL and give it a field type.
I don't know much about Solr yet, but thought maybe I could transform the URL to
an active link, i.e. an 'a href'. I tried putting the href into the XML
document, but it just prints out as text in the HTML. I also could not find any
XSLT transform or schema for it.

thanks
Ben

-Original Message-
From: Erik Hatcher [mailto:erik.hatc...@gmail.com] 
Sent: Sunday, November 20, 2011 9:05 AM
To: solr-user@lucene.apache.org
Subject: Re: how to transform a URL (newbie question)

Ben, 

Not quite sure how to interpret what you're asking here.  Are you speaking
of the /browse view?  If so, you can tweak the templates under conf/velocity
to make links out of things.

But generally, it's the end application that would take the results from
Solr and render links as appropriate.

Erik

On Nov 20, 2011, at 11:53 , Bent Jensen wrote:

 I am a beginner to solr and need to ask the following:
 Using the apache-solr example, how can I display an url in the xml
document
 as an active link/url in http? Do i need to add some special transform in
 the example.xslt file? 
 
 thanks
 Ben
 




Re: how to transform a URL (newbie question)

2011-11-20 Thread Erick Erickson
I think you're confusing Solr with a web app <g>

Solr itself has nothing to do whatsoever with presenting
things to the user. It just returns, as you have seen,
XML (or JSON or ) formatted replies. It's up to the
application layer to do something intelligent with those.

That said, the /browse request handler that ships with the
example code uses something
called the VelocityResponseWriter to render pages, where
the VeolcityResponseWriter interacts with the templates
Erik Hatcher mentioned to show you pages. So think of
all the Velocity stuff as your app engine for demo purposes.

Erik is directing you at that code if you want to hack the
Solr example to display stuff.

Hope that helps
Erick (not Hatcher <g>)

On Sun, Nov 20, 2011 at 2:15 PM, Bent Jensen bentjen...@yahoo.com wrote:
 Erik,
 OK, I will look at that. Basically, what I amtrying to do is to index a
 document with lots of URLs. I also index the url and give it a field type.
 Don't know much about solr yet, but though maybe I can transform the url to
 an active link, i.e. 'a href'. I tried putting the href into the xml
 document, but it just prints out as text in html. I also could not find any
 xslt transform or schema.

 thanks
 Ben

 -Original Message-
 From: Erik Hatcher [mailto:erik.hatc...@gmail.com]
 Sent: Sunday, November 20, 2011 9:05 AM
 To: solr-user@lucene.apache.org
 Subject: Re: how to transform a URL (newbie question)

 Ben,

 Not quite sure how to interpret what you're asking here.  Are you speaking
 of the /browse view?  If so, you can tweak the templates under conf/velocity
 to make links out of things.

 But generally, it's the end application that would take the results from
 Solr and render links as appropriate.

        Erik

 On Nov 20, 2011, at 11:53 , Bent Jensen wrote:

 I am a beginner to solr and need to ask the following:
 Using the apache-solr example, how can I display an url in the xml
 document
 as an active link/url in http? Do i need to add some special transform in
 the example.xslt file?

 thanks
 Ben




a newbie question reagarding keyword count in each document

2011-11-11 Thread zeek
Hi All,

I am relatively new to Solr/Lucene and need some help.

- I am basically storing documents where each document represents an Entity
(a thing, a place, etc.)
- each Entity has some unique features that I need to store in a field (or fields)
- also, I need to store the mentions of those features (based on information
extracted from some other sources)
- when I query these documents, I need to be able to retrieve the X most
talked-about features of that entity. 

Can I do such a thing in Solr/Lucene?  I was thinking that I could create a
multi-valued field where I add the feature every time it was mentioned in my
sources, but how would I get the most mentioned features (based on the count)
for that particular entity?  If that is not possible in Solr, I was thinking of
storing that information in a database, but I really want to avoid that
option.

Any help would be greatly appreciated.

Thanks,

Zeek

--
View this message in context: 
http://lucene.472066.n3.nabble.com/a-newbie-question-reagarding-keyword-count-in-each-document-tp3500489p3500489.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Newbie question

2011-11-01 Thread Chris Hostetter

: If using CommonsHttpSolrServer query() method with parameter wt=json, when
: retrieving QueryResponse, how to do to get JSON result output stream ?

when you are using the CommonsHttpSolrServer level of API, the client 
takes care of parsing the response (which is typically in an efficient 
binary representation) into the basic data structures.

if you just want the raw response (in json or xml, or whatever) as a java 
String, then it may not be the API you really want: just use a basic 
HttpClient.


Alternately: you could consider writing your own subclass of 
ResponseParser that just slurps the InputStream that CommonsHttpSolrServer 
fetches for you.
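For instance, a bare-bones sketch with Apache HttpComponents (URL and query
parameters are placeholders):

  import org.apache.http.client.HttpClient;
  import org.apache.http.client.methods.HttpGet;
  import org.apache.http.impl.client.DefaultHttpClient;
  import org.apache.http.util.EntityUtils;

  public class RawJsonQuery {
    public static void main(String[] args) throws Exception {
      HttpClient http = new DefaultHttpClient();
      HttpGet get = new HttpGet("http://localhost:8983/solr/select?q=*:*&wt=json");
      // the body comes back exactly as Solr wrote it -- JSON here, untouched by SolrJ
      String json = EntityUtils.toString(http.execute(get).getEntity());
      System.out.println(json);
    }
  }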


-Hoss


RE: Newbie question, ant target for packaging source files from local copy?

2011-08-29 Thread syyang
Hi Steve,

I've filed a new JIRA issue along with the patch, which can be found at
<https://issues.apache.org/jira/browse/LUCENE-3406>.

Please let me know if you see any problem.

Thanks!
-Sid

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Newbie-question-ant-target-for-packaging-source-files-from-local-copy-tp3282787p3294320.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Newbie Question, can I store structured sub elements?

2011-08-25 Thread Zac Tolley
I know I can have multi value on them but that doesn't let me see that
a showing instance happens at a particular time on a particular
channel, just that it shows on a range of channels at a range of times

I'm starting to think I will have to either store a formatted string that
combines them, or keep it flat just for indexing, retrieve IDs and use
them to get the data out of the RDBMS.


On 24 Aug 2011, at 23:09, dan whelan d...@adicio.com wrote:

 You could change starttime and channelname to multiValued=true and use these 
 fields to store all the values for those fields.

showing.movie_id and showing.id probably aren't needed in a Solr record.



 On 8/24/11 7:53 AM, Zac Tolley wrote:
 I have a very scenario in which I have a film and showings, each film has
 multiple showings at set times on set channels, so I have:

 Movie
 -
 id
 title
 description
 duration


 Showing
 -
 id
 movie_id
 starttime
 channelname



 I want to know can I store this in solr so that I keep this stucture?

 I did try to do an initial import with the DIH using this config:

 entity name=movie query=SELECT * from movies
field column=ID name=id/
field column=TITLE name=title/
field name=DESCRIPTION name=description/

entity name=showing query=SELECT * from showings WHERE movie_id = ${
 movie.id}
  field column=ID name=id/
  field column=STARTTIME name=starttime/
  field column=CHANNELNAME name=channelname/
/entity
 /entity

 I was hoping, for each movie to get a sub entity with the showing like:

 doc
str name=title./str
showing
  str name=channelname.



 but instead all the fields are flattened down to the top level.

 I know this must be easy, what am I missing... ?




RE: Newbie question, ant target for packaging source files from local copy?

2011-08-25 Thread Steven A Rowe
Hi sid,

The current source packaging scheme aims to *avoid* including local changes :), 
so yes, there is no support currently for what you want to do.

Prior to https://issues.apache.org/jira/browse/LUCENE-2973, the source 
packaging scheme used the current sources rather than pulling from Subversion.  
If you check out trunk revision 1083212 or earlier, or branch_3x revision 
1083234 or earlier, you can see how it used to be done.

If you want to resurrect the previous source packaging scheme as a new Ant 
target (maybe named package-local-src-tgz?), make a new JIRA issue 
and post a patch, and I'll help you get it committed (assuming nobody objects). 
 If you haven't seen the Solr Wiki HowToContribute page 
http://wiki.apache.org/solr/HowToContribute, it may be of use to you for this.

Steve

 -Original Message-
 From: syyang [mailto:syyan...@gmail.com]
 Sent: Wednesday, August 24, 2011 10:07 PM
 To: solr-user@lucene.apache.org
 Subject: Newbie question, ant target for packaging source files from
 local copy?
 
 Hi all,
 
 I am trying to package source files containing local changes. While
 running
 ant dist creates a war file containing the local changes, running ant
 package-src-tgz exports files straight from svn repository, and does not
 pick up any of the local changes.
 
 Is there an ant target that I can use to package local copy of the source
 files? Or are are we expected to just write our own?
 
 Thanks,
 -Sid
 
 --
 View this message in context: http://lucene.472066.n3.nabble.com/Newbie-
 question-ant-target-for-packaging-source-files-from-local-copy-
 tp3282787p3282787.html
 Sent from the Solr - User mailing list archive at Nabble.com.


Re: Newbie Question, can I store structured sub elements?

2011-08-25 Thread Erick Erickson
Nope, it's not easy. Solr docs are flat, flat, flat, with the tiny
exception that multiValued fields are returned as lists.

However, you can count on multi-valued fields being returned
in the order they were added, so it might work out for you to
treat these as parallel arrays in Solr documents.
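A quick sketch of that parallel-array idea in schema.xml (the types are whatever
your schema already defines):

  <field name="starttime"   type="date"   indexed="true" stored="true" multiValued="true"/>
  <field name="channelname" type="string" indexed="true" stored="true" multiValued="true"/>

Add the starttime and channelname for each showing in the same order, and
position N in one list lines up with position N in the other.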

Best
Erick

On Thu, Aug 25, 2011 at 3:10 AM, Zac Tolley z...@thetolleys.com wrote:
 I know I can have multi value on them but that doesn't let me see that
 a showing instance happens at a particular time on a particular
 channel, just that it shows on a range of channels at a range of times

 Starting to think I will have to either store a formatted string that
 combines them or keep it flat just for indexing, retrieve ids and use
 them to get data out of the RDBMS


 On 24 Aug 2011, at 23:09, dan whelan d...@adicio.com wrote:

 You could change starttime and channelname to multiValued=true and use these 
 fields to store all the values for those fields.

 showing.movie_id and showing.id probably isn't needed in a solr record.



 On 8/24/11 7:53 AM, Zac Tolley wrote:
 I have a very scenario in which I have a film and showings, each film has
 multiple showings at set times on set channels, so I have:

 Movie
 -
 id
 title
 description
 duration


 Showing
 -
 id
 movie_id
 starttime
 channelname



 I want to know can I store this in solr so that I keep this stucture?

 I did try to do an initial import with the DIH using this config:

 entity name=movie query=SELECT * from movies
    field column=ID name=id/
    field column=TITLE name=title/
    field name=DESCRIPTION name=description/

    entity name=showing query=SELECT * from showings WHERE movie_id = ${
 movie.id}
      field column=ID name=id/
      field column=STARTTIME name=starttime/
      field column=CHANNELNAME name=channelname/
    /entity
 /entity

 I was hoping, for each movie to get a sub entity with the showing like:

 doc
    str name=title./str
    showing
      str name=channelname.



 but instead all the fields are flattened down to the top level.

 I know this must be easy, what am I missing... ?





Re: Newbie Question, can I store structured sub elements?

2011-08-25 Thread Zac Tolley
I have come to that conclusion, so I had to choose between multiple fields with
multiple values or a field with delimited text; I've gone for the former.

On Thu, Aug 25, 2011 at 7:58 PM, Erick Erickson erickerick...@gmail.comwrote:

 nope, it's not easy. Solr docs are flat, flat, flat with the tiny
 exception that multiValued fields are returned as a lists.

 However, you can count on multi-valued fields being returned
 in the order they were added, so it might work out for you to
 treat these as parallel arrays in Solr documents.

 Best
 Erick

 On Thu, Aug 25, 2011 at 3:10 AM, Zac Tolley z...@thetolleys.com wrote:
  I know I can have multi value on them but that doesn't let me see that
  a showing instance happens at a particular time on a particular
  channel, just that it shows on a range of channels at a range of times
 
  Starting to think I will have to either store a formatted string that
  combines them or keep it flat just for indexing, retrieve ids and use
  them to get data out of the RDBMS
 
 
  On 24 Aug 2011, at 23:09, dan whelan d...@adicio.com wrote:
 
  You could change starttime and channelname to multiValued=true and use
 these fields to store all the values for those fields.
 
  showing.movie_id and showing.id probably isn't needed in a solr record.
 
 
 
  On 8/24/11 7:53 AM, Zac Tolley wrote:
  I have a very scenario in which I have a film and showings, each film
 has
  multiple showings at set times on set channels, so I have:
 
  Movie
  -
  id
  title
  description
  duration
 
 
  Showing
  -
  id
  movie_id
  starttime
  channelname
 
 
 
  I want to know can I store this in solr so that I keep this stucture?
 
  I did try to do an initial import with the DIH using this config:
 
  entity name=movie query=SELECT * from movies
 field column=ID name=id/
 field column=TITLE name=title/
 field name=DESCRIPTION name=description/
 
 entity name=showing query=SELECT * from showings WHERE movie_id
 = ${
  movie.id}
   field column=ID name=id/
   field column=STARTTIME name=starttime/
   field column=CHANNELNAME name=channelname/
 /entity
  /entity
 
  I was hoping, for each movie to get a sub entity with the showing like:
 
  doc
 str name=title./str
 showing
   str name=channelname.
 
 
 
  but instead all the fields are flattened down to the top level.
 
  I know this must be easy, what am I missing... ?
 
 
 



Re: Newbie Question, can I store structured sub elements?

2011-08-25 Thread Paul Libbrecht
Delimited text is the baby form of lists.
Text can be made very very structured (think XML, ontologies...).
I think the crux is your search needs.

For example, with Lucene, I made a search for formulæ (including sub-terms) by 
converting the OpenMath-encoded terms into rows of tokens and querying with 
SpanQueries. Quite structured to my taste.

What you don't have is the freedom of joins which brings a very flexible query 
mechanism almost independent of the schema... but this often can be 
circumvented by the flat solr and lucene storage whose performance is really 
amazing.

paul
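
As a very rough illustration of the SpanQuery idea at the Lucene level (the
field name and token encoding below are invented, not Paul's actual scheme):

import org.apache.lucene.index.Term;
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.SpanTermQuery;

public class FormulaSpanExample {
    public static SpanQuery buildQuery() {
        // Each encoded sub-term of a formula becomes one token in a "formula" field;
        // SpanNearQuery then requires the tokens to occur adjacently and in order.
        SpanQuery plus = new SpanTermQuery(new Term("formula", "op:plus"));
        SpanQuery x    = new SpanTermQuery(new Term("formula", "var:x"));
        SpanQuery one  = new SpanTermQuery(new Term("formula", "num:1"));
        return new SpanNearQuery(new SpanQuery[] { plus, x, one }, 0, true);
    }
}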


On 25 August 2011 at 21:07, Zac Tolley wrote:

 have come to that conclusion  so had to choose between multiple fields with
 multiple vales or a field with delimited text, gone for the former
 
 On Thu, Aug 25, 2011 at 7:58 PM, Erick Erickson 
 erickerick...@gmail.comwrote:
 
 nope, it's not easy. Solr docs are flat, flat, flat with the tiny
 exception that multiValued fields are returned as a lists.
 
 However, you can count on multi-valued fields being returned
 in the order they were added, so it might work out for you to
 treat these as parallel arrays in Solr documents.
 
 Best
 Erick
 
 On Thu, Aug 25, 2011 at 3:10 AM, Zac Tolley z...@thetolleys.com wrote:
 I know I can have multi value on them but that doesn't let me see that
 a showing instance happens at a particular time on a particular
 channel, just that it shows on a range of channels at a range of times
 
 Starting to think I will have to either store a formatted string that
 combines them or keep it flat just for indexing, retrieve ids and use
 them to get data out of the RDBMS
 
 
 On 24 Aug 2011, at 23:09, dan whelan d...@adicio.com wrote:
 
 You could change starttime and channelname to multiValued=true and use
 these fields to store all the values for those fields.
 
 showing.movie_id and showing.id probably isn't needed in a solr record.
 
 
 
 On 8/24/11 7:53 AM, Zac Tolley wrote:
 I have a very scenario in which I have a film and showings, each film
 has
 multiple showings at set times on set channels, so I have:
 
 Movie
 -
 id
 title
 description
 duration
 
 
 Showing
 -
 id
 movie_id
 starttime
 channelname
 
 
 
 I want to know can I store this in solr so that I keep this stucture?
 
 I did try to do an initial import with the DIH using this config:
 
 entity name=movie query=SELECT * from movies
   field column=ID name=id/
   field column=TITLE name=title/
   field name=DESCRIPTION name=description/
 
   entity name=showing query=SELECT * from showings WHERE movie_id
 = ${
 movie.id}
 field column=ID name=id/
 field column=STARTTIME name=starttime/
 field column=CHANNELNAME name=channelname/
   /entity
 /entity
 
 I was hoping, for each movie to get a sub entity with the showing like:
 
 doc
   str name=title./str
   showing
 str name=channelname.
 
 
 
 but instead all the fields are flattened down to the top level.
 
 I know this must be easy, what am I missing... ?
 
 
 
 



Re: Newbie Question, can I store structured sub elements?

2011-08-25 Thread Zac Tolley
My search is very simple, mainly on titles, actors, show times and channels.
Having multiple lists of values is probably better for that, and as the
order is kept the same it's relatively simple to map the response back onto
pojos for my presentation layer.

On Thu, Aug 25, 2011 at 8:18 PM, Paul Libbrecht p...@hoplahup.net wrote:

 Delimited text is the baby form of lists.
 Text can be made very very structured (think XML, ontologies...).
 I think the crux is your search needs.

 For example, with Lucene, I made a search for formulæ (including sub-terms)
 by converting the OpenMath-encoded terms into rows of tokens and querying
 with SpanQueries. Quite structured to my taste.

 What you don't have is the freedom of joins which brings a very flexible
 query mechanism almost independent of the schema... but this often can be
 circumvented by the flat solr and lucene storage whose performance is really
 amazing.

 paul


 On 25 August 2011 at 21:07, Zac Tolley wrote:

  have come to that conclusion  so had to choose between multiple fields
 with
  multiple vales or a field with delimited text, gone for the former
 
  On Thu, Aug 25, 2011 at 7:58 PM, Erick Erickson erickerick...@gmail.com
 wrote:
 
  nope, it's not easy. Solr docs are flat, flat, flat with the tiny
  exception that multiValued fields are returned as a lists.
 
  However, you can count on multi-valued fields being returned
  in the order they were added, so it might work out for you to
  treat these as parallel arrays in Solr documents.
 
  Best
  Erick
 
  On Thu, Aug 25, 2011 at 3:10 AM, Zac Tolley z...@thetolleys.com wrote:
  I know I can have multi value on them but that doesn't let me see that
  a showing instance happens at a particular time on a particular
  channel, just that it shows on a range of channels at a range of times
 
  Starting to think I will have to either store a formatted string that
  combines them or keep it flat just for indexing, retrieve ids and use
  them to get data out of the RDBMS
 
 
  On 24 Aug 2011, at 23:09, dan whelan d...@adicio.com wrote:
 
  You could change starttime and channelname to multiValued=true and use
  these fields to store all the values for those fields.
 
  showing.movie_id and showing.id probably isn't needed in a solr
 record.
 
 
 
  On 8/24/11 7:53 AM, Zac Tolley wrote:
  I have a very scenario in which I have a film and showings, each film
  has
  multiple showings at set times on set channels, so I have:
 
  Movie
  -
  id
  title
  description
  duration
 
 
  Showing
  -
  id
  movie_id
  starttime
  channelname
 
 
 
  I want to know can I store this in solr so that I keep this stucture?
 
  I did try to do an initial import with the DIH using this config:
 
  entity name=movie query=SELECT * from movies
field column=ID name=id/
field column=TITLE name=title/
field name=DESCRIPTION name=description/
 
entity name=showing query=SELECT * from showings WHERE movie_id
  = ${
  movie.id}
  field column=ID name=id/
  field column=STARTTIME name=starttime/
  field column=CHANNELNAME name=channelname/
/entity
  /entity
 
  I was hoping, for each movie to get a sub entity with the showing
 like:
 
  doc
str name=title./str
showing
  str name=channelname.
 
 
 
  but instead all the fields are flattened down to the top level.
 
  I know this must be easy, what am I missing... ?
 
 
 
 




Re: Newbie Question, can I store structured sub elements?

2011-08-25 Thread Paul Libbrecht
Whether multi-valued or token-streams, the question is search, not 
(de)serialization: that's opaque to Solr which will take and give it to you as 
needed.

paul


On 25 August 2011 at 21:24, Zac Tolley wrote:

 My search is very simple, mainly on titles, actors, show times and channels.
 Having multiple lists of values is probably better for that, and as the
 order is kept the same its relatively simple to map the response back onto
 pojos for my presentation layer.



Re: Newbie Question, can I store structured sub elements?

2011-08-24 Thread dan whelan
You could change starttime and channelname to multiValued=true and use 
these fields to store all the values for those fields.


showing.movie_id and showing.id probably isn't needed in a solr record.



On 8/24/11 7:53 AM, Zac Tolley wrote:

I have a scenario in which I have a film and showings; each film has
multiple showings at set times on set channels, so I have:

Movie
-
id
title
description
duration


Showing
-
id
movie_id
starttime
channelname



I want to know can I store this in solr so that I keep this stucture?

I did try to do an initial import with the DIH using this config:

entity name=movie query=SELECT * from movies
field column=ID name=id/
field column=TITLE name=title/
field name=DESCRIPTION name=description/

entity name=showing query=SELECT * from showings WHERE movie_id = ${
movie.id}
  field column=ID name=id/
  field column=STARTTIME name=starttime/
  field column=CHANNELNAME name=channelname/
/entity
/entity

I was hoping, for each movie to get a sub entity with the showing like:

doc
str name=title./str
showing
  str name=channelname.



but instead all the fields are flattened down to the top level.

I know this must be easy, what am I missing... ?





Re: Newbie question: how to deal with different # of search results per page due to pagination then grouping

2011-06-02 Thread Michael Sokolov
Just keep one extra facet value hidden; ie request one more than you 
need to show the current page.  If you get it, there are more (show the 
next button), otherwise there aren't.  You can't page arbitrarily deep 
like this, but you can have a next button reliably enabled or disabled.
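
A rough SolrJ sketch of that look-ahead trick, assuming an "author" facet field
and a local Solr URL (both made up for illustration):

import java.util.Collections;
import java.util.List;

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.FacetField;
import org.apache.solr.client.solrj.response.QueryResponse;

public class AuthorFacetPager {
    public static void main(String[] args) throws Exception {
        SolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");
        int pageSize = 25;
        int page = 0;                              // zero-based facet page

        SolrQuery q = new SolrQuery("some keywords");
        q.setRows(0);                              // only the facet counts are needed here
        q.setFacet(true);
        q.addFacetField("author");
        q.setFacetMinCount(1);
        q.set("facet.offset", page * pageSize);
        q.setFacetLimit(pageSize + 1);             // request one extra value as the look-ahead

        QueryResponse rsp = solr.query(q);
        FacetField authors = rsp.getFacetField("author");
        List<FacetField.Count> values =
                authors.getValues() == null ? Collections.<FacetField.Count>emptyList()
                                            : authors.getValues();

        boolean hasNextPage = values.size() > pageSize;   // the hidden extra value showed up
        for (int i = 0; i < Math.min(pageSize, values.size()); i++) {
            System.out.println(values.get(i).getName() + " (" + values.get(i).getCount() + ")");
        }
        System.out.println("enable next button? " + hasNextPage);
    }
}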


On 6/1/2011 5:57 PM, Robert Petersen wrote:

Yes that is exactly the issue... we're thinking just maybe always have a
next button and if you go too far you just get zero results.  User gets
what the user asks for, and so user could simply back up if desired to
where the facet still has values.  Could also detect an empty facet
results on the front end.  You can also only expand one facet only to
allow paging only the facet pane and not the whole page using an ajax
call.



-Original Message-
From: Jonathan Rochkind [mailto:rochk...@jhu.edu]
Sent: Wednesday, June 01, 2011 2:30 PM
To: solr-user@lucene.apache.org
Cc: Robert Petersen
Subject: Re: Newbie question: how to deal with different # of search
results per page due to pagination then grouping

How do you know whether to provide a 'next' button, or whether you are
the end of your facet list?

On 6/1/2011 4:47 PM, Robert Petersen wrote:

I think facet.offset allows facet paging nicely by letting you index
into the list of facet values.  It is working for me...

http://wiki.apache.org/solr/SimpleFacetParameters#facet.offset


-Original Message-
From: Jonathan Rochkind [mailto:rochk...@jhu.edu]
Sent: Wednesday, June 01, 2011 12:41 PM
To: solr-user@lucene.apache.org
Subject: Re: Newbie question: how to deal with different # of search
results per page due to pagination then grouping

There's no great way to do that.

One approach would be using facets, but that will just get you the
author names (as stored in fields), and not the documents under it. If
you really only want to show the author names, facets could work. One
issue with facets though is Solr won't tell you the total number of
facet values for your query, so it's tricky to provide next/prev

paging

through them.

There is also a 'field collapsing' feature that I think is not in a
released Solr, but may be in the Solr repo. I'm not sure it will quite
do what you want either though, although it's related and worth a

look.

http://wiki.apache.org/solr/FieldCollapsing

Another vaguely related thing that is also not yet in a released Solr,
is a 'join' function. That could possibly be used to do what you want,
although it'd be tricky too.
https://issues.apache.org/jira/browse/SOLR-2272

Jonathan

On 6/1/2011 2:56 PM, beccax wrote:

Apologize if this question has already been raised.  I tried

searching

but

couldn't find the relevant posts.

We've indexed a bunch of documents by different authors.  Then for

search

results, we'd like to show the authors that have 1 or more documents
matching the search keywords.

The problem is right now our solr search method first paginates

results to

100 documents per page, then we take the results and group by

authors.

This

results in different number of authors per page.  (Some authors may

only

have one matching document and others 5 or 10.)

How do we change it to somehow show the same number of authors (say

25) per

page?

I mean alternatively we could just show all the documents themselves

ordered

by author, but it's not the user experience we're looking for.

Thanks so much.  And please let me know if you need more details not
provided here.
B

--
View this message in context:

http://lucene.472066.n3.nabble.com/Newbie-question-how-to-deal-with-diff
erent-of-search-results-per-page-due-to-pagination-then-grouping-tp30121

68p3012168.html

Sent from the Solr - User mailing list archive at Nabble.com.





Re: Newbie question: how to deal with different # of search results per page due to pagination then grouping

2011-06-01 Thread Jonathan Rochkind

There's no great way to do that.

One approach would be using facets, but that will just get you the 
author names (as stored in fields), and not the documents under it. If 
you really only want to show the author names, facets could work. One 
issue with facets though is Solr won't tell you the total number of 
facet values for your query, so it's tricky to provide next/prev paging 
through them.


There is also a 'field collapsing' feature that I think is not in a 
released Solr, but may be in the Solr repo. I'm not sure it will quite 
do what you want either though, although it's related and worth a look. 
http://wiki.apache.org/solr/FieldCollapsing


Another vaguely related thing that is also not yet in a released Solr, 
is a 'join' function. That could possibly be used to do what you want, 
although it'd be tricky too. https://issues.apache.org/jira/browse/SOLR-2272


Jonathan

On 6/1/2011 2:56 PM, beccax wrote:

Apologize if this question has already been raised.  I tried searching but
couldn't find the relevant posts.

We've indexed a bunch of documents by different authors.  Then for search
results, we'd like to show the authors that have 1 or more documents
matching the search keywords.

The problem is right now our solr search method first paginates results to
100 documents per page, then we take the results and group by authors.  This
results in different number of authors per page.  (Some authors may only
have one matching document and others 5 or 10.)

How do we change it to somehow show the same number of authors (say 25) per
page?

I mean alternatively we could just show all the documents themselves ordered
by author, but it's not the user experience we're looking for.

Thanks so much.  And please let me know if you need more details not
provided here.
B

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Newbie-question-how-to-deal-with-different-of-search-results-per-page-due-to-pagination-then-grouping-tp3012168p3012168.html
Sent from the Solr - User mailing list archive at Nabble.com.



RE: Newbie question: how to deal with different # of search results per page due to pagination then grouping

2011-06-01 Thread Robert Petersen
Don't manually group by author from your results, the list will always
be incomplete...  use faceting instead to show the authors of the books
you have found in your search.

http://wiki.apache.org/solr/SolrFacetingOverview

-Original Message-
From: beccax [mailto:bec...@gmail.com] 
Sent: Wednesday, June 01, 2011 11:56 AM
To: solr-user@lucene.apache.org
Subject: Newbie question: how to deal with different # of search results
per page due to pagination then grouping

Apologize if this question has already been raised.  I tried searching
but
couldn't find the relevant posts.

We've indexed a bunch of documents by different authors.  Then for
search
results, we'd like to show the authors that have 1 or more documents
matching the search keywords.  

The problem is right now our solr search method first paginates results
to
100 documents per page, then we take the results and group by authors.
This
results in different number of authors per page.  (Some authors may only
have one matching document and others 5 or 10.)

How do we change it to somehow show the same number of authors (say 25)
per
page?

I mean alternatively we could just show all the documents themselves
ordered
by author, but it's not the user experience we're looking for.

Thanks so much.  And please let me know if you need more details not
provided here.
B

--
View this message in context:
http://lucene.472066.n3.nabble.com/Newbie-question-how-to-deal-with-diff
erent-of-search-results-per-page-due-to-pagination-then-grouping-tp30121
68p3012168.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Newbie question: how to deal with different # of search results per page due to pagination then grouping

2011-06-01 Thread Robert Petersen
I think facet.offset allows facet paging nicely by letting you index
into the list of facet values.  It is working for me...

http://wiki.apache.org/solr/SimpleFacetParameters#facet.offset


-Original Message-
From: Jonathan Rochkind [mailto:rochk...@jhu.edu] 
Sent: Wednesday, June 01, 2011 12:41 PM
To: solr-user@lucene.apache.org
Subject: Re: Newbie question: how to deal with different # of search
results per page due to pagination then grouping

There's no great way to do that.

One approach would be using facets, but that will just get you the 
author names (as stored in fields), and not the documents under it. If 
you really only want to show the author names, facets could work. One 
issue with facets though is Solr won't tell you the total number of 
facet values for your query, so it's tricky to provide next/prev paging 
through them.

There is also a 'field collapsing' feature that I think is not in a 
released Solr, but may be in the Solr repo. I'm not sure it will quite 
do what you want either though, although it's related and worth a look. 
http://wiki.apache.org/solr/FieldCollapsing

Another vaguely related thing that is also not yet in a released Solr, 
is a 'join' function. That could possibly be used to do what you want, 
although it'd be tricky too.
https://issues.apache.org/jira/browse/SOLR-2272

Jonathan

On 6/1/2011 2:56 PM, beccax wrote:
 Apologize if this question has already been raised.  I tried searching
but
 couldn't find the relevant posts.

 We've indexed a bunch of documents by different authors.  Then for
search
 results, we'd like to show the authors that have 1 or more documents
 matching the search keywords.

 The problem is right now our solr search method first paginates
results to
 100 documents per page, then we take the results and group by authors.
This
 results in different number of authors per page.  (Some authors may
only
 have one matching document and others 5 or 10.)

 How do we change it to somehow show the same number of authors (say
25) per
 page?

 I mean alternatively we could just show all the documents themselves
ordered
 by author, but it's not the user experience we're looking for.

 Thanks so much.  And please let me know if you need more details not
 provided here.
 B

 --
 View this message in context:
http://lucene.472066.n3.nabble.com/Newbie-question-how-to-deal-with-diff
erent-of-search-results-per-page-due-to-pagination-then-grouping-tp30121
68p3012168.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: Newbie question: how to deal with different # of search results per page due to pagination then grouping

2011-06-01 Thread Jonathan Rochkind
How do you know whether to provide a 'next' button, or whether you are 
the end of your facet list?


On 6/1/2011 4:47 PM, Robert Petersen wrote:

I think facet.offset allows facet paging nicely by letting you index
into the list of facet values.  It is working for me...

http://wiki.apache.org/solr/SimpleFacetParameters#facet.offset


-Original Message-
From: Jonathan Rochkind [mailto:rochk...@jhu.edu]
Sent: Wednesday, June 01, 2011 12:41 PM
To: solr-user@lucene.apache.org
Subject: Re: Newbie question: how to deal with different # of search
results per page due to pagination then grouping

There's no great way to do that.

One approach would be using facets, but that will just get you the
author names (as stored in fields), and not the documents under it. If
you really only want to show the author names, facets could work. One
issue with facets though is Solr won't tell you the total number of
facet values for your query, so it's tricky to provide next/prev paging
through them.

There is also a 'field collapsing' feature that I think is not in a
released Solr, but may be in the Solr repo. I'm not sure it will quite
do what you want either though, although it's related and worth a look.
http://wiki.apache.org/solr/FieldCollapsing

Another vaguely related thing that is also not yet in a released Solr,
is a 'join' function. That could possibly be used to do what you want,
although it'd be tricky too.
https://issues.apache.org/jira/browse/SOLR-2272

Jonathan

On 6/1/2011 2:56 PM, beccax wrote:

Apologize if this question has already been raised.  I tried searching

but

couldn't find the relevant posts.

We've indexed a bunch of documents by different authors.  Then for

search

results, we'd like to show the authors that have 1 or more documents
matching the search keywords.

The problem is right now our solr search method first paginates

results to

100 documents per page, then we take the results and group by authors.

This

results in different number of authors per page.  (Some authors may

only

have one matching document and others 5 or 10.)

How do we change it to somehow show the same number of authors (say

25) per

page?

I mean alternatively we could just show all the documents themselves

ordered

by author, but it's not the user experience we're looking for.

Thanks so much.  And please let me know if you need more details not
provided here.
B

--
View this message in context:

http://lucene.472066.n3.nabble.com/Newbie-question-how-to-deal-with-diff
erent-of-search-results-per-page-due-to-pagination-then-grouping-tp30121
68p3012168.html

Sent from the Solr - User mailing list archive at Nabble.com.



RE: Newbie question: how to deal with different # of search results per page due to pagination then grouping

2011-06-01 Thread Robert Petersen
Yes that is exactly the issue... we're thinking just maybe always have a
next button and if you go too far you just get zero results.  User gets
what the user asks for, and so user could simply back up if desired to
where the facet still has values.  Could also detect empty facet
results on the front end.  You can also expand just one facet at a time, to
allow paging of only the facet pane and not the whole page, using an ajax
call.



-Original Message-
From: Jonathan Rochkind [mailto:rochk...@jhu.edu] 
Sent: Wednesday, June 01, 2011 2:30 PM
To: solr-user@lucene.apache.org
Cc: Robert Petersen
Subject: Re: Newbie question: how to deal with different # of search
results per page due to pagination then grouping

How do you know whether to provide a 'next' button, or whether you are 
the end of your facet list?

On 6/1/2011 4:47 PM, Robert Petersen wrote:
 I think facet.offset allows facet paging nicely by letting you index
 into the list of facet values.  It is working for me...

 http://wiki.apache.org/solr/SimpleFacetParameters#facet.offset


 -Original Message-
 From: Jonathan Rochkind [mailto:rochk...@jhu.edu]
 Sent: Wednesday, June 01, 2011 12:41 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Newbie question: how to deal with different # of search
 results per page due to pagination then grouping

 There's no great way to do that.

 One approach would be using facets, but that will just get you the
 author names (as stored in fields), and not the documents under it. If
 you really only want to show the author names, facets could work. One
 issue with facets though is Solr won't tell you the total number of
 facet values for your query, so it's tricky to provide next/prev
paging
 through them.

 There is also a 'field collapsing' feature that I think is not in a
 released Solr, but may be in the Solr repo. I'm not sure it will quite
 do what you want either though, although it's related and worth a
look.
 http://wiki.apache.org/solr/FieldCollapsing

 Another vaguely related thing that is also not yet in a released Solr,
 is a 'join' function. That could possibly be used to do what you want,
 although it'd be tricky too.
 https://issues.apache.org/jira/browse/SOLR-2272

 Jonathan

 On 6/1/2011 2:56 PM, beccax wrote:
 Apologize if this question has already been raised.  I tried
searching
 but
 couldn't find the relevant posts.

 We've indexed a bunch of documents by different authors.  Then for
 search
 results, we'd like to show the authors that have 1 or more documents
 matching the search keywords.

 The problem is right now our solr search method first paginates
 results to
 100 documents per page, then we take the results and group by
authors.
 This
 results in different number of authors per page.  (Some authors may
 only
 have one matching document and others 5 or 10.)

 How do we change it to somehow show the same number of authors (say
 25) per
 page?

 I mean alternatively we could just show all the documents themselves
 ordered
 by author, but it's not the user experience we're looking for.

 Thanks so much.  And please let me know if you need more details not
 provided here.
 B

 --
 View this message in context:

http://lucene.472066.n3.nabble.com/Newbie-question-how-to-deal-with-diff

erent-of-search-results-per-page-due-to-pagination-then-grouping-tp30121
 68p3012168.html
 Sent from the Solr - User mailing list archive at Nabble.com.



RE: newbie question for DataImportHandler

2011-05-31 Thread Kevin Bootz
In the OP it's stated that the index was deleted. I'm guessing that means the 
physical files, /data/  
quote
populate the table 
 with another million rows of data.
 I remove the index that solr previously create. I restart solr and go 
 to
the
 data import handler development console and do the full import again.
endquote

Is there a separate cache that could be causing the issue? I'm a newbie as well, 
and it seems that if I delete the index there shouldn't be any vestige of the old 
data left anywhere.

Thanks

-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Sunday, May 29, 2011 9:00 PM
To: solr-user@lucene.apache.org
Subject: Re: newbie question for DataImportHandler

This trips up a lot of folks. Solr just marks docs as deleted; the terms etc. 
are left in the index until an optimize is performed, or the segments are 
merged. The latter isn't very predictable, so just do an optimize.

The docs aren't returned as results though.

Best
Erick
On May 24, 2011 10:22 PM, antoniosi antonio...@gmail.com wrote:
 Hi,

 I am new to Solr; apologize in advance if this is a stupid question.

 I have created a simple database, with only 1 table with 3 columns, 
 id, name, and last_update fields.

 I populate the database with 1 million test rows.
 I run solr, go to the data import handler development console and do a
full
 import. I use the Luke tool to look at the content of the lucene index.

 This all works fine so far.

 I remove all the 1 million rows from my table and populate the table 
 with another million rows of data.
 I remove the index that solr previously create. I restart solr and go 
 to
the
 data import handler development console and do the full import again.

 I use the Luke tool to look at the content of the lucene index. 
 However,
I
 am seeing the old data in my new index.

 Doe Solr keeps a cached copy of the index somewhere?

 I hope I have described my problem clearly.

 Thanks in advance.

 --
 View this message in context:
http://lucene.472066.n3.nabble.com/newbie-question-for-DataImportHandler-tp2982277p2982277.html
 Sent from the Solr - User mailing list archive at Nabble.com.


Re: newbie question for DataImportHandler

2011-05-29 Thread Erick Erickson
This trips up a lot of folks. Solr just marks docs as deleted; the terms etc.
are left in the index until an optimize is performed, or the segments are
merged. The latter isn't very predictable, so just do an optimize.

The docs aren't returned as results though.

Best
Erick
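
For reference, the cleanup Erick describes is a one-liner from SolrJ (the URL and
client class are assumptions; a plain <optimize/> posted to /update does the same):

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class OptimizeIndex {
    public static void main(String[] args) throws Exception {
        SolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");
        // Merges segments and physically drops documents that were only marked as
        // deleted, so tools like Luke stop showing the stale terms.
        solr.optimize();
    }
}
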
On May 24, 2011 10:22 PM, antoniosi antonio...@gmail.com wrote:
 Hi,

 I am new to Solr; apologize in advance if this is a stupid question.

 I have created a simple database, with only 1 table with 3 columns, id,
 name, and last_update fields.

 I populate the database with 1 million test rows.
 I run solr, go to the data import handler development console and do a
full
 import. I use the Luke tool to look at the content of the lucene index.

 This all works fine so far.

 I remove all the 1 million rows from my table and populate the table with
 another million rows of data.
 I remove the index that Solr previously created. I restart Solr and go to
the
 data import handler development console and do the full import again.

 I use the Luke tool to look at the content of the lucene index. However,
I
 am seeing the old data in my new index.

 Does Solr keep a cached copy of the index somewhere?

 I hope I have described my problem clearly.

 Thanks in advance.

 --
 View this message in context:
http://lucene.472066.n3.nabble.com/newbie-question-for-DataImportHandler-tp2982277p2982277.html
 Sent from the Solr - User mailing list archive at Nabble.com.


RE: newbie question for DataImportHandler

2011-05-24 Thread Zac Smith
Sounds like you might not be committing the delete. How are you deleting it?
If you run the data import handler with clean=true (which is the default) it 
will delete the data for you anyway so you don't need to delete it yourself.

Hope that helps.
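
If it helps, a full import with clean=true can be kicked off with a plain HTTP
GET; a minimal Java sketch (the handler path and host are assumptions):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

public class TriggerFullImport {
    public static void main(String[] args) throws Exception {
        // clean=true (the default) deletes the old index contents before re-importing,
        // and commit=true makes the result visible once the import finishes.
        URL url = new URL("http://localhost:8983/solr/dataimport"
                + "?command=full-import&clean=true&commit=true");
        BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream(), "UTF-8"));
        for (String line; (line = in.readLine()) != null; ) {
            System.out.println(line);   // DIH returns immediately; poll command=status for progress
        }
        in.close();
    }
}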

-Original Message-
From: antoniosi [mailto:antonio...@gmail.com] 
Sent: Tuesday, May 24, 2011 4:43 PM
To: solr-user@lucene.apache.org
Subject: newbie question for DataImportHandler

Hi,

I am new to Solr; apologize in advance if this is a stupid question.

I have created a simple database, with only 1 table with 3 columns, id, name, 
and last_update fields.

I populate the database with 1 million test rows.
I run solr, go to the data import handler development console and do a full 
import. I use the Luke tool to look at the content of the lucene index.

This all works fine so far.

I remove all the 1 million rows from my table and populate the table with 
another million rows of data.
I remove the index that Solr previously created. I restart Solr and go to the 
data import handler development console and do the full import again.

I use the Luke tool to look at the content of the lucene index. However, I am 
seeing the old data in my new index.

Does Solr keep a cached copy of the index somewhere?

I hope I have described my problem clearly.

Thanks in advance.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/newbie-question-for-DataImportHandler-tp2982277p2982277.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: A Newbie Question

2010-11-15 Thread Lance Norskog
There is no current feature is what I meant. Yes, it would be very 
handy to do this.


I handled this problem in the DIH by creating two documents, both with 
the same unique ID. The first doc just had the metadata. The second 
document parsed the input with Tika, but had 'skip doc on error' set. 
So, if the parsing worked, the parsed document overwrote the first 
document. If parsing failed, the metadata-only document went in.


Works quite well!
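
Outside DIH the same fallback collapses into a single try/catch; a rough SolrJ
plus Tika sketch (field names are invented, and it assumes the Tika facade class
from a reasonably recent Tika is on the classpath):

import java.io.File;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;
import org.apache.tika.Tika;

public class IndexWithFallback {
    public static void main(String[] args) throws Exception {
        SolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");
        File f = new File(args[0]);

        // Always index the metadata; only add the body if parsing succeeds.
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", f.getAbsolutePath());
        doc.addField("filename", f.getName());

        try {
            String body = new Tika().parseToString(f);   // auto-detects the file type
            doc.addField("text", body);
        } catch (Exception parseFailed) {
            doc.addField("parse_note", "sorry, not parsed");
        }

        solr.add(doc);
        solr.commit();
    }
}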

Ken Krugler wrote:


On Nov 14, 2010, at 3:02pm, Lance Norskog wrote:


Yes, the ExtractingRequestHandler uses Tika to parse many file formats.

Solr 1.4.1 uses a previous version of Tika (0.6 or 0.7).

Here's the problem with Tika and extraction utilities in general: 
they are not perfect. They will fail on some files. In the 
ExtractingRequestHandler's case, there is no way to let it fail in 
parsing but save the document's metadata anyway with a notation: 
sorry not parsed.


By there is no way do you mean in configuring the current 
ExtractingRequestHandler? Or is there some fundamental issue with how 
Solr uses Tika that prevents ExtractingRequestHandler from being 
modified to work this way (which seems like a useful configuration 
settings)?


Regards,

-- Ken

I would rather have the unix 'strings' command parse my documents 
(thanks to a co-worker for this).


K. Seshadri Iyer wrote:

Thanks for all the responses.

Govind: To answer your question, yes, all I want to search is plain 
text
files. They are located in NFS directories across multiple 
Solaris/Linux

storage boxes. The total storage is in hundreds of terabytes.

I have just got started with Solr and my understanding is that I will
somehow need Tika to help stream/upload files to Solr. I don't know 
anything
about Java programming, being a system admin. So far, I have read 
that the
autodetect parser in Tika will somehow detect the file type and I 
can use
the stream to populate Solr. How, that is still a mystery to me - 
working on

it. Any tips appreciated; thanks in advance.

Sesh



On 13 November 2010 15:24, Govind Kanshigovind.kan...@gmail.com  
wrote:



Another pov you might want to think about - what kind of search you 
want.

Just plain - full text search or there is something more to those text
files. Are they grouped in folders? Do the folders imply certain 
kind of

grouping/hierarchy/tagging?

I recently was trying to help somebody who had files across lot of 
places
grouped by date/subject/author - he wanted to ensure these are 
fields

which too can act as filters/navigators.

Just an input - ignore it if you just want plain full text search.

On Sat, Nov 13, 2010 at 11:25 AM, Lance Norskoggoks...@gmail.com  
wrote:



About web servers: Solr is a servlet war file and needs a Java web 
server

container to run. The example/ folder in the Solr disribution uses
'Jetty', and this is fine for small production-quality projects.  
You can

just copy the example/ directory somewhere to set up your own running


Solr;


that's what I always do.

About indexing programs: if you know Unix scripting, it may be 
easiest to

walk the file system yourself with the 'find' program and create Solr


input


XML files.

But yes, you definitely want the Solr 1.4 Enterprise manual. I spent


months

learning this stuff very slowly, and the book would have been 
great back

then.

Lance


Erick Erickson wrote:



Think of the data import handler (DIH) as Solr pulling data to index
from some source based on configuration. So, once you set up
your DIH config to point to your file system, you issue a command
to solr like OK, do your data import thing. See the
FileListEntityProcessor.
http://wiki.apache.org/solr/DataImportHandler

http://wiki.apache.org/solr/DataImportHandlerSolrJ is a clent 
library

you'd use to push data to Solr. Basically, you
write a Java program that uses SolrJ to walk the file system, find
documents, create a Solr document and sent that to Solr. It's not
nearly as complex as it soundsG. See:
http://wiki.apache.org/solr/Solrj

http://wiki.apache.org/solr/SolrjIt's probably worth your while to


get


a
copy of Solr 1.4, Enterprise Search Server
by Erik Pugh and David Smiley.

Best
Erick

On Fri, Nov 12, 2010 at 8:37 AM, K. Seshadri 
Iyerseshadri...@gmail.com



wrote:






Hi Lance,

Thank you very much for responding (not sure how I reply to the 
group,

so,
writing to you).

Can you please expand on your suggestion? I am not a web guy and 
so,

don't
know where to start.

What is the difference between SolrJ and DataImportHandler? Do I 
need



to


set
up web servers on all my storage boxes?

Apologies for the basic level of questions, but hope I can get 
started

and
implement this before the year end (you know why :o)

Thanks,

Sesh

On 12 November 2010 13:31, Lance Norskoggoks...@gmail.com   
wrote:






Using 'curl' is fine. There is a library called SolrJ for Java and
other libraries for other scripting languages that let you 
upload with
more control. There is a 

Re: A Newbie Question

2010-11-14 Thread K. Seshadri Iyer
Thanks for all the responses.

Govind: To answer your question, yes, all I want to search is plain text
files. They are located in NFS directories across multiple Solaris/Linux
storage boxes. The total storage is in hundreds of terabytes.

I have just got started with Solr and my understanding is that I will
somehow need Tika to help stream/upload files to Solr. I don't know anything
about Java programming, being a system admin. So far, I have read that the
autodetect parser in Tika will somehow detect the file type and I can use
the stream to populate Solr. How, that is still a mystery to me - working on
it. Any tips appreciated; thanks in advance.

Sesh
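
For what it's worth, one common route is to let Solr run Tika server-side via the
ExtractingRequestHandler; a minimal SolrJ sketch (the id/field names are made up,
and the exact addFile signature differs between SolrJ versions):

import java.io.File;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.request.AbstractUpdateRequest;
import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;

public class ExtractOneFile {
    public static void main(String[] args) throws Exception {
        SolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");
        File f = new File(args[0]);

        ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/update/extract");
        req.addFile(f);                                    // newer SolrJ: addFile(f, "text/plain")
        req.setParam("literal.id", f.getAbsolutePath());   // unique key for this document
        req.setParam("literal.filename", f.getName());
        req.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);

        solr.request(req);
    }
}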



On 13 November 2010 15:24, Govind Kanshi govind.kan...@gmail.com wrote:

 Another pov you might want to think about - what kind of search you want.
 Just plain - full text search or there is something more to those text
 files. Are they grouped in folders? Do the folders imply certain kind of
 grouping/hierarchy/tagging?

 I recently was trying to help somebody who had files across lot of places
 grouped by date/subject/author - he wanted to ensure these are fields
 which too can act as filters/navigators.

 Just an input - ignore it if you just want plain full text search.

 On Sat, Nov 13, 2010 at 11:25 AM, Lance Norskog goks...@gmail.com wrote:

  About web servers: Solr is a servlet war file and needs a Java web server
  container to run. The example/ folder in the Solr disribution uses
  'Jetty', and this is fine for small production-quality projects.  You can
  just copy the example/ directory somewhere to set up your own running
 Solr;
  that's what I always do.
 
  About indexing programs: if you know Unix scripting, it may be easiest to
  walk the file system yourself with the 'find' program and create Solr
 input
  XML files.
 
  But yes, you definitely want the Solr 1.4 Enterprise manual. I spent
 months
  learning this stuff very slowly, and the book would have been great back
  then.
 
  Lance
 
 
  Erick Erickson wrote:
 
  Think of the data import handler (DIH) as Solr pulling data to index
  from some source based on configuration. So, once you set up
  your DIH config to point to your file system, you issue a command
  to solr like OK, do your data import thing. See the
  FileListEntityProcessor.
  http://wiki.apache.org/solr/DataImportHandler
 
  http://wiki.apache.org/solr/DataImportHandlerSolrJ is a clent library
  you'd use to push data to Solr. Basically, you
  write a Java program that uses SolrJ to walk the file system, find
  documents, create a Solr document and sent that to Solr. It's not
  nearly as complex as it soundsG. See:
  http://wiki.apache.org/solr/Solrj
 
  http://wiki.apache.org/solr/SolrjIt's probably worth your while to
 get
  a
  copy of Solr 1.4, Enterprise Search Server
  by Erik Pugh and David Smiley.
 
  Best
  Erick
 
  On Fri, Nov 12, 2010 at 8:37 AM, K. Seshadri Iyerseshadri...@gmail.com
  wrote:
 
 
 
  Hi Lance,
 
  Thank you very much for responding (not sure how I reply to the group,
  so,
  writing to you).
 
  Can you please expand on your suggestion? I am not a web guy and so,
  don't
  know where to start.
 
  What is the difference between SolrJ and DataImportHandler? Do I need
 to
  set
  up web servers on all my storage boxes?
 
  Apologies for the basic level of questions, but hope I can get started
  and
  implement this before the year end (you know why :o)
 
  Thanks,
 
  Sesh
 
  On 12 November 2010 13:31, Lance Norskoggoks...@gmail.com  wrote:
 
 
 
  Using 'curl' is fine. There is a library called SolrJ for Java and
  other libraries for other scripting languages that let you upload with
  more control. There is a thing in Solr called the DataImportHandler
  that lets you script walking a file system.
 
  On Thu, Nov 11, 2010 at 8:38 PM, K. Seshadri Iyer
 seshadri...@gmail.com
 
  wrote:
 
 
  Hi,
 
  Pardon me if this sounds very elementary, but I have a very basic
 
 
  question
 
 
  regarding Solr search. I have about 10 storage devices running
 Solaris
 
 
  with
 
 
  hundreds of thousands of text files (there are other files, as well,
 
 
  but
 
 
  my
 
 
  target is these text files). The directories on the Solaris boxes are
  exported and are available as NFS mounts.
 
  I have installed Solr 1.4 on a Linux box and have tested the
 
 
  installation,
 
 
  using curl to post  documents. However, the manual says that curl is
 
 
  not
 
 
  the
 
 
  recommended way of posting documents to Solr. Could someone please
 tell
 
 
  me
 
 
  what is the preferred approach in such an environment? I am not a
 
 
  programmer
 
 
  and would appreciate some hand-holding here :o)
 
  Thanks in advance,
 
  Sesh
 
 
 
 
 
  --
  Lance Norskog
  goks...@gmail.com
 
 
 
 
 
 
 
 



Re: A Newbie Question

2010-11-14 Thread Lance Norskog

Yes, the ExtractingRequestHandler uses Tika to parse many file formats.

Solr 1.4.1 uses a previous version of Tika (0.6 or 0.7).

Here's the problem with Tika and extraction utilities in general: they 
are not perfect. They will fail on some files. In the 
ExtractingRequestHandler's case, there is no way to let it fail in 
parsing but save the document's metadata anyway with a notation: sorry 
not parsed.  I would rather have the unix 'strings' command parse my 
documents (thanks to a co-worker for this).


K. Seshadri Iyer wrote:

Thanks for all the responses.

Govind: To answer your question, yes, all I want to search is plain text
files. They are located in NFS directories across multiple Solaris/Linux
storage boxes. The total storage is in hundreds of terabytes.

I have just got started with Solr and my understanding is that I will
somehow need Tika to help stream/upload files to Solr. I don't know anything
about Java programming, being a system admin. So far, I have read that the
autodetect parser in Tika will somehow detect the file type and I can use
the stream to populate Solr. How, that is still a mystery to me - working on
it. Any tips appreciated; thanks in advance.

Sesh



On 13 November 2010 15:24, Govind Kanshigovind.kan...@gmail.com  wrote:

   

Another pov you might want to think about - what kind of search you want.
Just plain - full text search or there is something more to those text
files. Are they grouped in folders? Do the folders imply certain kind of
grouping/hierarchy/tagging?

I recently was trying to help somebody who had files across lot of places
grouped by date/subject/author - he wanted to ensure these are fields
which too can act as filters/navigators.

Just an input - ignore it if you just want plain full text search.

On Sat, Nov 13, 2010 at 11:25 AM, Lance Norskoggoks...@gmail.com  wrote:

 

About web servers: Solr is a servlet war file and needs a Java web server
container to run. The example/ folder in the Solr disribution uses
'Jetty', and this is fine for small production-quality projects.  You can
just copy the example/ directory somewhere to set up your own running
   

Solr;
 

that's what I always do.

About indexing programs: if you know Unix scripting, it may be easiest to
walk the file system yourself with the 'find' program and create Solr
   

input
 

XML files.

But yes, you definitely want the Solr 1.4 Enterprise manual. I spent
   

months
 

learning this stuff very slowly, and the book would have been great back
then.

Lance


Erick Erickson wrote:

   

Think of the data import handler (DIH) as Solr pulling data to index
from some source based on configuration. So, once you set up
your DIH config to point to your file system, you issue a command
to solr like OK, do your data import thing. See the
FileListEntityProcessor.
http://wiki.apache.org/solr/DataImportHandler

http://wiki.apache.org/solr/DataImportHandlerSolrJ is a clent library
you'd use to push data to Solr. Basically, you
write a Java program that uses SolrJ to walk the file system, find
documents, create a Solr document and sent that to Solr. It's not
nearly as complex as it soundsG. See:
http://wiki.apache.org/solr/Solrj

http://wiki.apache.org/solr/SolrjIt's probably worth your while to
 

get
 

a
copy of Solr 1.4, Enterprise Search Server
by Erik Pugh and David Smiley.

Best
Erick

On Fri, Nov 12, 2010 at 8:37 AM, K. Seshadri Iyerseshadri...@gmail.com
 

wrote:
   



 

Hi Lance,

Thank you very much for responding (not sure how I reply to the group,
so,
writing to you).

Can you please expand on your suggestion? I am not a web guy and so,
don't
know where to start.

What is the difference between SolrJ and DataImportHandler? Do I need
   

to
 

set
up web servers on all my storage boxes?

Apologies for the basic level of questions, but hope I can get started
and
implement this before the year end (you know why :o)

Thanks,

Sesh

On 12 November 2010 13:31, Lance Norskoggoks...@gmail.com   wrote:



   

Using 'curl' is fine. There is a library called SolrJ for Java and
other libraries for other scripting languages that let you upload with
more control. There is a thing in Solr called the DataImportHandler
that lets you script walking a file system.

On Thu, Nov 11, 2010 at 8:38 PM, K. Seshadri Iyer
 

seshadri...@gmail.com
 

wrote:


 

Hi,

Pardon me if this sounds very elementary, but I have a very basic


   

question


 

regarding Solr search. I have about 10 storage devices running
   

Solaris
 


   

with


 

hundreds of thousands of text files (there are other files, as well,


   

but
 


   

my


 

target is these text files). The directories on the Solaris boxes are
exported and are available as NFS mounts.

I have installed Solr 1.4 on a 

Re: A Newbie Question

2010-11-14 Thread Ken Krugler


On Nov 14, 2010, at 3:02pm, Lance Norskog wrote:

Yes, the ExtractingRequestHandler uses Tika to parse many file  
formats.


Solr 1.4.1 uses a previous version of Tika (0.6 or 0.7).

Here's the problem with Tika and extraction utilities in general:  
they are not perfect. They will fail on some files. In the  
ExtractingRequestHandler's case, there is no way to let it fail in  
parsing but save the document's metadata anyway with a notation:  
sorry not parsed.


By there is no way do you mean in configuring the current  
ExtractingRequestHandler? Or is there some fundamental issue with how  
Solr uses Tika that prevents ExtractingRequestHandler from being  
modified to work this way (which seems like a useful configuration  
settings)?


Regards,

-- Ken

I would rather have the unix 'strings' command parse my documents  
(thanks to a co-worker for this).


K. Seshadri Iyer wrote:

Thanks for all the responses.

Govind: To answer your question, yes, all I want to search is plain  
text
files. They are located in NFS directories across multiple Solaris/ 
Linux

storage boxes. The total storage is in hundreds of terabytes.

I have just got started with Solr and my understanding is that I will
somehow need Tika to help stream/upload files to Solr. I don't know  
anything
about Java programming, being a system admin. So far, I have read  
that the
autodetect parser in Tika will somehow detect the file type and I  
can use
the stream to populate Solr. How, that is still a mystery to me -  
working on

it. Any tips appreciated; thanks in advance.

Sesh



On 13 November 2010 15:24, Govind Kanshigovind.kan...@gmail.com   
wrote:



Another pov you might want to think about - what kind of search  
you want.
Just plain - full text search or there is something more to those  
text
files. Are they grouped in folders? Do the folders imply certain  
kind of

grouping/hierarchy/tagging?

I recently was trying to help somebody who had files across lot of  
places
grouped by date/subject/author - he wanted to ensure these are  
fields

which too can act as filters/navigators.

Just an input - ignore it if you just want plain full text search.

On Sat, Nov 13, 2010 at 11:25 AM, Lance  
Norskoggoks...@gmail.com  wrote:



About web servers: Solr is a servlet war file and needs a Java  
web server
container to run. The example/ folder in the Solr disribution  
uses
'Jetty', and this is fine for small production-quality projects.   
You can
just copy the example/ directory somewhere to set up your own  
running



Solr;


that's what I always do.

About indexing programs: if you know Unix scripting, it may be  
easiest to
walk the file system yourself with the 'find' program and create  
Solr



input


XML files.

But yes, you definitely want the Solr 1.4 Enterprise manual. I  
spent



months

learning this stuff very slowly, and the book would have been  
great back

then.

Lance


Erick Erickson wrote:


Think of the data import handler (DIH) as Solr pulling data to  
index

from some source based on configuration. So, once you set up
your DIH config to point to your file system, you issue a command
to solr like OK, do your data import thing. See the
FileListEntityProcessor.
http://wiki.apache.org/solr/DataImportHandler

http://wiki.apache.org/solr/DataImportHandlerSolrJ is a clent  
library

you'd use to push data to Solr. Basically, you
write a Java program that uses SolrJ to walk the file system, find
documents, create a Solr document and sent that to Solr. It's not
nearly as complex as it soundsG. See:
http://wiki.apache.org/solr/Solrj

http://wiki.apache.org/solr/SolrjIt's probably worth your  
while to



get


a
copy of Solr 1.4, Enterprise Search Server
by Erik Pugh and David Smiley.

Best
Erick

On Fri, Nov 12, 2010 at 8:37 AM, K. Seshadri Iyerseshadri...@gmail.com


wrote:






Hi Lance,

Thank you very much for responding (not sure how I reply to the  
group,

so,
writing to you).

Can you please expand on your suggestion? I am not a web guy  
and so,

don't
know where to start.

What is the difference between SolrJ and DataImportHandler? Do  
I need



to


set
up web servers on all my storage boxes?

Apologies for the basic level of questions, but hope I can get  
started

and
implement this before the year end (you know why :o)

Thanks,

Sesh

On 12 November 2010 13:31, Lance Norskoggoks...@gmail.com
wrote:





Using 'curl' is fine. There is a library called SolrJ for Java  
and
other libraries for other scripting languages that let you  
upload with
more control. There is a thing in Solr called the  
DataImportHandler

that lets you script walking a file system.

On Thu, Nov 11, 2010 at 8:38 PM, K. Seshadri Iyer


seshadri...@gmail.com


wrote:




Hi,

Pardon me if this sounds very elementary, but I have a very  
basic





question




regarding Solr search. I have about 10 storage devices running


Solaris





with



hundreds of thousands of text files (there are other files,  
as well,






Re: A Newbie Question

2010-11-13 Thread Govind Kanshi
Another pov you might want to think about - what kind of search you want.
Just plain - full text search or there is something more to those text
files. Are they grouped in folders? Do the folders imply certain kind of
grouping/hierarchy/tagging?

I recently was trying to help somebody who had files across lot of places
grouped by date/subject/author - he wanted to ensure these are fields
which too can act as filters/navigators.

Just an input - ignore it if you just want plain full text search.

On Sat, Nov 13, 2010 at 11:25 AM, Lance Norskog goks...@gmail.com wrote:

 About web servers: Solr is a servlet war file and needs a Java web server
 container to run. The example/ folder in the Solr distribution uses
 'Jetty', and this is fine for small production-quality projects.  You can
 just copy the example/ directory somewhere to set up your own running Solr;
 that's what I always do.

 About indexing programs: if you know Unix scripting, it may be easiest to
 walk the file system yourself with the 'find' program and create Solr input
 XML files.

 But yes, you definitely want the Solr 1.4 Enterprise manual. I spent months
 learning this stuff very slowly, and the book would have been great back
 then.

 Lance


 Erick Erickson wrote:

 Think of the data import handler (DIH) as Solr pulling data to index
 from some source based on configuration. So, once you set up
 your DIH config to point to your file system, you issue a command
 to solr like OK, do your data import thing. See the
 FileListEntityProcessor.
 http://wiki.apache.org/solr/DataImportHandler

  SolrJ is a client library
 you'd use to push data to Solr. Basically, you
 write a Java program that uses SolrJ to walk the file system, find
  documents, create a Solr document and send that to Solr. It's not
  nearly as complex as it sounds <g>. See:
 http://wiki.apache.org/solr/Solrj

  It's probably worth your while to get
 a
 copy of Solr 1.4, Enterprise Search Server
 by Erik Pugh and David Smiley.

 Best
 Erick

 On Fri, Nov 12, 2010 at 8:37 AM, K. Seshadri Iyerseshadri...@gmail.com
 wrote:



 Hi Lance,

 Thank you very much for responding (not sure how I reply to the group,
 so,
 writing to you).

 Can you please expand on your suggestion? I am not a web guy and so,
 don't
 know where to start.

 What is the difference between SolrJ and DataImportHandler? Do I need to
 set
 up web servers on all my storage boxes?

 Apologies for the basic level of questions, but hope I can get started
 and
 implement this before the year end (you know why :o)

 Thanks,

 Sesh

 On 12 November 2010 13:31, Lance Norskoggoks...@gmail.com  wrote:



 Using 'curl' is fine. There is a library called SolrJ for Java and
 other libraries for other scripting languages that let you upload with
 more control. There is a thing in Solr called the DataImportHandler
 that lets you script walking a file system.

 On Thu, Nov 11, 2010 at 8:38 PM, K. Seshadri Iyerseshadri...@gmail.com

 wrote:


 Hi,

 Pardon me if this sounds very elementary, but I have a very basic


 question


 regarding Solr search. I have about 10 storage devices running Solaris


 with


 hundreds of thousands of text files (there are other files, as well,


 but


 my


 target is these text files). The directories on the Solaris boxes are
 exported and are available as NFS mounts.

 I have installed Solr 1.4 on a Linux box and have tested the


 installation,


 using curl to post  documents. However, the manual says that curl is


 not


 the


 recommended way of posting documents to Solr. Could someone please tell


 me


 what is the preferred approach in such an environment? I am not a


 programmer


 and would appreciate some hand-holding here :o)

 Thanks in advance,

 Sesh





 --
 Lance Norskog
 goks...@gmail.com










Re: A Newbie Question

2010-11-12 Thread Lance Norskog
Using 'curl' is fine. There is a library called SolrJ for Java and
other libraries for other scripting languages that let you upload with
more control. There is a thing in Solr called the DataImportHandler
that lets you script walking a file system.

On Thu, Nov 11, 2010 at 8:38 PM, K. Seshadri Iyer seshadri...@gmail.com wrote:
 Hi,

 Pardon me if this sounds very elementary, but I have a very basic question
 regarding Solr search. I have about 10 storage devices running Solaris with
 hundreds of thousands of text files (there are other files, as well, but my
 target is these text files). The directories on the Solaris boxes are
 exported and are available as NFS mounts.

 I have installed Solr 1.4 on a Linux box and have tested the installation,
 using curl to post  documents. However, the manual says that curl is not the
 recommended way of posting documents to Solr. Could someone please tell me
 what is the preferred approach in such an environment? I am not a programmer
 and would appreciate some hand-holding here :o)

 Thanks in advance,

 Sesh




-- 
Lance Norskog
goks...@gmail.com


Re: A Newbie Question

2010-11-12 Thread K. Seshadri Iyer
Hi Lance,

Thank you very much for responding (not sure how I reply to the group, so,
writing to you).

Can you please expand on your suggestion? I am not a web guy and so, don't
know where to start.

What is the difference between SolrJ and DataImportHandler? Do I need to set
up web servers on all my storage boxes?

Apologies for the basic level of questions, but hope I can get started and
implement this before the year end (you know why :o)

Thanks,

Sesh

On 12 November 2010 13:31, Lance Norskog goks...@gmail.com wrote:

 Using 'curl' is fine. There is a library called SolrJ for Java and
 other libraries for other scripting languages that let you upload with
 more control. There is a thing in Solr called the DataImportHandler
 that lets you script walking a file system.

 On Thu, Nov 11, 2010 at 8:38 PM, K. Seshadri Iyer seshadri...@gmail.com
 wrote:
  Hi,
 
  Pardon me if this sounds very elementary, but I have a very basic
 question
  regarding Solr search. I have about 10 storage devices running Solaris
 with
  hundreds of thousands of text files (there are other files, as well, but
 my
  target is these text files). The directories on the Solaris boxes are
  exported and are available as NFS mounts.
 
  I have installed Solr 1.4 on a Linux box and have tested the
 installation,
  using curl to post  documents. However, the manual says that curl is not
 the
  recommended way of posting documents to Solr. Could someone please tell
 me
  what is the preferred approach in such an environment? I am not a
 programmer
  and would appreciate some hand-holding here :o)
 
  Thanks in advance,
 
  Sesh
 



 --
 Lance Norskog
 goks...@gmail.com



Re: A Newbie Question

2010-11-12 Thread Erick Erickson
Think of the data import handler (DIH) as Solr pulling data to index
from some source based on configuration. So, once you set up
your DIH config to point to your file system, you issue a command
to Solr like "OK, do your data import thing". See the
FileListEntityProcessor.
http://wiki.apache.org/solr/DataImportHandler

SolrJ is a client library you'd use to push data to Solr. Basically, you
write a Java program that uses SolrJ to walk the file system, find
documents, create a Solr document and send that to Solr. It's not
nearly as complex as it sounds <g>. See:
http://wiki.apache.org/solr/Solrj

It's probably worth your while to get a
copy of "Solr 1.4, Enterprise Search Server"
by Erik Pugh and David Smiley.

Best
Erick
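
A rough, untested sketch of that SolrJ route - walking a directory tree and
pushing text files into Solr. The field names, the mount point and the Solr URL
here are just placeholders, and there is no error handling:

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class FileIndexer {
    public static void main(String[] args) throws Exception {
        SolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");
        index(solr, new File("/mnt/storage1"));   // e.g. one of the NFS mounts
        solr.commit();                            // one commit at the end of the run
    }

    static void index(SolrServer solr, File f) throws Exception {
        if (f.isDirectory()) {
            File[] children = f.listFiles();
            if (children == null) return;
            for (File child : children) {
                index(solr, child);
            }
        } else if (f.getName().endsWith(".txt")) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", f.getAbsolutePath());   // the path doubles as the unique key
            doc.addField("filename", f.getName());
            doc.addField("text", readFile(f));
            solr.add(doc);
        }
    }

    static String readFile(File f) throws Exception {
        StringBuilder sb = new StringBuilder();
        BufferedReader r = new BufferedReader(new FileReader(f));
        String line;
        while ((line = r.readLine()) != null) {
            sb.append(line).append('\n');
        }
        r.close();
        return sb.toString();
    }
}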

On Fri, Nov 12, 2010 at 8:37 AM, K. Seshadri Iyer seshadri...@gmail.com wrote:

 Hi Lance,

 Thank you very much for responding (not sure how I reply to the group, so,
 writing to you).

 Can you please expand on your suggestion? I am not a web guy and so, don't
 know where to start.

 What is the difference between SolrJ and DataImportHandler? Do I need to
 set
 up web servers on all my storage boxes?

 Apologies for the basic level of questions, but hope I can get started and
 implement this before the year end (you know why :o)

 Thanks,

 Sesh

 On 12 November 2010 13:31, Lance Norskog goks...@gmail.com wrote:

  Using 'curl' is fine. There is a library called SolrJ for Java and
  other libraries for other scripting languages that let you upload with
  more control. There is a thing in Solr called the DataImportHandler
  that lets you script walking a file system.
 
  On Thu, Nov 11, 2010 at 8:38 PM, K. Seshadri Iyer seshadri...@gmail.com
 
  wrote:
   Hi,
  
   Pardon me if this sounds very elementary, but I have a very basic
  question
   regarding Solr search. I have about 10 storage devices running Solaris
  with
   hundreds of thousands of text files (there are other files, as well,
 but
  my
   target is these text files). The directories on the Solaris boxes are
   exported and are available as NFS mounts.
  
   I have installed Solr 1.4 on a Linux box and have tested the
  installation,
   using curl to post  documents. However, the manual says that curl is
 not
  the
   recommended way of posting documents to Solr. Could someone please tell
  me
   what is the preferred approach in such an environment? I am not a
  programmer
   and would appreciate some hand-holding here :o)
  
   Thanks in advance,
  
   Sesh
  
 
 
 
  --
  Lance Norskog
  goks...@gmail.com
 



Re: A Newbie Question

2010-11-12 Thread Lance Norskog
About web servers: Solr is a servlet war file and needs a Java web 
server container to run. The example/ folder in the Solr distribution 
uses 'Jetty', and this is fine for small production-quality projects.  
You can just copy the example/ directory somewhere to set up your own 
running Solr; that's what I always do.


About indexing programs: if you know Unix scripting, it may be easiest 
to walk the file system yourself with the 'find' program and create Solr 
input XML files.


But yes, you definitely want the Solr 1.4 Enterprise manual. I spent 
months learning this stuff very slowly, and the book would have been 
great back then.


Lance

Erick Erickson wrote:

Think of the data import handler (DIH) as Solr pulling data to index
from some source based on configuration. So, once you set up
your DIH config to point to your file system, you issue a command
to Solr like "OK, do your data import thing". See the
FileListEntityProcessor.
http://wiki.apache.org/solr/DataImportHandler

SolrJ is a client library you'd use to push data to Solr. Basically, you
write a Java program that uses SolrJ to walk the file system, find
documents, create a Solr document and send that to Solr. It's not
nearly as complex as it sounds <g>. See:
http://wiki.apache.org/solr/Solrj

It's probably worth your while to get a
copy of "Solr 1.4, Enterprise Search Server"
by Erik Pugh and David Smiley.

Best
Erick

On Fri, Nov 12, 2010 at 8:37 AM, K. Seshadri Iyer seshadri...@gmail.com wrote:

 Hi Lance,

 Thank you very much for responding (not sure how I reply to the group, so,
 writing to you).

 Can you please expand on your suggestion? I am not a web guy and so, don't
 know where to start.

 What is the difference between SolrJ and DataImportHandler? Do I need to
 set up web servers on all my storage boxes?

 Apologies for the basic level of questions, but hope I can get started and
 implement this before the year end (you know why :o)

 Thanks,

 Sesh

 On 12 November 2010 13:31, Lance Norskog goks...@gmail.com wrote:

  Using 'curl' is fine. There is a library called SolrJ for Java and
  other libraries for other scripting languages that let you upload with
  more control. There is a thing in Solr called the DataImportHandler
  that lets you script walking a file system.

  On Thu, Nov 11, 2010 at 8:38 PM, K. Seshadri Iyer seshadri...@gmail.com
  wrote:
   Hi,
  
   Pardon me if this sounds very elementary, but I have a very basic question
   regarding Solr search. I have about 10 storage devices running Solaris with
   hundreds of thousands of text files (there are other files, as well, but my
   target is these text files). The directories on the Solaris boxes are
   exported and are available as NFS mounts.
  
   I have installed Solr 1.4 on a Linux box and have tested the installation,
   using curl to post documents. However, the manual says that curl is not the
   recommended way of posting documents to Solr. Could someone please tell me
   what is the preferred approach in such an environment? I am not a programmer
   and would appreciate some hand-holding here :o)
  
   Thanks in advance,
  
   Sesh

  --
  Lance Norskog
  goks...@gmail.com


A Newbie Question

2010-11-11 Thread K. Seshadri Iyer
Hi,

Pardon me if this sounds very elementary, but I have a very basic question
regarding Solr search. I have about 10 storage devices running Solaris with
hundreds of thousands of text files (there are other files, as well, but my
target is these text files). The directories on the Solaris boxes are
exported and are available as NFS mounts.

I have installed Solr 1.4 on a Linux box and have tested the installation,
using curl to post  documents. However, the manual says that curl is not the
recommended way of posting documents to Solr. Could someone please tell me
what is the preferred approach in such an environment? I am not a programmer
and would appreciate some hand-holding here :o)

Thanks in advance,

Sesh


Re: Newbie question: no search results

2010-09-05 Thread BobG

Hi Lance and Gora,

Thanks for your support!

I have changed

 <fields>
   <field name="Shop_artikel_rg" type="string" indexed="true" stored="true"/>
   <field name="Artikel" type="string" indexed="true" stored="true"/>
   <field name="Omschrijving" type="string" indexed="true" stored="true"/>
 </fields>

into

 <fields>
   <field name="Shop_artikel_rg" type="string" indexed="true" stored="true"/>
   <field name="Artikel" type="text" indexed="true" stored="true"/>
   <field name="Omschrijving" type="text" indexed="true" stored="true"/>
 </fields>

In the schema.xml, then restarted Tomcat. 
Also I used quotation marks for my search string and it works fine now!

Problem solved!

Best regards,
Bob

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Newbie-question-no-search-results-tp1416482p1422211.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Newbie question: no search results

2010-09-04 Thread Gora Mohanty
On Sat, 4 Sep 2010 01:15:11 -0700 (PDT)
BobG b...@bitwise-bncc.nl wrote:

 
 Hi,
 I am trying to set up a new SOLR search engine on a windows
 platform. It seems like I managed to fill an index with the
 contents of my SQL server table.
 
 When I use the default *.* query I get a nice result:
[...]
 However when I try to query the table with a search term I get no
 results.
[...]

 What am I doing wrong here?
[...]

Could you post your schema.xml. In particular, how is the Artikel
field being processed at index/query time?

Regards,
Gora


Re: Newbie question: no search results

2010-09-04 Thread Lance Norskog
More directly: if the 'Artikel' field is a string, only the whole 
string will match:

Artikel:"Kerstman baardstel."
Or you can use a wildcard:  Kerstmann*  or just Kerst*

If it is a text field, it is chopped into words and 
q=Artikel:Kerstmann would work.


Gora Mohanty wrote:

 On Sat, 4 Sep 2010 01:15:11 -0700 (PDT)
 BobG b...@bitwise-bncc.nl wrote:

  Hi,
  I am trying to set up a new SOLR search engine on a windows
  platform. It seems like I managed to fill an index with the
  contents of my SQL server table.

  When I use the default *.* query I get a nice result:
 [...]
  However when I try to query the table with a search term I get no
  results.
 [...]
  What am I doing wrong here?
 [...]

 Could you post your schema.xml. In particular, how is the Artikel
 field being processed at index/query time?

 Regards,
 Gora


RE: Newbie question about search behavior

2010-08-16 Thread Markus Jelsma
You can append it in your middleware, or try the EdgeNGramTokenizer [1]. If 
you're going for the latter, don't forget to reindex and expect a larger index.

 

[1]: 
http://lucene.apache.org/java/2_9_0/api/all/org/apache/lucene/analysis/ngram/EdgeNGramTokenizer.html
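
If you go the middleware route, the change is just a bit of string handling
before the query is sent. A rough SolrJ sketch (untested; it assumes an existing
SolrServer instance named server, and real code should also escape Lucene
special characters in the user input):

String userInput = "News";                       // what the user typed
String q = "title:" + userInput.trim() + "*";    // append the trailing wildcard
SolrQuery query = new SolrQuery(q);
QueryResponse rsp = server.query(query);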
-Original message-
From: Mike Thomsen mikerthom...@gmail.com
Sent: Mon 16-08-2010 19:09
To: solr-user@lucene.apache.org; 
Subject: Newbie question about search behavior

Is it possible to set up Lucene to treat a keyword search such as

title:News

implicitly like

title:News*

so that any title that begins with News will be returned without the
user having to throw in a wildcard?

Also, are there any common filters and such that are generally
considered a good practice to throw into the schema for an
English-language website?

Thanks,

Mike

 


setting up schema (newbie question)

2010-07-20 Thread Travis Low
I have a large database table with many document records, and I plan to use
SOLR to improve the searching for the documents.

The twist here is that perhaps 50% of the records will originate from
outside sources, and sometimes those records may be updated versions of
documents we already have.  Currently, a human visually examines the
incoming information and performs a few document searches, and decides if a
new document must be created, or an existing one should be updated.  We
would like to automate the matching to some extent, and it occurs to me that
SOLR might be useful for this as well.

Each document has many attributes that can be used for matching.  The
attributes are all in lookup tables.  For example, there is a location
field that might be something like "Central Public Library, Crawford, NE"
for row with id #.  The incoming document might have something like
"Crawford Central Public Library, Nebraska", which ideally would map to
# as well.

I'm currently thinking that a two-phase import might work.  First, we use
SOLR to try and get a list of attribute ids for the incoming document.
Those can be used for ordinary database queries to find primary keys of
potential matches.  Then we use SOLR again to search the reduced list for
the unstructured information, essentially by including those primary keys as
part of the search.

I was looking at the example for DIH here:
http://wiki.apache.org/solr/DataImportHandler and it is clear, but it is
obviously slanted toward finding the products.  I need to find the categories so
that I can *then* find the products, if that makes sense.

Any suggestions on how to proceed?  My first thought is that I should set up
two SOLR instances, one for indexing only attributes, and one for the
documents themselves.

Thanks in advance for any help.

cheers,

Travis


Re: setting up schema (newbie question)

2010-07-20 Thread Lance Norskog
This is a DIH plug-in that lets you search Solr directly in the processing chain.

https://issues.apache.org/jira/browse/SOLR-1499

You can fetch a database record, search Solr, then search the DB again
using the return values.

Lance

On Tue, Jul 20, 2010 at 1:35 PM, Travis Low t...@4centurion.com wrote:
 I have a large database table with many document records, and I plan to use
 SOLR to improve the searching for the documents.

 The twist here is that perhaps 50% of the records will originate from
 outside sources, and sometimes those records may be updated versions of
 documents we already have.  Currently, a human visually examines the
 incoming information and performs a few document searches, and decides if a
 new document must be created, or an existing one should be updated.  We
 would like to automate the matching to some extent, and it occurs to me that
 SOLR might be useful for this as well.

 Each document has many attributes that can be used for matching.  The
 attributes are all in lookup tables.  For example, there is a location
 field that might be something like Central Public Library, Crawford, NE
 for row with id #.  The incoming document might have something like
 Crawford Central Public Library, Nebraska, which ideally would map to
 # as well.

 I'm currently thinking that a two-phase import might work.  First, we use
 SOLR to try and get a list of attribute ids for the incoming document.
 Those can be used for ordinary database queries to find primary keys of
 potential matches.  Then we use SOLR again to search the reduced list for
 the unstructured information, essentially by including those primary keys as
 part of the search.

 I was looking at the example for DIH here:
 http://wiki.apache.org/solr/DataImportHandler and it is clear, but it
 obviously slanted on finding the products.  I need to find the categories so
 that I can *then* find the products, if that makes sense.

 Any suggestions on how to proceed?  My first thought is that I should set up
 two SOLR instances, one for indexing only attributes, and one for the
 documents themselves.

 Thanks in advance for any help.

 cheers,

 Travis




-- 
Lance Norskog
goks...@gmail.com


Re: newbie question on how to batch commit documents

2010-06-01 Thread findbestopensource
Add the commit after the loop. I would advise doing the commit in a separate
thread. I keep a separate timer thread, where every minute I do a
commit, and at the end of every day I optimize the index.

Regards
Aditya
www.findbestopensource.com
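
A rough sketch of that timer-thread idea with SolrJ (untested; the core URL is a
placeholder, and the indexing threads are assumed to keep calling server.add()
without committing themselves):

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class TimedCommitter {
    public static void main(String[] args) throws Exception {
        final SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();

        // commit once a minute instead of once per add
        timer.scheduleAtFixedRate(new Runnable() {
            public void run() {
                try {
                    server.commit();
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        }, 1, 1, TimeUnit.MINUTES);

        // optimize once a day
        timer.scheduleAtFixedRate(new Runnable() {
            public void run() {
                try {
                    server.optimize();
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        }, 1, 1, TimeUnit.DAYS);
    }
}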


On Tue, Jun 1, 2010 at 2:57 AM, Steve Kuo kuosen...@gmail.com wrote:

 I have a newbie question on what is the best way to batch add/commit a
 large
 collection of document data via solrj.  My first attempt  was to write a
 multi-threaded application that did following.

 Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
 for (Widget w : widges) {
    doc.addField("id", w.getId());
    doc.addField("name", w.getName());
    doc.addField("price", w.getPrice());
    doc.addField("category", w.getCat());
    doc.addField("srcType", w.getSrcType());
    docs.add(doc);

    // commit docs to solr server
    server.add(docs);
    server.commit();
 }

 And I got this exception.

 org.apache.solr.common.SolrException:

 Error_opening_new_searcher_exceeded_limit_of_maxWarmingSearchers2_try_again_later


 Error_opening_new_searcher_exceeded_limit_of_maxWarmingSearchers2_try_again_later

at
 org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:424)
at
 org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:243)
at
 org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
at
 org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:86)

 The solrj wiki/documents seemed to indicate that because multiple threads
 were calling SolrServer.commit() which in term called
 CommonsHttpSolrServer.request() resulting in multiple searchers.  My first
 thought was to change the configs for autowarming.  But after looking at
 the
 autowarm params, I am not sure what can be changed or perhaps a different
 approach is recommened.

<filterCache
  class="solr.FastLRUCache"
  size="512"
  initialSize="512"
  autowarmCount="0"/>

<queryResultCache
  class="solr.LRUCache"
  size="512"
  initialSize="512"
  autowarmCount="0"/>

<documentCache
  class="solr.LRUCache"
  size="512"
  initialSize="512"
  autowarmCount="0"/>

 Your help is much appreciated.



Re: newbie question on how to batch commit documents

2010-06-01 Thread olivier sallou
I would additionally suggest using EmbeddedSolrServer for large uploads if
possible; performance is better.

2010/5/31 Steve Kuo kuosen...@gmail.com

 I have a newbie question on what is the best way to batch add/commit a
 large
 collection of document data via solrj.  My first attempt  was to write a
 multi-threaded application that did following.

 Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
 for (Widget w : widges) {
    doc.addField("id", w.getId());
    doc.addField("name", w.getName());
    doc.addField("price", w.getPrice());
    doc.addField("category", w.getCat());
    doc.addField("srcType", w.getSrcType());
    docs.add(doc);

    // commit docs to solr server
    server.add(docs);
    server.commit();
 }

 And I got this exception.

 org.apache.solr.common.SolrException:

 Error_opening_new_searcher_exceeded_limit_of_maxWarmingSearchers2_try_again_later


 Error_opening_new_searcher_exceeded_limit_of_maxWarmingSearchers2_try_again_later

at
 org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:424)
at
 org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:243)
at
 org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
at
 org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:86)

 The solrj wiki/documents seemed to indicate that because multiple threads
 were calling SolrServer.commit() which in term called
 CommonsHttpSolrServer.request() resulting in multiple searchers.  My first
 thought was to change the configs for autowarming.  But after looking at
 the
 autowarm params, I am not sure what can be changed or perhaps a different
 approach is recommened.

<filterCache
  class="solr.FastLRUCache"
  size="512"
  initialSize="512"
  autowarmCount="0"/>

<queryResultCache
  class="solr.LRUCache"
  size="512"
  initialSize="512"
  autowarmCount="0"/>

<documentCache
  class="solr.LRUCache"
  size="512"
  initialSize="512"
  autowarmCount="0"/>

 Your help is much appreciated.



Re: newbie question on how to batch commit documents

2010-06-01 Thread Chris Hostetter

: CommonsHttpSolrServer.request() resulting in multiple searchers.  My first
: thought was to change the configs for autowarming.  But after looking at the
: autowarm params, I am not sure what can be changed or perhaps a different
: approach is recommened.

even with 0 autowarming (which is what you have) it can still take time to
close/open a searcher on every commit -- which is why a commit per doc is
not usually a good idea (and is *definitely* not a good idea when doing
batch indexing).

most people can get away with just doing one commit after all their docs
have been added (ie: at the end of the batch), but if you've got a lot of
distinct clients doing a lot of parallel indexing and you don't want to
coordinate who is responsible for sending the commit, you can configure
autocommit to happen on the solr server...

http://wiki.apache.org/solr/SolrConfigXml#Update_Handler_Section

...but in general you should make sure that your clients sending docs can 
deal with the occasional long delays (or possibly even needing to retry) 
when an occasional commit might block add/delete operations because of an 
expensive segment merge.

-Hoss



Re: newbie question on how to batch commit documents

2010-05-31 Thread Erik Hatcher
Move the commit outside your loop and you'll be in better shape.   
Better yet, enable autocommit in solrconfig.xml and don't commit from  
your multithreaded client, otherwise you still run the risk of too  
many commits happening concurrently.


Erik
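
A rough sketch of the reworked loop (untested; it assumes the Widget class and
the SolrServer instance from the original post, and it also creates the
SolrInputDocument inside the loop, which the original snippet left out):

Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
for (Widget w : widgets) {
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", w.getId());
    doc.addField("name", w.getName());
    doc.addField("price", w.getPrice());
    doc.addField("category", w.getCat());
    doc.addField("srcType", w.getSrcType());
    docs.add(doc);
}

// one add and one commit for the whole batch, outside the loop
server.add(docs);
server.commit();

With autocommit enabled in solrconfig.xml, the final commit() call can be
dropped entirely.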

On May 31, 2010, at 5:27 PM, Steve Kuo wrote:

I have a newbie question on what is the best way to batch add/commit a large
collection of document data via solrj.  My first attempt was to write a
multi-threaded application that did following.

Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
for (Widget w : widges) {
   doc.addField("id", w.getId());
   doc.addField("name", w.getName());
   doc.addField("price", w.getPrice());
   doc.addField("category", w.getCat());
   doc.addField("srcType", w.getSrcType());
   docs.add(doc);

   // commit docs to solr server
   server.add(docs);
   server.commit();
}

And I got this exception.

org.apache.solr.common.SolrException:
Error_opening_new_searcher_exceeded_limit_of_maxWarmingSearchers2_try_again_later

Error_opening_new_searcher_exceeded_limit_of_maxWarmingSearchers2_try_again_later

	at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:424)
	at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:243)
	at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
	at org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:86)

The solrj wiki/documents seemed to indicate that because multiple threads
were calling SolrServer.commit() which in term called
CommonsHttpSolrServer.request() resulting in multiple searchers.  My first
thought was to change the configs for autowarming.  But after looking at the
autowarm params, I am not sure what can be changed or perhaps a different
approach is recommened.

   <filterCache
     class="solr.FastLRUCache"
     size="512"
     initialSize="512"
     autowarmCount="0"/>

   <queryResultCache
     class="solr.LRUCache"
     size="512"
     initialSize="512"
     autowarmCount="0"/>

   <documentCache
     class="solr.LRUCache"
     size="512"
     initialSize="512"
     autowarmCount="0"/>

Your help is much appreciated.




Re: Newbie Question on Custom Query Generation

2010-01-29 Thread Erik Hatcher
dismax won't quite give you the same query result.  What you can do
pretty easily, though, is create a QParser and QParserPlugin pair,
register it in solrconfig.xml, and then use defType=<the name registered>.
Pretty straightforward.  Have a look at Solr's various QParserPlugin
implementations for details.


Erik
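
For anyone looking for a starting point, a bare, untested skeleton of such a
pair. Package locations and the exact init()/parse() signatures vary a little
between Solr versions, and the field names just echo the earlier mail, so treat
this as a sketch rather than a drop-in class:

import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.solr.common.params.SolrParams;
import org.apache.solr.common.util.NamedList;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.search.QParser;
import org.apache.solr.search.QParserPlugin;

// Registered in solrconfig.xml with something like:
//   <queryParser name="myparser" class="com.example.MyQParserPlugin"/>
// and selected per request with defType=myparser
public class MyQParserPlugin extends QParserPlugin {

    public void init(NamedList args) {}

    public QParser createParser(String qstr, SolrParams localParams,
                                SolrParams params, SolrQueryRequest req) {
        return new QParser(qstr, localParams, params, req) {
            public Query parse() {
                // a real parser would run the field analyzers; this only shows
                // the shape of the boosted BooleanQuery from the Lucene code
                String term = getString().toLowerCase();

                BooleanQuery must = new BooleanQuery();
                must.add(new TermQuery(new Term("tags", term)), BooleanClause.Occur.SHOULD);
                must.add(new TermQuery(new Term("title", term)), BooleanClause.Occur.SHOULD);
                must.setBoost(5.0f);

                BooleanQuery top = new BooleanQuery();
                top.add(must, BooleanClause.Occur.MUST);
                top.add(new TermQuery(new Term("functionalArea", term)), BooleanClause.Occur.SHOULD);
                return top;
            }
        };
    }
}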

On Jan 29, 2010, at 12:30 AM, Abin Mathew wrote:

Hi I want to generate my own customized query from the input string  
entered

by the user. It should look something like this

*Search field : Microsoft*
*
Generated Query*  :
description:microsoft +((tags:microsoft^1.5 title:microsoft^3.0
role:microsoft requirement:microsoft company:microsoft city:microsoft)^5.0)
tags:microsoft^2.0 title:microsoft^3.5 functionalArea:microsoft

*The lucene code we used is like this*
BooleanQuery must = new BooleanQuery();

addToBooleanQuery(must, "tags", inputData, synonymAnalyzer, 1.5f);
addToBooleanQuery(must, "title", inputData, synonymAnalyzer);
addToBooleanQuery(must, "role", inputData, synonymAnalyzer);
addToBooleanQuery(query, "description", inputData, synonymAnalyzer);
addToBooleanQuery(must, "requirement", inputData, synonymAnalyzer);
addToBooleanQuery(must, "company", inputData, standardAnalyzer);
addToBooleanQuery(must, "city", inputData, standardAnalyzer);
must.setBoost(5.0f);
query.add(must, Occur.MUST);
addToBooleanQuery(query, "tags", includeAll, synonymAnalyzer, 2.0f);
addToBooleanQuery(query, "title", includeAll, synonymAnalyzer, 3.5f);
addToBooleanQuery(query, "functionalArea", inputData, synonymAnalyzer);

*
In Simple english*
addToBooleanQuery will add the particular field to the query after  
analysing

using the analyser mentioned and setting a boost as specified
So there MUST be a keyword match with any of the fields
tags,title,role,description,requirement,company,city and it SHOULD  
occur

in the fields tags,title and functionalArea.

Hope you have got an idea of my requirement. I am not asking anyone  
to do it
for me. Please let me know where can i start and give me some useful  
tips to
move ahead with this. I believe that it has to do with modifying the  
XML
configuration file and setting the parameters in Dismax handler. But  
I am

still not sure. Please help

Thanks  Regards
Abin Mathew




Re: Newbie Question on Custom Query Generation

2010-01-29 Thread Wangsheng Mei
What's the point of generating your own query?
Are you sure that solr query syntax cannot satisfy your need?

2010/1/29 Abin Mathew abin.mat...@toostep.com

 Hi I want to generate my own customized query from the input string entered
 by the user. It should look something like this

 *Search field : Microsoft*
 *
 Generated Query*  :
 description:microsoft +((tags:microsoft^1.5 title:microsoft^3.0
 role:microsoft requirement:microsoft company:microsoft city:microsoft)^5.0)
 tags:microsoft^2.0 title:microsoft^3.5 functionalArea:microsoft

 *The lucene code we used is like this*
 BooleanQuery must = new BooleanQuery();

 addToBooleanQuery(must, "tags", inputData, synonymAnalyzer, 1.5f);
 addToBooleanQuery(must, "title", inputData, synonymAnalyzer);
 addToBooleanQuery(must, "role", inputData, synonymAnalyzer);
 addToBooleanQuery(query, "description", inputData, synonymAnalyzer);
 addToBooleanQuery(must, "requirement", inputData, synonymAnalyzer);
 addToBooleanQuery(must, "company", inputData, standardAnalyzer);
 addToBooleanQuery(must, "city", inputData, standardAnalyzer);
 must.setBoost(5.0f);
 query.add(must, Occur.MUST);
 addToBooleanQuery(query, "tags", includeAll, synonymAnalyzer, 2.0f);
 addToBooleanQuery(query, "title", includeAll, synonymAnalyzer, 3.5f);
 addToBooleanQuery(query, "functionalArea", inputData, synonymAnalyzer);
 *
 In Simple english*
 addToBooleanQuery will add the particular field to the query after
 analysing
 using the analyser mentioned and setting a boost as specified
 So there MUST be a keyword match with any of the fields
 tags,title,role,description,requirement,company,city and it SHOULD occur
 in the fields tags,title and functionalArea.

 Hope you have got an idea of my requirement. I am not asking anyone to do
 it
 for me. Please let me know where can i start and give me some useful tips
 to
 move ahead with this. I believe that it has to do with modifying the XML
 configuration file and setting the parameters in Dismax handler. But I am
 still not sure. Please help

 Thanks  Regards
 Abin Mathew




-- 
梅旺生


Re: Newbie Question on Custom Query Generation

2010-01-29 Thread Abin Mathew
Hi, I realized the power of the Dismax Query Handler recently and now I
don't need to generate my own query since Dismax is giving better
results. Thanks a lot.
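
For reference, the dismax equivalent can be driven entirely by parameters. A
rough SolrJ sketch (untested; the qf/pf boosts just mirror the hand-built query
from the earlier mail and assume an existing SolrServer instance named server):

SolrQuery q = new SolrQuery("Microsoft");
q.setQueryType("dismax");   // or pass defType=dismax, depending on the handler setup
q.set("qf", "tags^1.5 title^3.0 role requirement company city description");
q.set("pf", "tags^2.0 title^3.5 functionalArea");
QueryResponse rsp = server.query(q);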

2010/1/29 Wangsheng Mei hairr...@gmail.com:
 What's the point of generating your own query?
 Are you sure that solr query syntax cannot satisfy your need?

 2010/1/29 Abin Mathew abin.mat...@toostep.com

 Hi I want to generate my own customized query from the input string entered
 by the user. It should look something like this

 *Search field : Microsoft*
 *
 Generated Query*  :
 description:microsoft +((tags:microsoft^1.5 title:microsoft^3.0
 role:microsoft requirement:microsoft company:microsoft city:microsoft)^5.0)
 tags:microsoft^2.0 title:microsoft^3.5 functionalArea:microsoft

 *The lucene code we used is like this*
 BooleanQuery must = new BooleanQuery();

 addToBooleanQuery(must, "tags", inputData, synonymAnalyzer, 1.5f);
 addToBooleanQuery(must, "title", inputData, synonymAnalyzer);
 addToBooleanQuery(must, "role", inputData, synonymAnalyzer);
 addToBooleanQuery(query, "description", inputData, synonymAnalyzer);
 addToBooleanQuery(must, "requirement", inputData, synonymAnalyzer);
 addToBooleanQuery(must, "company", inputData, standardAnalyzer);
 addToBooleanQuery(must, "city", inputData, standardAnalyzer);
 must.setBoost(5.0f);
 query.add(must, Occur.MUST);
 addToBooleanQuery(query, "tags", includeAll, synonymAnalyzer, 2.0f);
 addToBooleanQuery(query, "title", includeAll, synonymAnalyzer, 3.5f);
 addToBooleanQuery(query, "functionalArea", inputData, synonymAnalyzer);
 *
 In Simple english*
 addToBooleanQuery will add the particular field to the query after
 analysing
 using the analyser mentioned and setting a boost as specified
 So there MUST be a keyword match with any of the fields
 tags,title,role,description,requirement,company,city and it SHOULD occur
 in the fields tags,title and functionalArea.

 Hope you have got an idea of my requirement. I am not asking anyone to do
 it
 for me. Please let me know where can i start and give me some useful tips
 to
 move ahead with this. I believe that it has to do with modifying the XML
 configuration file and setting the parameters in Dismax handler. But I am
 still not sure. Please help

 Thanks  Regards
 Abin Mathew




 --
 梅旺生



Re: Solr + MySQL newbie question

2010-01-28 Thread Erik Hatcher
Solr has the DataImportHandler framework that allows a straightforward  
configuration to control indexing from any relational database (with  
JDBC support).  See http://wiki.apache.org/solr/DataImportHandler for  
details.


If you did go the Java route (though not recommended at this point),  
using the SolrJ library you won't ever see XML (nor would there even  
be XML involved if configured for the binary protocol).


Erik
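
For completeness, a rough sketch of that Java/SolrJ route (untested; the JDBC
URL, credentials, table and field names are all placeholders, and DIH remains
the recommended option):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class MySqlIndexer {
    public static void main(String[] args) throws Exception {
        Connection db = DriverManager.getConnection(
            "jdbc:mysql://localhost/mydb", "user", "password");
        SolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");

        Statement st = db.createStatement();
        ResultSet rs = st.executeQuery("SELECT id, title, body FROM articles");
        while (rs.next()) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", rs.getString("id"));
            doc.addField("title", rs.getString("title"));
            doc.addField("body", rs.getString("body"));
            solr.add(doc);
        }
        solr.commit();   // one commit at the end of the batch
        rs.close();
        st.close();
        db.close();
    }
}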

On Jan 28, 2010, at 2:47 AM, Manish Gulati wrote:

I am planning to use Solr to power search on the site. Our db is  
mysql and we need to index some tables in the schema into Solr.  
Based on my initial research it appears that I need to write a java  
program that will create xml documents (say mydocs.xml) with add  
command and then use this command to index it in Solr java -jar  
post.jar mydocs.xml.


Kindly let me know if this is fine or some other sophiscticated  
solution exist for mysql synching.


--
Manish




Solr + MySQL newbie question

2010-01-27 Thread Manish Gulati
I am planning to use Solr to power search on the site. Our db is mysql and we 
need to index some tables in the schema into Solr. Based on my initial research 
it appears that I need to write a java program that will create xml documents 
(say mydocs.xml) with add command and then use this command to index it in Solr 
java -jar post.jar mydocs.xml. 

Kindly let me know if this is fine or some other sophiscticated solution exist 
for mysql synching. 

--
Manish


Re: Newbie question

2009-05-13 Thread Wayne Pope

Hello Shalin,

Thank you for your help, yes it answers my question.

Much appreciated



Shalin Shekhar Mangar wrote:
 
 On Tue, May 12, 2009 at 9:48 PM, Wayne Pope
 waynemailingli...@gmail.com wrote:
 

 I have this request:


 http://localhost:8983/solr/select?start=0rows=20qt=dismaxq=copyhl=truehl.snippets=4hl.fragsize=50facet=truefacet.mincount=1facet.limit=8facet.field=typefq=company-id%3A1wt=javabinversion=2.2

 (I've been using this to see it rendered in the browser:

 http://localhost:8983/solr/select?indent=onversion=2.2q=copystart=0rows=10fl=*%2Cscoreqt=standardwt=standardexplainOther=hl=onhl.fl=featureshl=truehl.fragsize=50
 )


 that I've been trying out. I get a good responce - however the
 hl.fragsize
 is ignored and the hl.fragsize in the solrconfig.xml is ignored. Instead
 I
 get back the whole document (10,000 chars!) in the doc txt field. And
 bizarely the response header is this:

 
 hl.fragsize is relevant only for the snippets created by the highlighter.
 The returned fields will always have the complete data for a document.
 Does
 that answer your question?
 
 -- 
 Regards,
 Shalin Shekhar Mangar.
 
 

-- 
View this message in context: 
http://www.nabble.com/Newbie-question-tp23505802p23518485.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Newbie question

2009-05-12 Thread Shalin Shekhar Mangar
On Tue, May 12, 2009 at 9:48 PM, Wayne Pope waynemailingli...@gmail.com wrote:


 I have this request:


 http://localhost:8983/solr/select?start=0rows=20qt=dismaxq=copyhl=truehl.snippets=4hl.fragsize=50facet=truefacet.mincount=1facet.limit=8facet.field=typefq=company-id%3A1wt=javabinversion=2.2

 (I've been using this to see it rendered in the browser:

 http://localhost:8983/solr/select?indent=onversion=2.2q=copystart=0rows=10fl=*%2Cscoreqt=standardwt=standardexplainOther=hl=onhl.fl=featureshl=truehl.fragsize=50
 )


 that I've been trying out. I get a good response - however the hl.fragsize
 is ignored and the hl.fragsize in the solrconfig.xml is ignored. Instead I
 get back the whole document (10,000 chars!) in the doc txt field. And
 bizarrely the response header is this:


hl.fragsize is relevant only for the snippets created by the highlighter.
The returned fields will always have the complete data for a document. Does
that answer your question?

-- 
Regards,
Shalin Shekhar Mangar.


Re: newbie question about indexing RSS feeds with SOLR

2009-04-28 Thread Koji Sekiguchi
Just an FYI: I've never tried it, but there seems to be an RSS feed sample in DIH:

http://wiki.apache.org/solr/DataImportHandler#head-e68aa93c9ca7b8d261cede2bf1d6110ab1725476

Koji

Tom H wrote:
 Hi,

 I've just downloaded solr and got it working, it seems pretty cool.

 I have a project which needs to maintain an index of articles that were
 published on the web via rss feed.

 Basically I need to watch some rss feeds, and search and index the items
 to be searched.

 Additionally, I need to run jobs based on particular keywords or events
 during parsing.

 is this something that I can do with SOLR? are their any related
 projects using SOLR that are better suited to indexing specific xml
 types like RSS?

 I had a look at the project enormo which appears to be a property
 lettings and sales listing aggregator. But I can see that they must have
 solved some of the problems I am thinking of such as scheduled indexing
 of remote resources, and writing a parser to get data fields from some
 other sites templates.

 Any advice would be welcome...

 Many Thanks,

 Tom




   



SOLR newbie question: How to filter the results based on my Unique Key

2009-02-28 Thread Venu Mittal
Hi List,

Is it possible to filter out duplicate results using a particular field in
the document?
e.g.

<doc>
  <field name="cust_id">123</field>
  <field name="unique_id">1</field>
  <field name="email">a...@b.com</field>
</doc>
<doc>
  <field name="cust_id">123</field>
  <field name="unique_id">2</field>
  <field name="email">a...@b.com</field>
</doc>

Now if I search for email = a...@b.com I get 2 search results, but I want to
return just one record because the cust_id is the same. Is it possible, or do I
need to handle it in the calling application?
 
Thanks
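
If it ends up being handled in the calling application, a rough SolrJ sketch of
collapsing on cust_id client-side (untested; assumes a SolrServer instance named
server and that cust_id is a stored field):

SolrQuery q = new SolrQuery("email:a...@b.com");
QueryResponse rsp = server.query(q);

Map<String, SolrDocument> byCustId = new LinkedHashMap<String, SolrDocument>();
for (SolrDocument d : rsp.getResults()) {
    String custId = String.valueOf(d.getFieldValue("cust_id"));
    if (!byCustId.containsKey(custId)) {
        byCustId.put(custId, d);   // keep only the first hit per customer
    }
}
// byCustId.values() now holds one document per cust_id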


  
