Re: Newbie question - Error loading an existing config file

2019-02-21 Thread Erick Erickson
I have absolutely no idea when it comes to Drupal, the Drupal folks would be
much better equipped to answer.

Best,
Erick

> On Feb 21, 2019, at 8:16 AM, Greg Robinson  wrote:
> 
> Thanks for the feedback.
> 
> So here is where I'm at.
> 
> I first went ahead and deleted the existing core that was returning the
> error using the following command: bin/solr delete -c new_solr_core
> 
> Now when I access the admin panel, there are no errors.
> 
> I then referred to the large "warning" box on the CREATE action
> documentation:
> 
> "While it’s possible to create a core for a non-existent collection, this
> approach is not supported and not recommended. Always create a collection
> using the *Collections API* before creating a core directly for it."
> 
> I then tried to create a collection using the following "Collections API"
> command:
> 
> http://localhost:8983/solr/admin/collections?action=CREATE&name=newCollection&numShards=2&replicationFactor=1&wt=xml
> 
> This was the response:
> 
> 
> 
> <?xml version="1.0" encoding="UTF-8"?>
> <response>
>   <lst name="responseHeader">
>     <int name="status">400</int>
>     <int name="QTime">2</int>
>   </lst>
>   <lst name="error">
>     <lst name="metadata">
>       <str name="error-class">org.apache.solr.common.SolrException</str>
>       <str name="root-error-class">org.apache.solr.common.SolrException</str>
>     </lst>
>     <str name="msg">Solr instance is not running in SolrCloud mode.</str>
>     <int name="code">400</int>
>   </lst>
> </response>
> 
> 
> 
> I guess my main question is, do I need to be running in "SolrCloud mode" if
> my intention is to use Solr Server to index a Drupal 7 website? We're
> currently using opensolr.com which is working fine but we're trying to
> avoid the monthly costs associated with their "Shared Solr Cloud" plan.
> 
> Thanks!
> 
> 
> 
> 
> 
> On Wed, Feb 20, 2019 at 8:34 PM Shawn Heisey  wrote:
> 
>> On 2/20/2019 11:07 AM, Greg Robinson wrote:
>>> Let's try this: https://imgur.com/a/z5OzbLW
>>> 
>>> What I'm trying to do seems pretty straightforward:
>>> 
>>> 1. Install Solr Server 7.4 on Linux (Completed)
>>> 2. Connect my Drupal 7 site to the Solr Server and use it for indexing
>>> content
>>> 
>>> My understanding is that I must first create a core in order to connect
>> my
>>> drupal site to Solr Server. This is where I'm currently stuck.
>> 
>> The assertion in your screenshot that the dataDir must exist is
>> incorrect.  If current versions of Solr say this also, that is something
>> we will need to change.  This is what actually happens:  If all the
>> other requirements are met and the dataDir does not exist, it will be
>> created automatically when the core starts, if the process has
>> sufficient permissions.
>> 
>> See the large "warning" box on the CREATE action documentation for
>> details on what you need:
>> 
>> 
>> https://lucene.apache.org/solr/guide/7_4/coreadmin-api.html#coreadmin-create
>> 
>> The warning box is the one that has a red triangle to the left of it.
>> The red triangle contains an exclamation point.
>> 
>> The essence of what it says there is that the core's instance directory
>> must exist, that directory must contain a "conf" directory, and all
>> required config files must be in the conf directory.
>> 
>> If you're running in SolrCloud mode, then you're using the wrong API.
>> 
>> Thanks,
>> Shawn
>> 
> 
> 
> -- 
> Greg Robinson
> CEO - Mobile*Enhanced*
> www.mobileenhanced.com
> g...@mobileenhanced.com
> 303-598-1865



Re: Newbie question - Error loading an existing config file

2019-02-21 Thread Greg Robinson
Thanks for the feedback.

So here is where I'm at.

I first went ahead and deleted the existing core that was returning the
error using the following command: bin/solr delete -c new_solr_core

Now when I access the admin panel, there are no errors.

I then referred to the large "warning" box on the CREATE action
documentation:

"While it’s possible to create a core for a non-existent collection, this
approach is not supported and not recommended. Always create a collection
using the *Collections API* before creating a core directly for it."

I then tried to create a collection using the following "Collections API"
command:

http://localhost:8983/solr/admin/collections?action=CREATE&name=newCollection&numShards=2&replicationFactor=1&wt=xml

This was the response:



<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">400</int>
    <int name="QTime">2</int>
  </lst>
  <lst name="error">
    <lst name="metadata">
      <str name="error-class">org.apache.solr.common.SolrException</str>
      <str name="root-error-class">org.apache.solr.common.SolrException</str>
    </lst>
    <str name="msg">Solr instance is not running in SolrCloud mode.</str>
    <int name="code">400</int>
  </lst>
</response>



I guess my main question is, do I need to be running in "SolrCloud mode" if
my intention is to use Solr Server to index a Drupal 7 website? We're
currently using opensolr.com which is working fine but we're trying to
avoid the monthly costs associated with their "Shared Solr Cloud" plan.

Thanks!
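
For readers following along: the response above is expected on a default
standalone install, because the Collections API only exists in SolrCloud
mode, and a single-node Drupal setup does not require SolrCloud. In
standalone mode you create a core rather than a collection, via bin/solr or
the CoreAdmin API, once the instance directory and its conf directory exist
as described below. A minimal SolrJ sketch of the CoreAdmin route; the core
name "drupal_core" and the pre-populated instance directory are assumptions
for illustration, not from the thread:

  import org.apache.solr.client.solrj.SolrClient;
  import org.apache.solr.client.solrj.impl.HttpSolrClient;
  import org.apache.solr.client.solrj.request.CoreAdminRequest;

  public class CreateCore {
      public static void main(String[] args) throws Exception {
          // Admin requests target the Solr root URL, not a core URL.
          SolrClient client =
              new HttpSolrClient.Builder("http://localhost:8983/solr").build();
          // CREATE requires an existing instanceDir containing
          // conf/solrconfig.xml and the schema it references.
          CoreAdminRequest.createCore("drupal_core", "drupal_core", client);
          client.close();
      }
  }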





On Wed, Feb 20, 2019 at 8:34 PM Shawn Heisey  wrote:

> On 2/20/2019 11:07 AM, Greg Robinson wrote:
> > Let's try this: https://imgur.com/a/z5OzbLW
> >
> > What I'm trying to do seems pretty straightforward:
> >
> > 1. Install Solr Server 7.4 on Linux (Completed)
> > 2. Connect my Drupal 7 site to the Solr Server and use it for indexing
> > content
> >
> > My understanding is that I must first create a core in order to connect
> my
> > drupal site to Solr Server. This is where I'm currently stuck.
>
> The assertion in your screenshot that the dataDir must exist is
> incorrect.  If current versions of Solr say this also, that is something
> we will need to change.  This is what actually happens:  If all the
> other requirements are met and the dataDir does not exist, it will be
> created automatically when the core starts, if the process has
> sufficient permissions.
>
> See the large "warning" box on the CREATE action documentation for
> details on what you need:
>
>
> https://lucene.apache.org/solr/guide/7_4/coreadmin-api.html#coreadmin-create
>
> The warning box is the one that has a red triangle to the left of it.
> The red triangle contains an exclamation point.
>
> The essence of what it says there is that the core's instance directory
> must exist, that directory must contain a "conf" directory, and all
> required config files must be in the conf directory.
>
> If you're running in SolrCloud mode, then you're using the wrong API.
>
> Thanks,
> Shawn
>


-- 
Greg Robinson
CEO - Mobile*Enhanced*
www.mobileenhanced.com
g...@mobileenhanced.com
303-598-1865


Re: Newbie question - Error loading an existing config file

2019-02-20 Thread Shawn Heisey

On 2/20/2019 11:07 AM, Greg Robinson wrote:

Let's try this: https://imgur.com/a/z5OzbLW

What I'm trying to do seems pretty straightforward:

1. Install Solr Server 7.4 on Linux (Completed)
2. Connect my Drupal 7 site to the Solr Server and use it for indexing
content

My understanding is that I must first create a core in order to connect my
drupal site to Solr Server. This is where I'm currently stuck.


The assertion in your screenshot that the dataDir must exist is 
incorrect.  If current versions of Solr say this also, that is something 
we will need to change.  This is what actually happens:  If all the 
other requirements are met and the dataDir does not exist, it will be 
created automatically when the core starts, if the process has 
sufficient permissions.
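
(That permissions caveat is likely relevant here: the directory listing
elsewhere in this thread shows the conf files owned by root. If Solr runs
as the "solr" user, something like "chown -R solr:solr
/home/solr/server/solr/new_solr_core", with the path taken from this
thread, lets the process read the configs and create the dataDir.)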


See the large "warning" box on the CREATE action documentation for 
details on what you need:


https://lucene.apache.org/solr/guide/7_4/coreadmin-api.html#coreadmin-create

The warning box is the one that has a red triangle to the left of it. 
The red triangle contains an exclamation point.


The essence of what it says there is that the core's instance directory 
must exist, that directory must contain a "conf" directory, and all 
required config files must be in the conf directory.
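
For reference, a layout satisfying those requirements, using the path from
this thread (file list abridged; core.properties is written by Solr itself
when the CREATE call succeeds):

  /home/solr/server/solr/new_solr_core/
      conf/
          solrconfig.xml
          managed-schema
          (any other files solrconfig.xml references)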


If you're running in SolrCloud mode, then you're using the wrong API.

Thanks,
Shawn


Re: Newbie question - Error loading an existing config file

2019-02-20 Thread Greg Robinson
Gotcha.

Let's try this: https://imgur.com/a/z5OzbLW

What I'm trying to do seems pretty straightforward:

1. Install Solr Server 7.4 on Linux (Completed)
2. Connect my Drupal 7 site to the Solr Server and use it for indexing
content

My understanding is that I must first create a core in order to connect my
drupal site to Solr Server. This is where I'm currently stuck.

Thanks for your help!

On Wed, Feb 20, 2019 at 10:43 AM Erick Erickson 
wrote:

> Attachments generally are stripped by the mail server.
>
> Are you trying to create a core as part of a SolrCloud _collection_? If
> so, this
> is an anti-pattern, use the collection API commands. Shot in the dark.
>
> Best,
> Erick
>
> > On Feb 19, 2019, at 3:05 PM, Greg Robinson 
> wrote:
> >
> > I used the front end admin (see attached)
> >
> > thanks
> >
> > On Tue, Feb 19, 2019 at 3:54 PM Erick Erickson 
> wrote:
> > Hmmm, that’s not very helpful…..
> >
> > Don’t quite know what to say. There should be something more helpful
> > in the logs.
> >
> > Hmmm, how did you create the core?
> >
> > Best,
> > Erick
> >
> >
> > > On Feb 19, 2019, at 1:29 PM, Greg Robinson 
> wrote:
> > >
> > > Thanks for your direction regarding the log.
> > >
> > > I was able to locate it and these two lines stood out:
> > >
> > > Caused by: org.apache.solr.common.SolrException: Could not load conf
> for
> > > core new_solr_core: Error loading solr config from
> > > /home/solr/server/solr/new_solr_core/conf/solrconfig.xml
> > >
> > > Caused by: org.apache.solr.common.SolrException: Error loading solr
> config
> > > from /home/solr/server/solr/new_solr_core/conf/solrconfig.xml
> > >
> > > which seems to point to the same issue.
> > >
> > > I also went ahead and updated permissions/owner to "solr" on all
> > > directories and files within "/home/solr/server/solr/new_solr_core".
> > >
> > > Still no luck. This is currently the same message that I'm getting on
> the
> > > admin front end:
> > >
> > > new_solr_core:
> > >
> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> > > Could not load conf for core new_solr_core: Error loading solr config
> from
> > > /home/solr/server/solr/new_solr_core/conf/solrconfig.xml.
> > >
> > > thanks!
> > >
> > >
> > >
> > > On Tue, Feb 19, 2019 at 1:55 PM Erick Erickson <
> erickerick...@gmail.com>
> > > wrote:
> > >
> > >> do a recursive search for "solr.log" under SOLR_HOME…….
> > >>
> > >> Best,
> > >> Erick
> > >>
> > >>> On Feb 19, 2019, at 8:08 AM, Greg Robinson 
> > >> wrote:
> > >>>
> > >>> Hi Erick,
> > >>>
> > >>> Thanks for the quick response.
> > >>>
> > >>> Here is what is currently contained within  the conf dir:
> > >>>
> > >>> drwxr-xr-x 2 root root  4096 Feb 18 17:51 lang
> > >>> -rw-r--r-- 1 root root 54513 Feb 18 17:51 managed-schema
> > >>> -rw-r--r-- 1 root root   329 Feb 18 17:51 params.json
> > >>> -rw-r--r-- 1 root root   894 Feb 18 17:51 protwords.txt
> > >>> -rwxrwxrwx 1 root root 55323 Feb 18 17:51 solrconfig.xml
> > >>> -rw-r--r-- 1 root root   795 Feb 18 17:51 stopwords.txt
> > >>> -rw-r--r-- 1 root root  1153 Feb 18 17:51 synonyms.txt
> > >>>
> > >>> As far as the log, where exactly might I find the specific log that
> would
> > >>> give more info in regards to this error?
> > >>>
> > >>> thanks again!
> > >>>
> > >>> On Tue, Feb 19, 2019 at 9:06 AM Erick Erickson <
> erickerick...@gmail.com>
> > >>> wrote:
> > >>>
> >  Are all the other files there in your conf dir? Solrconfig.xml
> > >> references
> >  things like managed-schema etc.
> > 
> >  Also, your log file might contain more clues...
> > 
> >  On Tue, Feb 19, 2019, 08:03 Greg Robinson  > >> wrote:
> > 
> > > Hello,
> > >
> > > We have Solr 7.4 up and running on a Linux machine.
> > >
> > > I'm just trying to add a new core so that I can eventually point a
> > >> Drupal
> > > site to the Solr Server for indexing.
> > >
> > > When attempting to add a core, I'm getting the following error:
> > >
> > > new_solr_core:
> > >
> > 
> > >>
> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> > > Could not load conf for core new_solr_core: Error loading solr
> config
> >  from
> > > /home/solr/server/solr/new_solr_core/conf/solrconfig.xml
> > >
> > > I've confirmed that
> > > /home/solr/server/solr/new_solr_core/conf/solrconfig.xml exists
> but I'm
> > > still getting the error.
> > >
> > > Any direction is appreciated.
> > >
> > > Thanks!
> > >
> > 
> > >>>
> > >>>
> > >>> --
> > >>> Greg Robinson
> > >>> CEO - Mobile*Enhanced*
> > >>> www.mobileenhanced.com
> > >>> g...@mobileenhanced.com
> > >>> 303-598-1865
> > >>
> > >>
> > >
> > > --
> > > Greg Robinson
> > > CEO - Mobile*Enhanced*
> > > www.mobileenhanced.com
> > > g...@mobileenhanced.com
> > > 303-598-1865
> >
> >
> >
> > --
> > Greg Robinson
> > CEO - MobileEnhanced
> > www.mobileenhanced.com
> > g...@mobileenhanced.com
> > 303-598-1865

Re: Newbie question - Error loading an existing config file

2019-02-20 Thread Erick Erickson
Attachments generally are stripped by the mail server.

Are you trying to create a core as part of a SolrCloud _collection_? If so, this
is an anti-pattern, use the collection API commands. Shot in the dark.

Best,
Erick

> On Feb 19, 2019, at 3:05 PM, Greg Robinson  wrote:
> 
> I used the front end admin (see attached)
> 
> thanks
> 
> On Tue, Feb 19, 2019 at 3:54 PM Erick Erickson  
> wrote:
> Hmmm, that’s not very helpful…..
> 
> Don’t quite know what to say. There should be something more helpful
> in the logs.
> 
> Hmmm, how did you create the core?
> 
> Best,
> Erick
> 
> 
> > On Feb 19, 2019, at 1:29 PM, Greg Robinson  wrote:
> > 
> > Thanks for your direction regarding the log.
> > 
> > I was able to locate it and these two lines stood out:
> > 
> > Caused by: org.apache.solr.common.SolrException: Could not load conf for
> > core new_solr_core: Error loading solr config from
> > /home/solr/server/solr/new_solr_core/conf/solrconfig.xml
> > 
> > Caused by: org.apache.solr.common.SolrException: Error loading solr config
> > from /home/solr/server/solr/new_solr_core/conf/solrconfig.xml
> > 
> > which seems to point to the same issue.
> > 
> > I also went ahead and updated permissions/owner to "solr" on all
> > directories and files within "/home/solr/server/solr/new_solr_core".
> > 
> > Still no luck. This is currently the same message that I'm getting on the
> > admin front end:
> > 
> > new_solr_core:
> > org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> > Could not load conf for core new_solr_core: Error loading solr config from
> > /home/solr/server/solr/new_solr_core/conf/solrconfig.xml.
> > 
> > thanks!
> > 
> > 
> > 
> > On Tue, Feb 19, 2019 at 1:55 PM Erick Erickson 
> > wrote:
> > 
> >> do a recursive search for "solr.log" under SOLR_HOME…….
> >> 
> >> Best,
> >> Erick
> >> 
> >>> On Feb 19, 2019, at 8:08 AM, Greg Robinson 
> >> wrote:
> >>> 
> >>> Hi Erick,
> >>> 
> >>> Thanks for the quick response.
> >>> 
> >>> Here is what is currently contained within  the conf dir:
> >>> 
> >>> drwxr-xr-x 2 root root  4096 Feb 18 17:51 lang
> >>> -rw-r--r-- 1 root root 54513 Feb 18 17:51 managed-schema
> >>> -rw-r--r-- 1 root root   329 Feb 18 17:51 params.json
> >>> -rw-r--r-- 1 root root   894 Feb 18 17:51 protwords.txt
> >>> -rwxrwxrwx 1 root root 55323 Feb 18 17:51 solrconfig.xml
> >>> -rw-r--r-- 1 root root   795 Feb 18 17:51 stopwords.txt
> >>> -rw-r--r-- 1 root root  1153 Feb 18 17:51 synonyms.txt
> >>> 
> >>> As far as the log, where exactly might I find the specific log that would
> >>> give more info in regards to this error?
> >>> 
> >>> thanks again!
> >>> 
> >>> On Tue, Feb 19, 2019 at 9:06 AM Erick Erickson 
> >>> wrote:
> >>> 
>  Are all the other files there in your conf dir? Solrconfig.xml
> >> references
>  things like managed-schema etc.
>  
>  Also, your log file might contain more clues...
>  
>  On Tue, Feb 19, 2019, 08:03 Greg Robinson  >> wrote:
>  
> > Hello,
> > 
> > We have Solr 7.4 up and running on a Linux machine.
> > 
> > I'm just trying to add a new core so that I can eventually point a
> >> Drupal
> > site to the Solr Server for indexing.
> > 
> > When attempting to add a core, I'm getting the following error:
> > 
> > new_solr_core:
> > 
>  
> >> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> > Could not load conf for core new_solr_core: Error loading solr config
>  from
> > /home/solr/server/solr/new_solr_core/conf/solrconfig.xml
> > 
> > I've confirmed that
> > /home/solr/server/solr/new_solr_core/conf/solrconfig.xml exists but I'm
> > still getting the error.
> > 
> > Any direction is appreciated.
> > 
> > Thanks!
> > 
>  
> >>> 
> >>> 
> >>> --
> >>> Greg Robinson
> >>> CEO - Mobile*Enhanced*
> >>> www.mobileenhanced.com
> >>> g...@mobileenhanced.com
> >>> 303-598-1865
> >> 
> >> 
> > 
> > -- 
> > Greg Robinson
> > CEO - Mobile*Enhanced*
> > www.mobileenhanced.com
> > g...@mobileenhanced.com
> > 303-598-1865
> 
> 
> 
> -- 
> Greg Robinson
> CEO - MobileEnhanced
> www.mobileenhanced.com
> g...@mobileenhanced.com
> 303-598-1865



Re: Newbie question - Error loading an existing config file

2019-02-19 Thread Greg Robinson
I used the front end admin (see attached)

thanks

On Tue, Feb 19, 2019 at 3:54 PM Erick Erickson 
wrote:

> Hmmm, that’s not very helpful…..
>
> Don’t quite know what to say. There should be something more helpful
> in the logs.
>
> Hmmm, how did you create the core?
>
> Best,
> Erick
>
>
> > On Feb 19, 2019, at 1:29 PM, Greg Robinson 
> wrote:
> >
> > Thanks for your direction regarding the log.
> >
> > I was able to locate it and these two lines stood out:
> >
> > Caused by: org.apache.solr.common.SolrException: Could not load conf for
> > core new_solr_core: Error loading solr config from
> > /home/solr/server/solr/new_solr_core/conf/solrconfig.xml
> >
> > Caused by: org.apache.solr.common.SolrException: Error loading solr
> config
> > from /home/solr/server/solr/new_solr_core/conf/solrconfig.xml
> >
> > which seems to point to the same issue.
> >
> > I also went ahead and updated permissions/owner to "solr" on all
> > directories and files within "/home/solr/server/solr/new_solr_core".
> >
> > Still no luck. This is currently the same message that I'm getting on the
> > admin front end:
> >
> > new_solr_core:
> >
> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> > Could not load conf for core new_solr_core: Error loading solr config
> from
> > /home/solr/server/solr/new_solr_core/conf/solrconfig.xml.
> >
> > thanks!
> >
> >
> >
> > On Tue, Feb 19, 2019 at 1:55 PM Erick Erickson 
> > wrote:
> >
> >> do a recursive search for "solr.log" under SOLR_HOME…….
> >>
> >> Best,
> >> Erick
> >>
> >>> On Feb 19, 2019, at 8:08 AM, Greg Robinson 
> >> wrote:
> >>>
> >>> Hi Erick,
> >>>
> >>> Thanks for the quick response.
> >>>
> >>> Here is what is currently contained within  the conf dir:
> >>>
> >>> drwxr-xr-x 2 root root  4096 Feb 18 17:51 lang
> >>> -rw-r--r-- 1 root root 54513 Feb 18 17:51 managed-schema
> >>> -rw-r--r-- 1 root root   329 Feb 18 17:51 params.json
> >>> -rw-r--r-- 1 root root   894 Feb 18 17:51 protwords.txt
> >>> -rwxrwxrwx 1 root root 55323 Feb 18 17:51 solrconfig.xml
> >>> -rw-r--r-- 1 root root   795 Feb 18 17:51 stopwords.txt
> >>> -rw-r--r-- 1 root root  1153 Feb 18 17:51 synonyms.txt
> >>>
> >>> As far as the log, where exactly might I find the specific log that
> would
> >>> give more info in regards to this error?
> >>>
> >>> thanks again!
> >>>
> >>> On Tue, Feb 19, 2019 at 9:06 AM Erick Erickson <
> erickerick...@gmail.com>
> >>> wrote:
> >>>
>  Are all the other files there in your conf dir? Solrconfig.xml
> >> references
>  things like managed-schema etc.
> 
>  Also, your log file might contain more clues...
> 
>  On Tue, Feb 19, 2019, 08:03 Greg Robinson  >> wrote:
> 
> > Hello,
> >
> > We have Solr 7.4 up and running on a Linux machine.
> >
> > I'm just trying to add a new core so that I can eventually point a
> >> Drupal
> > site to the Solr Server for indexing.
> >
> > When attempting to add a core, I'm getting the following error:
> >
> > new_solr_core:
> >
> 
> >>
> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> > Could not load conf for core new_solr_core: Error loading solr config
>  from
> > /home/solr/server/solr/new_solr_core/conf/solrconfig.xml
> >
> > I've confirmed that
> > /home/solr/server/solr/new_solr_core/conf/solrconfig.xml exists but
> I'm
> > still getting the error.
> >
> > Any direction is appreciated.
> >
> > Thanks!
> >
> 
> >>>
> >>>
> >>> --
> >>> Greg Robinson
> >>> CEO - Mobile*Enhanced*
> >>> www.mobileenhanced.com
> >>> g...@mobileenhanced.com
> >>> 303-598-1865
> >>
> >>
> >
> > --
> > Greg Robinson
> > CEO - Mobile*Enhanced*
> > www.mobileenhanced.com
> > g...@mobileenhanced.com
> > 303-598-1865
>
>

-- 
Greg Robinson
CEO - Mobile*Enhanced*
www.mobileenhanced.com
g...@mobileenhanced.com
303-598-1865


Re: Newbie question - Error loading an existing config file

2019-02-19 Thread Erick Erickson
Hmmm, that’s not very helpful…..

Don’t quite know what to say. There should be something more helpful
in the logs.

Hmmm, how did you create the core?

Best,
Erick


> On Feb 19, 2019, at 1:29 PM, Greg Robinson  wrote:
> 
> Thanks for your direction regarding the log.
> 
> I was able to locate it and these two lines stood out:
> 
> Caused by: org.apache.solr.common.SolrException: Could not load conf for
> core new_solr_core: Error loading solr config from
> /home/solr/server/solr/new_solr_core/conf/solrconfig.xml
> 
> Caused by: org.apache.solr.common.SolrException: Error loading solr config
> from /home/solr/server/solr/new_solr_core/conf/solrconfig.xml
> 
> which seems to point to the same issue.
> 
> I also went ahead and updated permissions/owner to "solr" on all
> directories and files within "/home/solr/server/solr/new_solr_core".
> 
> Still no luck. This is currently the same message that I'm getting on the
> admin front end:
> 
> new_solr_core:
> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> Could not load conf for core new_solr_core: Error loading solr config from
> /home/solr/server/solr/new_solr_core/conf/solrconfig.xml.
> 
> thanks!
> 
> 
> 
> On Tue, Feb 19, 2019 at 1:55 PM Erick Erickson 
> wrote:
> 
>> do a recursive search for "solr.log" under SOLR_HOME…….
>> 
>> Best,
>> Erick
>> 
>>> On Feb 19, 2019, at 8:08 AM, Greg Robinson 
>> wrote:
>>> 
>>> Hi Erick,
>>> 
>>> Thanks for the quick response.
>>> 
>>> Here is what is currently contained within  the conf dir:
>>> 
>>> drwxr-xr-x 2 root root  4096 Feb 18 17:51 lang
>>> -rw-r--r-- 1 root root 54513 Feb 18 17:51 managed-schema
>>> -rw-r--r-- 1 root root   329 Feb 18 17:51 params.json
>>> -rw-r--r-- 1 root root   894 Feb 18 17:51 protwords.txt
>>> -rwxrwxrwx 1 root root 55323 Feb 18 17:51 solrconfig.xml
>>> -rw-r--r-- 1 root root   795 Feb 18 17:51 stopwords.txt
>>> -rw-r--r-- 1 root root  1153 Feb 18 17:51 synonyms.txt
>>> 
>>> As far as the log, where exactly might I find the specific log that would
>>> give more info in regards to this error?
>>> 
>>> thanks again!
>>> 
>>> On Tue, Feb 19, 2019 at 9:06 AM Erick Erickson 
>>> wrote:
>>> 
 Are all the other files there in your conf dir? Solrconfig.xml
>> references
 things like managed-schema etc.
 
 Also, your log file might contain more clues...
 
 On Tue, Feb 19, 2019, 08:03 Greg Robinson > wrote:
 
> Hello,
> 
> We have Solr 7.4 up and running on a Linux machine.
> 
> I'm just trying to add a new core so that I can eventually point a
>> Drupal
> site to the Solr Server for indexing.
> 
> When attempting to add a core, I'm getting the following error:
> 
> new_solr_core:
> 
 
>> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> Could not load conf for core new_solr_core: Error loading solr config
 from
> /home/solr/server/solr/new_solr_core/conf/solrconfig.xml
> 
> I've confirmed that
> /home/solr/server/solr/new_solr_core/conf/solrconfig.xml exists but I'm
> still getting the error.
> 
> Any direction is appreciated.
> 
> Thanks!
> 
 
>>> 
>>> 
>>> --
>>> Greg Robinson
>>> CEO - Mobile*Enhanced*
>>> www.mobileenhanced.com
>>> g...@mobileenhanced.com
>>> 303-598-1865
>> 
>> 
> 
> -- 
> Greg Robinson
> CEO - Mobile*Enhanced*
> www.mobileenhanced.com
> g...@mobileenhanced.com
> 303-598-1865



Re: Newbie question - Error loading an existing config file

2019-02-19 Thread Greg Robinson
Thanks for your direction regarding the log.

I was able to locate it and these two lines stood out:

Caused by: org.apache.solr.common.SolrException: Could not load conf for
core new_solr_core: Error loading solr config from
/home/solr/server/solr/new_solr_core/conf/solrconfig.xml

Caused by: org.apache.solr.common.SolrException: Error loading solr config
from /home/solr/server/solr/new_solr_core/conf/solrconfig.xml

which seems to point to the same issue.

I also went ahead and updated permissions/owner to "solr" on all
directories and files within "/home/solr/server/solr/new_solr_core".

Still no luck. This is currently the same message that I'm getting on the
admin front end:

new_solr_core:
org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
Could not load conf for core new_solr_core: Error loading solr config from
/home/solr/server/solr/new_solr_core/conf/solrconfig.xml.

thanks!



On Tue, Feb 19, 2019 at 1:55 PM Erick Erickson 
wrote:

> do a recursive search for "solr.log" under SOLR_HOME…….
>
> Best,
> Erick
>
> > On Feb 19, 2019, at 8:08 AM, Greg Robinson 
> wrote:
> >
> > Hi Erick,
> >
> > Thanks for the quick response.
> >
> > Here is what is currently contained within  the conf dir:
> >
> > drwxr-xr-x 2 root root  4096 Feb 18 17:51 lang
> > -rw-r--r-- 1 root root 54513 Feb 18 17:51 managed-schema
> > -rw-r--r-- 1 root root   329 Feb 18 17:51 params.json
> > -rw-r--r-- 1 root root   894 Feb 18 17:51 protwords.txt
> > -rwxrwxrwx 1 root root 55323 Feb 18 17:51 solrconfig.xml
> > -rw-r--r-- 1 root root   795 Feb 18 17:51 stopwords.txt
> > -rw-r--r-- 1 root root  1153 Feb 18 17:51 synonyms.txt
> >
> > As far as the log, where exactly might I find the specific log that would
> > give more info in regards to this error?
> >
> > thanks again!
> >
> > On Tue, Feb 19, 2019 at 9:06 AM Erick Erickson 
> > wrote:
> >
> >> Are all the other files there in your conf dir? Solrconfig.xml
> references
> >> things like managed-schema etc.
> >>
> >> Also, your log file might contain more clues...
> >>
> >> On Tue, Feb 19, 2019, 08:03 Greg Robinson  wrote:
> >>
> >>> Hello,
> >>>
> >>> We have Solr 7.4 up and running on a Linux machine.
> >>>
> >>> I'm just trying to add a new core so that I can eventually point a
> Drupal
> >>> site to the Solr Server for indexing.
> >>>
> >>> When attempting to add a core, I'm getting the following error:
> >>>
> >>> new_solr_core:
> >>>
> >>
> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> >>> Could not load conf for core new_solr_core: Error loading solr config
> >> from
> >>> /home/solr/server/solr/new_solr_core/conf/solrconfig.xml
> >>>
> >>> I've confirmed that
> >>> /home/solr/server/solr/new_solr_core/conf/solrconfig.xml exists but I'm
> >>> still getting the error.
> >>>
> >>> Any direction is appreciated.
> >>>
> >>> Thanks!
> >>>
> >>
> >
> >
> > --
> > Greg Robinson
> > CEO - Mobile*Enhanced*
> > www.mobileenhanced.com
> > g...@mobileenhanced.com
> > 303-598-1865
>
>

-- 
Greg Robinson
CEO - Mobile*Enhanced*
www.mobileenhanced.com
g...@mobileenhanced.com
303-598-1865


Re: Newbie question - Error loading an existing config file

2019-02-19 Thread Erick Erickson
do a recursive search for "solr.log" under SOLR_HOME…….

Best,
Erick

> On Feb 19, 2019, at 8:08 AM, Greg Robinson  wrote:
> 
> Hi Erick,
> 
> Thanks for the quick response.
> 
> Here is what is currently contained within  the conf dir:
> 
> drwxr-xr-x 2 root root  4096 Feb 18 17:51 lang
> -rw-r--r-- 1 root root 54513 Feb 18 17:51 managed-schema
> -rw-r--r-- 1 root root   329 Feb 18 17:51 params.json
> -rw-r--r-- 1 root root   894 Feb 18 17:51 protwords.txt
> -rwxrwxrwx 1 root root 55323 Feb 18 17:51 solrconfig.xml
> -rw-r--r-- 1 root root   795 Feb 18 17:51 stopwords.txt
> -rw-r--r-- 1 root root  1153 Feb 18 17:51 synonyms.txt
> 
> As far as the log, where exactly might I find the specific log that would
> give more info in regards to this error?
> 
> thanks again!
> 
> On Tue, Feb 19, 2019 at 9:06 AM Erick Erickson 
> wrote:
> 
>> Are all the other files there in your conf dir? Solrconfig.xml references
>> things like managed-schema etc.
>> 
>> Also, your log file might contain more clues...
>> 
>> On Tue, Feb 19, 2019, 08:03 Greg Robinson  wrote:
>>> Hello,
>>> 
>>> We have Solr 7.4 up and running on a Linux machine.
>>> 
>>> I'm just trying to add a new core so that I can eventually point a Drupal
>>> site to the Solr Server for indexing.
>>> 
>>> When attempting to add a core, I'm getting the following error:
>>> 
>>> new_solr_core:
>>> 
>> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
>>> Could not load conf for core new_solr_core: Error loading solr config
>> from
>>> /home/solr/server/solr/new_solr_core/conf/solrconfig.xml
>>> 
>>> I've confirmed that
>>> /home/solr/server/solr/new_solr_core/conf/solrconfig.xml exists but I'm
>>> still getting the error.
>>> 
>>> Any direction is appreciated.
>>> 
>>> Thanks!
>>> 
>> 
> 
> 
> -- 
> Greg Robinson
> CEO - Mobile*Enhanced*
> www.mobileenhanced.com
> g...@mobileenhanced.com
> 303-598-1865



Re: Newbie question - Error loading an existing config file

2019-02-19 Thread Greg Robinson
Hi Erick,

Thanks for the quick response.

Here is what is currently contained within  the conf dir:

drwxr-xr-x 2 root root  4096 Feb 18 17:51 lang
-rw-r--r-- 1 root root 54513 Feb 18 17:51 managed-schema
-rw-r--r-- 1 root root   329 Feb 18 17:51 params.json
-rw-r--r-- 1 root root   894 Feb 18 17:51 protwords.txt
-rwxrwxrwx 1 root root 55323 Feb 18 17:51 solrconfig.xml
-rw-r--r-- 1 root root   795 Feb 18 17:51 stopwords.txt
-rw-r--r-- 1 root root  1153 Feb 18 17:51 synonyms.txt

As far as the log, where exactly might I find the specific log that would
give more info in regards to this error?

thanks again!

On Tue, Feb 19, 2019 at 9:06 AM Erick Erickson 
wrote:

> Are all the other files there in your conf dir? Solrconfig.xml references
> things like managed-schema etc.
>
> Also, your log file might contain more clues...
>
> On Tue, Feb 19, 2019, 08:03 Greg Robinson  wrote:
> > Hello,
> >
> > We have Solr 7.4 up and running on a Linux machine.
> >
> > I'm just trying to add a new core so that I can eventually point a Drupal
> > site to the Solr Server for indexing.
> >
> > When attempting to add a core, I'm getting the following error:
> >
> > new_solr_core:
> >
> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> > Could not load conf for core new_solr_core: Error loading solr config
> from
> > /home/solr/server/solr/new_solr_core/conf/solrconfig.xml
> >
> > I've confirmed that
> > /home/solr/server/solr/new_solr_core/conf/solrconfig.xml exists but I'm
> > still getting the error.
> >
> > Any direction is appreciated.
> >
> > Thanks!
> >
>


-- 
Greg Robinson
CEO - Mobile*Enhanced*
www.mobileenhanced.com
g...@mobileenhanced.com
303-598-1865


Re: Newbie question - Error loading an existing config file

2019-02-19 Thread Erick Erickson
Are all the other files there in your conf dir? Solrconfig.xml references
things like managed-schema etc.

Also, your log file might contain more clues...

On Tue, Feb 19, 2019, 08:03 Greg Robinson  wrote:

> Hello,
>
> We have Solr 7.4 up and running on a Linux machine.
>
> I'm just trying to add a new core so that I can eventually point a Drupal
> site to the Solr Server for indexing.
>
> When attempting to add a core, I'm getting the following error:
>
> new_solr_core:
> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> Could not load conf for core new_solr_core: Error loading solr config from
> /home/solr/server/solr/new_solr_core/conf/solrconfig.xml
>
> I've confirmed that
> /home/solr/server/solr/new_solr_core/conf/solrconfig.xml exists but I'm
> still getting the error.
>
> Any direction is appreciated.
>
> Thanks!
>


Newbie question - Error loading an existing config file

2019-02-19 Thread Greg Robinson
Hello,

We have Solr 7.4 up and running on a Linux machine.

I'm just trying to add a new core so that I can eventually point a Drupal
site to the Solr Server for indexing.

When attempting to add a core, I'm getting the following error:

new_solr_core:
org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
Could not load conf for core new_solr_core: Error loading solr config from
/home/solr/server/solr/new_solr_core/conf/solrconfig.xml

I've confirmed that
/home/solr/server/solr/new_solr_core/conf/solrconfig.xml exists but I'm
still getting the error.

Any direction is appreciated.

Thanks!


7.4.0 Newbie Question about bin/post and parsing/extracting html document parts into solr dynamic field.

2018-07-17 Thread Bell, Bob
Hi,

New to Solr, so forgive any missing info on my part.

1. I am trying to figure out how to get a specific HTML element of an HTML
document parsed into a Solr dynamic field. Is it possible? Let's say I have
some specific HTML or XML tags within the HTML document that I created a
dynamic field for; how do I specify that kind of parsing/extraction? I can
successfully index the documents, and search finds the documents based on
that indexing, but I would like to extract a specific set of data from the
page; it could be an embedded XML tag, structured data, hidden fields, or a
specific HTML tag. I am just not sure how to accomplish this. Thank you for
any ideas!!

Thanks,
Bob
Austin Public Library


Re: Newbie Question

2018-01-09 Thread Deepak Goel
*Hello*

*The code which worked for me:*

// Client is built on the core URL; the request handler is set on the query.
SolrClient client = new HttpSolrClient.Builder(
        "http://localhost:8983/solr/shakespeare").build();

SolrQuery query = new SolrQuery();
query.setRequestHandler("/select");
query.setQuery("text_entry:henry");
query.setFields("text_entry");

QueryResponse queryResponse = null;
try
{
    queryResponse = client.query(query);
}
catch (Exception e)
{
    e.printStackTrace();   // don't swallow the exception silently
}

// Guard against a failed query before dereferencing the response.
if (queryResponse != null && queryResponse.getResponse().size() > 0)
{
    System.out.println("Query Response: " + queryResponse);
    SolrDocumentList results = queryResponse.getResults();
    for (int i = 0; i < results.size(); ++i)
    {
        SolrDocument document = results.get(i);
        System.out.println("The result is: " + document);
        System.out.println("The Document field names are: "
                + document.getFieldNames());
    }
}

*The data:*

{"index":{"_index":"shakespeare","_id":0}}
{"type":"act","line_id":1,"play_name":"Henry IV",
"speech_number":"","line_number":"","speaker":"","text_entry":"ACT I"}
{"index":{"_index":"shakespeare","_id":1}}
{"type":"scene","line_id":2,"play_name":"Henry
IV","speech_number":"","line_number":"","speaker":"","text_entry":"SCENE I.
London. The palace."}

*Deepak*






Deepak
"Please stop cruelty to Animals, help by becoming a Vegan"
+91 73500 12833
deic...@gmail.com

Facebook: https://www.facebook.com/deicool
LinkedIn: www.linkedin.com/in/deicool

"Plant a Tree, Go Green"

On Tue, Jan 9, 2018 at 8:09 PM, Shawn Heisey  wrote:

> On 1/8/2018 10:23 AM, Deepak Goel wrote:
> > *I am trying to search for documents in my collection (Shakespeare). The
> > code is as follows:*
> >
> > SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr/shakespeare").build();
> >
> > SolrDocument doc = client.getById("2");
> > *However this does not return any document. What mistake am I making?*
>
> The getById method accesses the handler named "/get", normally defined
> with the RealTimeGetHandler class.  In recent Solr versions, the /get
> handler is defined implicitly and does not have to be configured, but in
> older versions (not sure which ones) you do need to have it in
> solrconfig.xml.
>
> I didn't expect your code to work because getById method returns a
> SolrDocumentList and you have SolrDocument, but apparently this actually
> does work.  I have tried code very similar to yours against the
> techproducts example in version 7.1, and it works perfectly.  I will
> share the exact code I tried and what results I got below.
>
> What code have you tried after the code you've shared?  How are you
> determining that no document is returned?  Are there any error messages
> logged by the client code or Solr?  If there are, can you share them?
>
> Do you have a document in the shakespeare index that has the value "2"
> in whatever field is the uniqueKey?  Does the schema have a uniqueKey
> defined?
>
> Can you find the entry in solr.log that logs the query and share that
> entire log entry?
>
> Code:
>
> public static void main(String[] args) throws SolrServerException,
> IOException
> {
>   String baseUrl = "http://localhost:8983/solr/techproducts";;
>   SolrClient client = new HttpSolrClient.Builder(baseUrl).build();
>   SolrDocument doc = client.getById("SP2514N");
>   System.out.println(doc.getFieldValue("name"));
> }
>
> Console log from that code:
>
> SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
> SLF4J: Defaulting to no-operation (NOP) logger implementation
> SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for
> further details.
> Samsung SpinPoint P120 SP2514N - hard drive - 250 GB - ATA-133
>
>
> Including the collection/core name in the URL is an older way of writing
> SolrJ code.  It works well, but multiple collections can be accessed
> through one client object if you change it and your SolrJ version is new
> enough.
>
> Thanks,
> Shawn
>
>


Re: Newbie Question

2018-01-09 Thread Shawn Heisey
On 1/8/2018 10:23 AM, Deepak Goel wrote:
> *I am trying to search for documents in my collection (Shakespeare). The
> code is as follows:*
>
> SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr/shakespeare").build();
>
> SolrDocument doc = client.getById("2");
> *However this does not return any document. What mistake am I making?*

The getById method accesses the handler named "/get", normally defined
with the RealTimeGetHandler class.  In recent Solr versions, the /get
handler is defined implicitly and does not have to be configured, but in
older versions (not sure which ones) you do need to have it in
solrconfig.xml.

I didn't expect your code to work because getById method returns a
SolrDocumentList and you have SolrDocument, but apparently this actually
does work.  I have tried code very similar to yours against the
techproducts example in version 7.1, and it works perfectly.  I will
share the exact code I tried and what results I got below.

What code have you tried after the code you've shared?  How are you
determining that no document is returned?  Are there any error messages
logged by the client code or Solr?  If there are, can you share them?

Do you have a document in the shakespeare index that has the value "2"
in whatever field is the uniqueKey?  Does the schema have a uniqueKey
defined?

Can you find the entry in solr.log that logs the query and share that
entire log entry?

Code:

public static void main(String[] args) throws SolrServerException,
IOException
{
  String baseUrl = "http://localhost:8983/solr/techproducts";
  SolrClient client = new HttpSolrClient.Builder(baseUrl).build();
  SolrDocument doc = client.getById("SP2514N");
  System.out.println(doc.getFieldValue("name"));
}

Console log from that code:

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for
further details.
Samsung SpinPoint P120 SP2514N - hard drive - 250 GB - ATA-133


Including the collection/core name in the URL is an older way of writing
SolrJ code.  It works well, but multiple collections can be accessed
through one client object if you change it and your SolrJ version is new
enough.
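
A sketch of that newer pattern: one client on the Solr root URL, with the
collection passed per request (the collection names here are illustrative):

  SolrClient client =
      new HttpSolrClient.Builder("http://localhost:8983/solr").build();
  SolrQuery query = new SolrQuery("*:*");
  // The same client can query any collection or core on the node.
  QueryResponse rsp1 = client.query("techproducts", query);
  QueryResponse rsp2 = client.query("shakespeare", query);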

Thanks,
Shawn



Re: Newbie Question

2018-01-08 Thread Deepak Goel
Got it . Thank You for your help



Deepak
"Please stop cruelty to Animals, help by becoming a Vegan"
+91 73500 12833
deic...@gmail.com

Facebook: https://www.facebook.com/deicool
LinkedIn: www.linkedin.com/in/deicool

"Plant a Tree, Go Green"

On Mon, Jan 8, 2018 at 11:48 PM, Deepak Goel  wrote:

> *Is this right?*
>
> SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr/shakespeare/select").build();
>
> SolrQuery query = new SolrQuery();
> query.setQuery("henry");
> query.setFields("text_entry");
> query.setStart(0);
>
> queryResponse = client.query(query);
>
> *This is still returning NULL*
>
>
>
> 
>
>
>
> Deepak
> "Please stop cruelty to Animals, help by becoming a Vegan"
> +91 73500 12833
> deic...@gmail.com
>
> Facebook: https://www.facebook.com/deicool
> LinkedIn: www.linkedin.com/in/deicool
>
> "Plant a Tree, Go Green"
>
> On Mon, Jan 8, 2018 at 10:55 PM, Alexandre Rafalovitch  > wrote:
>
>> I think you are missing /query handler endpoint in the URL. Plus actual
>> search parameters.
>>
>> You may try using the admin UI to build your queries first.
>>
>> Regards,
>> Alex
>>
>> On Jan 8, 2018 12:23 PM, "Deepak Goel"  wrote:
>>
>> > Hello
>> >
>> > *I am trying to search for documents in my collection (Shakespeare). The
>> > code is as follows:*
>> >
>> > SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr/shakespeare").build();
>> >
>> > SolrDocument doc = client.getById("2");
>> > *However this does not return any document. What mistake am I making?*
>> >
>> > Thank You
>> > Deepak
>> >
>> > Deepak
>> > "Please stop cruelty to Animals, help by becoming a Vegan"
>> > +91 73500 12833
>> > deic...@gmail.com
>> >
>> > Facebook: https://www.facebook.com/deicool
>> > LinkedIn: www.linkedin.com/in/deicool
>> >
>> > "Plant a Tree, Go Green"
>> >
>> >
>>
>
>


Re: Newbie Question

2018-01-08 Thread Deepak Goel
*Is this right?*

SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr/shakespeare/select").build();

SolrQuery query = new SolrQuery();
query.setQuery("henry");
query.setFields("text_entry");
query.setStart(0);

queryResponse = client.query(query);

*This is still returning NULL*






Deepak
"Please stop cruelty to Animals, help by becoming a Vegan"
+91 73500 12833
deic...@gmail.com

Facebook: https://www.facebook.com/deicool
LinkedIn: www.linkedin.com/in/deicool

"Plant a Tree, Go Green"

On Mon, Jan 8, 2018 at 10:55 PM, Alexandre Rafalovitch 
wrote:

> I think you are missing /query handler endpoint in the URL. Plus actual
> search parameters.
>
> You may try using the admin UI to build your queries first.
>
> Regards,
> Alex
>
> On Jan 8, 2018 12:23 PM, "Deepak Goel"  wrote:
>
> > Hello
> >
> > *I am trying to search for documents in my collection (Shakespeare). The
> > code is as follows:*
> >
> > SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr/shakespeare").build();
> >
> > SolrDocument doc = client.getById("2");
> > *However this does not return any document. What mistake am I making?*
> >
> > Thank You
> > Deepak
> >
> > Deepak
> > "Please stop cruelty to Animals, help by becoming a Vegan"
> > +91 73500 12833
> > deic...@gmail.com
> >
> > Facebook: https://www.facebook.com/deicool
> > LinkedIn: www.linkedin.com/in/deicool
> >
> > "Plant a Tree, Go Green"
> >
> >
>


Re: Newbie Question

2018-01-08 Thread Alexandre Rafalovitch
I think you are missing /query handler endpoint in the URL. Plus actual
search parameters.
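
In SolrJ terms, what that advice amounts to, matching the code Deepak
reported as working in the 2018-01-09 message above (the query string is
from this thread; the handler is the stock /select):

  SolrClient client = new HttpSolrClient.Builder(
          "http://localhost:8983/solr/shakespeare").build();
  SolrQuery query = new SolrQuery("text_entry:henry");
  query.setRequestHandler("/select");   // handler set on the query, not the URL
  QueryResponse response = client.query(query);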

You may try using the admin UI to build your queries first.

Regards,
Alex

On Jan 8, 2018 12:23 PM, "Deepak Goel"  wrote:

> Hello
>
> *I am trying to search for documents in my collection (Shakespeare). The
> code is as follows:*
>
> > SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr/shakespeare").build();
>
> SolrDocument doc = client.getById("2");
> *However this does not return any document. What mistake am I making?*
>
> Thank You
> Deepak
>
> Deepak
> "Please stop cruelty to Animals, help by becoming a Vegan"
> +91 73500 12833
> deic...@gmail.com
>
> Facebook: https://www.facebook.com/deicool
> LinkedIn: www.linkedin.com/in/deicool
>
> "Plant a Tree, Go Green"
>
>


Newbie Question

2018-01-08 Thread Deepak Goel
Hello

*I am trying to search for documents in my collection (Shakespeare). The
code is as follows:*

SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr/shakespeare").build();

SolrDocument doc = client.getById("2");
*However this does not return any document. What mistake am I making?*

Thank You
Deepak

Deepak
"Please stop cruelty to Animals, help by becoming a Vegan"
+91 73500 12833
deic...@gmail.com

Facebook: https://www.facebook.com/deicool
LinkedIn: www.linkedin.com/in/deicool

"Plant a Tree, Go Green"




Re: Newbie question about why represent timestamps as "float" values

2017-10-10 Thread Erick Erickson
Hold it. "date", "tdate", "pdate" _are_ primitive types. Under the
covers date/tdate are just a tlong type, newer Solrs have a "pdate"
which is a point numeric type. All that these types do is some parsing
up front so you can send human-readable data (and get it back). But
under the covers it's still a primitive.

And the idea of making it a float is _certainly_ worse than a long.
Last time I checked, floats were more expensive to work with than
longs. If this  was done for "efficiency" it wasn't done correctly.

It's vaguely possible that if this was done for efficiency, it was
done long ago when dates could be strings. Certainly there's a
performance argument there, but that hasn't been the case for a very
long time.


Erick
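
One concrete way to see the problem with float: a 32-bit float has a 24-bit
significand, so around current epoch-millisecond values (roughly 1.5e12)
neighboring representable floats are 131,072 ms apart, over two minutes. A
small sketch, using an arbitrary timestamp value:

  public class FloatEpochDemo {
      public static void main(String[] args) {
          long t1 = 1507561200000L;      // an epoch-millis value, Oct 2017
          long t2 = t1 + 1000;           // one second later
          // Spacing between adjacent floats at this magnitude: 131072.0
          System.out.println(Math.ulp((float) t1));
          // Both longs round to the same float, so ordering is lost.
          System.out.println((float) t1 == (float) t2);  // prints true
      }
  }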

On Tue, Oct 10, 2017 at 2:24 AM, Michael Kuhlmann  wrote:
> While you're generally right, in this case it might make sense to stick
> to a primitive type.
>
> I see "unixtime" as a technical information, probably from
> System.currentTimeMillis(). As long as it's not used as a "real world"
> date but only for sorting based on latest updates, or choosing which
> document is more recent, it's totally okay to index it as a long value.
>
> But definitely not as a float.
>
> -Michael
>
> On 10.10.2017 at 10:55, alessandro.benedetti wrote:
>> Some time ago there was a Solr installation which had the same problem, and the
>> author explained me that the choice was made for performance reasons.
>> Apparently he was sure that handling everything as primitive types would
>> give a boost to the Solr searching/faceting performance.
>> I never agreed ( and one of the reasons is that you need to transform back
>> from float to dates to actually render them in a readable format).
>>
>> Furthermore I tend to rely on standing on the shoulders of giants, so if a
>> community ( not just a single developer) spent time implementing a date type
>> ( with the different available implementations) to manage specifically date
>> information, I tend to trust them and believe that the best approach to
>> manage dates is to use that ad hoc date type ( in its variants, depending on
>> the use cases).
>>
>> As a plus, using the right data type gives you immense power in debugging
>> and understanding better your data.
>> For proper maintenance , it is another good reason to stick with standards.
>>
>>
>>
>> -
>> ---
>> Alessandro Benedetti
>> Search Consultant, R&D Software Engineer, Director
>> Sease Ltd. - www.sease.io
>> --
>> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>>
>


Re: Newbie question about why represent timestamps as "float" values

2017-10-10 Thread Michael Kuhlmann
While you're generally right, in this case it might make sense to stick
to a primitive type.

I see "unixtime" as a technical information, probably from
System.currentTimeMillis(). As long as it's not used as a "real world"
date but only for sorting based on latest updates, or choosing which
document is more recent, it's totally okay to index it as a long value.

But definitely not as a float.

-Michael

On 10.10.2017 at 10:55, alessandro.benedetti wrote:
> Some time ago there was a Solr installation which had the same problem, and the
> author explained me that the choice was made for performance reasons.
> Apparently he was sure that handling everything as primitive types would
> give a boost to the Solr searching/faceting performance.
> I never agreed ( and one of the reasons is that you need to transform back
> from float to dates to actually render them in a readable format).
> 
> Furthermore I tend to rely on standing on the shoulders of giants, so if a
> community ( not just a single developer) spent time implementing a date type
> ( with the different available implementations) to manage specifically date
> information, I tend to trust them and believe that the best approach to
> manage dates is to use that ad hoc date type ( in its variants, depending on
> the use cases).
> 
> As a plus, using the right data type gives you immense power in debugging
> and understanding better your data.
> For proper maintenance , it is another good reason to stick with standards.
> 
> 
> 
> -
> ---
> Alessandro Benedetti
> Search Consultant, R&D Software Engineer, Director
> Sease Ltd. - www.sease.io
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
> 



Re: Newbie question about why represent timestamps as "float" values

2017-10-10 Thread alessandro.benedetti
Some time ago there was a Solr installation which had the same problem, and the
author explained me that the choice was made for performance reasons.
Apparently he was sure that handling everything as primitive types would
give a boost to the Solr searching/faceting performance.
I never agreed ( and one of the reasons is that you need to transform back
from float to dates to actually render them in a readable format).

Furthermore I tend to rely on standing on the shoulders of giants, so if a
community ( not just a single developer) spent time implementing a date type
( with the different available implementations) to manage specifically date
information, I tend to trust them and believe that the best approach to
manage dates is to use that ad hoc date type ( in its variants, depending on
the use cases).

As a plus, using the right data type gives you immense power in debugging
and understanding better your data.
For proper maintenance , it is another good reason to stick with standards.



-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Newbie question about why represent timestamps as "float" values

2017-10-09 Thread Erick Erickson
What Hoss said, and in addition somewhere some
custom code has to be translating things back and
forth. For dates, Solr wants YYYY-MM-DDTHH:MM:SSZ
as a date string it knows how to deal with. That simply
couldn't parse as a float type so there's some custom
code that transforms dates into a float at ingest
time and converts from float to something recognizable
as a date on output.



On Mon, Oct 9, 2017 at 2:06 PM, Chris Hostetter
 wrote:
>
> : Here is my question.  In schema.xml, there is this field:
> :
> : <field name="unixdate" type="float" .../>
> :
> : Question:  why is this declared as a float datatype?  I'm just looking
> : for an explanation of what is there – any changes come later, after I
> : understand things better.
>
> You would have to ask the creator of that schema.xml file why they made
> that choice ... to the best of my knowledge, no sample/example schema that
> has ever shipped with any version of solr has ever included a "unixdate"
> field -- let alone one that suggested "float" would be a logically correct
> data type for storing that type of information.
>
>
> -Hoss
> http://www.lucidworks.com/


Re: Newbie question about why represent timestamps as "float" values

2017-10-09 Thread Chris Hostetter

: Here is my question.  In schema.xml, there is this field:
: 
: <field name="unixdate" type="float" .../>
: 
: Question:  why is this declared as a float datatype?  I'm just looking 
: for an explanation of what is there – any changes come later, after I 
: understand things better.

You would have to ask the creator of that schema.xml file why they made 
that choice ... to the best of my knowledge, no sample/example schema that 
has ever shipped with any version of solr has ever included a "unixdate" 
field -- let alone one that suggested "float" would be a logically correct 
data type for storing that type of information.


-Hoss
http://www.lucidworks.com/

Newbie question about why represent timestamps as "float" values

2017-10-09 Thread William Torcaso


I have inherited a working SOLR installation, that has not been upgraded since 
solr 4.0.  My task is to bring it forward (at least 6.x, maybe 7.x).  I am 
brand new to SOLR.

Here is my question.  In schema.xml, there is this field:

<field name="unixdate" type="float" .../>

Question:  why is this declared as a float datatype?  I'm just looking for an 
explanation of what is there – any changes come later, after I understand 
things better.

I understand about milliseconds from the epoch.  I would expect that the author 
would have used an integer or a long integer to hold such a millisecond count, 
or a DateField or TrieDateField.
I wonder if there is some Solr magic at work.

Thanks,

  ---  Bill


Re: newbie question re solr.PatternReplaceFilterFactory

2017-05-10 Thread Erick Erickson
First use PatternReplaceCharFilterFactory. The difference is that
PatternReplaceCharFilterFactory works on the entire input whereas
PatternReplaceFilterFactory works only on the tokens emitted by the
tokenizer. A concrete example using WhitespaceTokenizerFactory on the input
this [is some ] text
is that PatternReplaceFilterFactory would see 5 tokens, "this", "[is", "some",
"]", and "text". So it would be very hard to do what you want.

PatternReplaceCharFilterFactory will see the entire input as one
string and operate on it, _then_ send it through the tokenizer.

And also don't be fooled by the fact that the _stored_ data will still
contain the removed words. So when you get the doc back from solr
you'll see the original input, brackets and all. In the above example,
if you returned the field you'd still see

this [is some ] text

when the doc matched. This doc would be found when searching for
"this" or "text", but _not_ when searching for "is" or "some".

You want a charFilter pattern along the lines of the sketch below.
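
A field type sketch along those lines; the type name and the exact regex
are illustrative, so test the pattern against your own data:

  <fieldType name="text_nobrackets" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <!-- strip bracketed spans such as "[ignore this]" before tokenizing -->
      <charFilter class="solr.PatternReplaceCharFilterFactory"
                  pattern="\[[^\]]*\]" replacement=""/>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>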

Best,
Erick

On Wed, May 10, 2017 at 6:08 PM, Michael Tobias  wrote:
> I am sure this is very simple but I cannot get the pattern right.
>
> How can I use solr.PatternReplaceFilterFactory to remove all words in 
> brackets from being indexed?
>
> eg [ignore this]
>
> thanks
>
> Michael
>


newbie question re solr.PatternReplaceFilterFactory

2017-05-10 Thread Michael Tobias
I am sure this is very simple but I cannot get the pattern right.

How can I use solr.PatternReplaceFilterFactory to remove all words in brackets 
from being indexed?

eg [ignore this]

thanks

Michael



Re: newbie question

2016-09-07 Thread John Bickerstaff
the /solr is a "chroot" -- if used, everything for solr goes into
zookeeper's /solr "directory"

It isn't required, but is very useful for keeping things separated.  I use
it to handle different Solr versions for upgrading (/solr5_4 or /solr6_2)

If not used, everything you put into Zookeeper (some microservices, Kafka,
etc) will all end up at the "root" and it'll be a cluster...
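
For reference, the usual solrctl sequence is to upload the instance
directory first and then create the collection pointing at that config by
name; a sketch using the hostnames from this thread (solrctl is Cloudera's
tool, so double-check the flags against its own help output):

  solrctl --zk dayrhegapd016.enterprisenet.org:2181,host2:2181,host3:2181/solr \
      instancedir --create config1 $HOME/config1/
  solrctl --zk dayrhegapd016.enterprisenet.org:2181,host2:2181,host3:2181/solr \
      collection --create Catalog_search_index -s 10 -c config1

Note the /solr chroot appears once, at the end of the ensemble string.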

On Wed, Sep 7, 2016 at 1:26 PM, Erick Erickson 
wrote:

> Well, first off the ZK ensemble string is usually specified as
> dayrhegapd016.enterprisenet.org:2181,host2:2181,host3:2181/solr
> (note that the /solr is only at the end, not every node).
>
> Second, I always get confused whether the /solr is necessary or not.
>
> Again, though, the Cloudera user's list is probably a place to get better
> answers.
>
> Best,
> Erick
>
> On Wed, Sep 7, 2016 at 12:15 PM, Darshan Pandya 
> wrote:
> > Thank Erik,
> > So seems like the problem is that when I upload the configs to zookeeper
> > and then inspect zookeeper-client and ls /solr/configs it is showing to
> be
> > empty.
> >
> > I executed the following command to upload the config
> >
> > solrctl --zk
> > dayrhegapd016.enterprisenet.org:2181/solr,host2:2181/solr,
> host3:2181/solr
> > --solr host4:8983/solr/ instancedir --create config1 $HOME/config1/
> >
> >
> > Zookeeper-client ls /solr/configs/
> >
> > does not show this configuration present there.
> >
> >
> >
> > On Wed, Sep 7, 2016 at 2:02 PM, Erick Erickson 
> > wrote:
> >
> >> I'm a bit rusty on solrctl (and you might get faster/more up-to-date
> >> responses on the Cloudera lists). But to create a collection, you
> >> first need to have uploaded the configs to Zookeeper, things like
> >> schema.xml, solrconfig.xml etc. I forget
> >> what the solrctl command is, but something like "upconfig" IIRC.
> >>
> >> Once that's done, you either
> >> 1> specify the collection name exactly the same as the config name
> >> you uploaded
> >> or
> >> 2> use one of the other parameters to tell collectionX to use configsetY
> >> with the collection create command. solrctl help should show you all
> these
> >> options...
> >>
> >> Best,
> >> Erick
> >>
> >> On Wed, Sep 7, 2016 at 11:43 AM, Darshan Pandya <
> darshanpan...@gmail.com>
> >> wrote:
> >> > Gonzalo,
> >> > Thanks for responding,
> >> > executed the parameters you suggested, it still shows me the same
> error.
> >> > Sincerely,
> >> > darshan
> >> >
> >> > On Wed, Sep 7, 2016 at 1:13 PM, Gonzalo Rodriguez <
> >> > grodrig...@searchtechnologies.com> wrote:
> >> >
> >> >> Hi Darshan,
> >> >>
> >> >> It looks like you are listing the instanceDir's name twice in the
> create
> >> >> collection command, it should be
> >> >>
> >> >> $ solrctl --zk  host:2181/solr --solr host:8983/solr/ collection
> >> --create
> >> >> Catalog_search_index -s 10 -c Catalog_search_index
> >> >>
> >> >> Without the extra ". Catalog_search_index" at the end. Also, because
> >> your
> >> >> new collection's name is the same as the instanceDir's, you could
> just
> >> omit
> >> >> that parameter and it should work ok.
> >> >>
> >> >> Try that and see if it works.
> >> >>
> >> >> Good luck,
> >> >> Gonzalo
> >> >>
> >> >> -Original Message-
> >> >> From: Darshan Pandya [mailto:darshanpan...@gmail.com]
> >> >> Sent: Wednesday, September 7, 2016 12:02 PM
> >> >> To: solr-user@lucene.apache.org
> >> >> Subject: newbie question
> >> >>
> >> >> hello,
> >> >>
> >> >> I am using solr cloud with cloudera. When I try to create a
> collection,
> >> it
> >> >> fails with the following error.
> >> >> Any hints / answers will be helpful.
> >> >>
> >> >>
> >> >> $ solrctl --zk  host:2181/solr instancedir --list
> >> >>
> >> >> Catalog_search_index
> >> >>
> >> >> $ solrctl --zk  host:2181/solr --solr host:8983/solr/ collection
> >> --create
> >> >> Catalog_search_index -s 10 -c Catalog_search_index.Catalog_
> search_index
> >> >>
> >> >> Error: A call to SolrCloud WEB APIs failed: HTTP/1.1 200 OK
> >> >>
> >> >> Server: Apache-Coyote/1.1
> >> >>
> >> >> Content-Type: application/xml;charset=UTF-8
> >> >>
> >> >> Transfer-Encoding: chunked
> >> >>
> >> >> Date: Wed, 07 Sep 2016 17:58:13 GMT
> >> >>
> >> >> <response>
> >> >> <lst name="responseHeader">
> >> >> <int name="status">0</int>
> >> >> <int name="QTime">1165</int>
> >> >> </lst>
> >> >> <lst name="failure">
> >> >> <str>
> >> >> org.apache.solr.client.solrj.impl.HttpSolrServer$
> >> RemoteSolrException:Error
> >> >> CREATEing SolrCore 'Catalog_search_index_shard1_replica1': Unable to
> >> >> create core [Catalog_search_index_shard1_replica1] Caused by:
> Specified
> >> >> config does not exist in ZooKeeper:Catalog_search_
> >> >> index.dataCatalog_search_index
> >> >>
> >> >>
> >> >> --
> >> >> Sincerely,
> >> >> Darshan
> >> >>
> >> >
> >> >
> >> >
> >> > --
> >> > Sincerely,
> >> > Darshan
> >>
> >
> >
> >
> > --
> > Sincerely,
> > Darshan
>


Re: newbie question

2016-09-07 Thread Darshan Pandya
Many Thanks! I will move this to a cloudera list.

On Wed, Sep 7, 2016 at 2:26 PM, Erick Erickson 
wrote:

> Well, first off the ZK ensemble string is usually specified as
> dayrhegapd016.enterprisenet.org:2181,host2:2181,host3:2181/solr
> (note that the /solr is only at the end, not every node).
>
> Second, I always get confused whether the /solr is necessary or not.
>
> Again, though, the Cloudera user's list is probably a place to get better
> answers.
>
> Best,
> Erick
>
> On Wed, Sep 7, 2016 at 12:15 PM, Darshan Pandya 
> wrote:
> > Thanks Erick,
> > So it seems like the problem is that when I upload the configs to zookeeper
> > and then inspect zookeeper-client and ls /solr/configs it is showing to
> be
> > empty.
> >
> > I executed the following command to upload the config
> >
> > solrctl --zk
> > dayrhegapd016.enterprisenet.org:2181/solr,host2:2181/solr,
> host3:2181/solr
> > --solr host4:8983/solr/ instancedir --create config1 $HOME/config1/
> >
> >
> > Zookeeper-client ls /solr/configs/
> >
> > does not show this configuration present there.
> >
> >
> >
> > On Wed, Sep 7, 2016 at 2:02 PM, Erick Erickson 
> > wrote:
> >
> >> I'm a bit rusty on solrctl (and you might get faster/more up-to-date
> >> responses on the Cloudera lists). But to create a collection, you
> >> first need to have uploaded the configs to Zookeeper, things like
> >> schema.xml, solrconfig.xml etc. I forget
> >> what the solrctl command is, but something like "upconfig" IIRC.
> >>
> >> Once that's done, you either
> >> 1> specify the collection name exactly the same as the config name
> >> you uploaded
> >> or
> >> 2> use one of the other parameters to tell collectionX to use configsetY
> >> with the collection create command. solrctl help should show you all
> these
> >> options...
> >>
> >> Best,
> >> Erick
> >>
> >> On Wed, Sep 7, 2016 at 11:43 AM, Darshan Pandya <
> darshanpan...@gmail.com>
> >> wrote:
> >> > Gonzalo,
> >> > Thanks for responding,
> >> > executed the parameters you suggested, it still shows me the same
> error.
> >> > Sincerely,
> >> > darshan
> >> >
> >> > On Wed, Sep 7, 2016 at 1:13 PM, Gonzalo Rodriguez <
> >> > grodrig...@searchtechnologies.com> wrote:
> >> >
> >> >> Hi Darshan,
> >> >>
> >> >> It looks like you are listing the instanceDir's name twice in the
> create
> >> >> collection command, it should be
> >> >>
> >> >> $ solrctl --zk  host:2181/solr --solr host:8983/solr/ collection
> >> --create
> >> >> Catalog_search_index -s 10 -c Catalog_search_index
> >> >>
> >> >> Without the extra ". Catalog_search_index" at the end. Also, because
> >> your
> >> >> new collection's name is the same as the instanceDir's, you could
> just
> >> omit
> >> >> that parameter and it should work ok.
> >> >>
> >> >> Try that and see if it works.
> >> >>
> >> >> Good luck,
> >> >> Gonzalo
> >> >>
> >> >> -Original Message-
> >> >> From: Darshan Pandya [mailto:darshanpan...@gmail.com]
> >> >> Sent: Wednesday, September 7, 2016 12:02 PM
> >> >> To: solr-user@lucene.apache.org
> >> >> Subject: newbie question
> >> >>
> >> >> hello,
> >> >>
> >> >> I am using solr cloud with cloudera. When I try to create a
> collection,
> >> it
> >> >> fails with the following error.
> >> >> Any hints / answers will be helpful.
> >> >>
> >> >>
> >> >> $ solrctl --zk  host:2181/solr instancedir --list
> >> >>
> >> >> Catalog_search_index
> >> >>
> >> >> $ solrctl --zk  host:2181/solr --solr host:8983/solr/ collection
> >> --create
> >> >> Catalog_search_index -s 10 -c Catalog_search_index.Catalog_
> search_index
> >> >>
> >> >> Error: A call to SolrCloud WEB APIs failed: HTTP/1.1 200 OK
> >> >>
> >> >> Server: Apache-Coyote/1.1
> >> >>
> >> >> Content-Type: application/xml;charset=UTF-8
> >> >>
> >> >> Transfer-Encoding: chunked
> >> >>
> >> >> Date: Wed, 07 Sep 2016 17:58:13 GMT
> >> >>
> >> >> <response>
> >> >> <lst name="responseHeader">
> >> >> <int name="status">0</int>
> >> >> <int name="QTime">1165</int>
> >> >> </lst>
> >> >> <lst name="failure">
> >> >> <str>
> >> >> org.apache.solr.client.solrj.impl.HttpSolrServer$
> >> RemoteSolrException:Error
> >> >> CREATEing SolrCore 'Catalog_search_index_shard1_replica1': Unable to
> >> >> create core [Catalog_search_index_shard1_replica1] Caused by:
> Specified
> >> >> config does not exist in ZooKeeper:Catalog_search_
> >> >> index.dataCatalog_search_index
> >> >>
> >> >>
> >> >> --
> >> >> Sincerely,
> >> >> Darshan
> >> >>
> >> >
> >> >
> >> >
> >> > --
> >> > Sincerely,
> >> > Darshan
> >>
> >
> >
> >
> > --
> > Sincerely,
> > Darshan
>



-- 
Sincerely,
Darshan


Re: newbie question

2016-09-07 Thread Erick Erickson
Well, first off the ZK ensemble string is usually specified as
dayrhegapd016.enterprisenet.org:2181,host2:2181,host3:2181/solr
(note that the /solr is only at the end, not every node).

Second, I always get confused whether the /solr is necessary or not.

Again, though, the Cloudera user's list is probably a place to get better
answers.

Best,
Erick

On Wed, Sep 7, 2016 at 12:15 PM, Darshan Pandya  wrote:
> Thanks Erick,
> So it seems like the problem is that when I upload the configs to zookeeper
> and then inspect zookeeper-client and ls /solr/configs it is showing to be
> empty.
>
> I executed the following command to upload the config
>
> solrctl --zk
> dayrhegapd016.enterprisenet.org:2181/solr,host2:2181/solr,host3:2181/solr
> --solr host4:8983/solr/ instancedir --create config1 $HOME/config1/
>
>
> Zookeeper-client ls /solr/configs/
>
> does not show this configuration present there.
>
>
>
> On Wed, Sep 7, 2016 at 2:02 PM, Erick Erickson 
> wrote:
>
>> I'm a bit rusty on solrctl (and you might get faster/more up-to-date
>> responses on the Cloudera lists). But to create a collection, you
>> first need to have uploaded the configs to Zookeeper, things like
>> schema.xml, solrconfig.xml etc. I forget
>> what the solrctl command is, but something like "upconfig" IIRC.
>>
>> Once that's done, you either
>> 1> specify the collection name exactly the same as the config name
>> you uploaded
>> or
>> 2> use one of the other parameters to tell collectionX to use configsetY
>> with the collection create command. solrctl help should show you all these
>> options...
>>
>> Best,
>> Erick
>>
>> On Wed, Sep 7, 2016 at 11:43 AM, Darshan Pandya 
>> wrote:
>> > Gonzalo,
>> > Thanks for responding,
>> > executed the parameters you suggested, it still shows me the same error.
>> > Sincerely,
>> > darshan
>> >
>> > On Wed, Sep 7, 2016 at 1:13 PM, Gonzalo Rodriguez <
>> > grodrig...@searchtechnologies.com> wrote:
>> >
>> >> Hi Darshan,
>> >>
>> >> It looks like you are listing the instanceDir's name twice in the create
>> >> collection command, it should be
>> >>
>> >> $ solrctl --zk  host:2181/solr --solr host:8983/solr/ collection
>> --create
>> >> Catalog_search_index -s 10 -c Catalog_search_index
>> >>
>> >> Without the extra ". Catalog_search_index" at the end. Also, because
>> your
>> >> new collection's name is the same as the instanceDir's, you could just
>> omit
>> >> that parameter and it should work ok.
>> >>
>> >> Try that and see if it works.
>> >>
>> >> Good luck,
>> >> Gonzalo
>> >>
>> >> -Original Message-
>> >> From: Darshan Pandya [mailto:darshanpan...@gmail.com]
>> >> Sent: Wednesday, September 7, 2016 12:02 PM
>> >> To: solr-user@lucene.apache.org
>> >> Subject: newbie question
>> >>
>> >> hello,
>> >>
>> >> I am using solr cloud with cloudera. When I try to create a collection,
>> it
>> >> fails with the following error.
>> >> Any hints / answers will be helpful.
>> >>
>> >>
>> >> $ solrctl --zk  host:2181/solr instancedir --list
>> >>
>> >> Catalog_search_index
>> >>
>> >> $ solrctl --zk  host:2181/solr --solr host:8983/solr/ collection
>> --create
>> >> Catalog_search_index -s 10 -c Catalog_search_index.Catalog_search_index
>> >>
>> >> Error: A call to SolrCloud WEB APIs failed: HTTP/1.1 200 OK
>> >>
>> >> Server: Apache-Coyote/1.1
>> >>
>> >> Content-Type: application/xml;charset=UTF-8
>> >>
>> >> Transfer-Encoding: chunked
>> >>
>> >> Date: Wed, 07 Sep 2016 17:58:13 GMT
>> >>
>> >> <response>
>> >> <lst name="responseHeader">
>> >> <int name="status">0</int>
>> >> <int name="QTime">1165</int>
>> >> </lst>
>> >> <lst name="failure">
>> >> <str>
>> >> org.apache.solr.client.solrj.impl.HttpSolrServer$
>> RemoteSolrException:Error
>> >> CREATEing SolrCore 'Catalog_search_index_shard1_replica1': Unable to
>> >> create core [Catalog_search_index_shard1_replica1] Caused by: Specified
>> >> config does not exist in ZooKeeper:Catalog_search_
>> >> index.dataCatalog_search_index
>> >>
>> >>
>> >> --
>> >> Sincerely,
>> >> Darshan
>> >>
>> >
>> >
>> >
>> > --
>> > Sincerely,
>> > Darshan
>>
>
>
>
> --
> Sincerely,
> Darshan


Re: newbie question

2016-09-07 Thread Darshan Pandya
Thanks Erick,
So it seems like the problem is that when I upload the configs to zookeeper
and then inspect zookeeper-client and ls /solr/configs it is showing to be
empty.

I executed the following command to upload the config

solrctl --zk
dayrhegapd016.enterprisenet.org:2181/solr,host2:2181/solr,host3:2181/solr
--solr host4:8983/solr/ instancedir --create config1 $HOME/config1/


Zookeeper-client ls /solr/configs/

does not show this configuration present there.



On Wed, Sep 7, 2016 at 2:02 PM, Erick Erickson 
wrote:

> I'm a bit rusty on solrctl (and you might get faster/more up-to-date
> responses on the Cloudera lists). But to create a collection, you
> first need to have uploaded the configs to Zookeeper, things like
> schema.xml, solrconfig.xml etc. I forget
> what the solrctl command is, but something like "upconfig" IIRC.
>
> Once that's done, you either
> 1> specify the collection name exactly the same as the config name
> you uploaded
> or
> 2> use one of the other parameters to tell collectionX to use configsetY
> with the collection create command. solrctl help should show you all these
> options...
>
> Best,
> Erick
>
> On Wed, Sep 7, 2016 at 11:43 AM, Darshan Pandya 
> wrote:
> > Gonzalo,
> > Thanks for responding,
> > executed the parameters you suggested, it still shows me the same error.
> > Sincerely,
> > darshan
> >
> > On Wed, Sep 7, 2016 at 1:13 PM, Gonzalo Rodriguez <
> > grodrig...@searchtechnologies.com> wrote:
> >
> >> Hi Darshan,
> >>
> >> It looks like you are listing the instanceDir's name twice in the create
> >> collection command, it should be
> >>
> >> $ solrctl --zk  host:2181/solr --solr host:8983/solr/ collection
> --create
> >> Catalog_search_index -s 10 -c Catalog_search_index
> >>
> >> Without the extra ". Catalog_search_index" at the end. Also, because
> your
> >> new collection's name is the same as the instanceDir's, you could just
> omit
> >> that parameter and it should work ok.
> >>
> >> Try that and see if it works.
> >>
> >> Good luck,
> >> Gonzalo
> >>
> >> -Original Message-
> >> From: Darshan Pandya [mailto:darshanpan...@gmail.com]
> >> Sent: Wednesday, September 7, 2016 12:02 PM
> >> To: solr-user@lucene.apache.org
> >> Subject: newbie question
> >>
> >> hello,
> >>
> >> I am using solr cloud with cloudera. When I try to create a collection,
> it
> >> fails with the following error.
> >> Any hints / answers will be helpful.
> >>
> >>
> >> $ solrctl --zk  host:2181/solr instancedir --list
> >>
> >> Catalog_search_index
> >>
> >> $ solrctl --zk  host:2181/solr --solr host:8983/solr/ collection
> --create
> >> Catalog_search_index -s 10 -c Catalog_search_index.Catalog_search_index
> >>
> >> Error: A call to SolrCloud WEB APIs failed: HTTP/1.1 200 OK
> >>
> >> Server: Apache-Coyote/1.1
> >>
> >> Content-Type: application/xml;charset=UTF-8
> >>
> >> Transfer-Encoding: chunked
> >>
> >> Date: Wed, 07 Sep 2016 17:58:13 GMT
> >>
> >> <response>
> >> <lst name="responseHeader">
> >> <int name="status">0</int>
> >> <int name="QTime">1165</int>
> >> </lst>
> >> <lst name="failure">
> >> <str>
> >> org.apache.solr.client.solrj.impl.HttpSolrServer$
> RemoteSolrException:Error
> >> CREATEing SolrCore 'Catalog_search_index_shard1_replica1': Unable to
> >> create core [Catalog_search_index_shard1_replica1] Caused by: Specified
> >> config does not exist in ZooKeeper:Catalog_search_
> >> index.dataCatalog_search_index
> >>
> >>
> >> --
> >> Sincerely,
> >> Darshan
> >>
> >
> >
> >
> > --
> > Sincerely,
> > Darshan
>



-- 
Sincerely,
Darshan


Re: newbie question

2016-09-07 Thread Erick Erickson
I'm a bit rusty on solrctl (and you might get faster/more up-to-date
responses on the Cloudera lists). But to create a collection, you
first need to have uploaded the configs to Zookeeper, things like
schema.xml, solrconfig.xml etc. I forget
what the solrctl command is, but something like "upconfig" IIRC.

Once that's done, you either
1> specify the collection name exactly the same as the config name
you uploaded
or
2> use one of the other parameters to tell collectionX to use configsetY
with the collection create command. solrctl help should show you all these
options...
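
Based on the commands you pasted, the shape should be roughly (config
and collection names made up here):

solrctl --zk host1:2181,host2:2181,host3:2181/solr instancedir --create myconf $HOME/myconf
solrctl --zk host1:2181,host2:2181,host3:2181/solr collection --create mycollection -c myconf -s 2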

Best,
Erick

On Wed, Sep 7, 2016 at 11:43 AM, Darshan Pandya  wrote:
> Gonzalo,
> Thanks for responding,
> executed the parameters you suggested, it still shows me the same error.
> Sincerely,
> darshan
>
> On Wed, Sep 7, 2016 at 1:13 PM, Gonzalo Rodriguez <
> grodrig...@searchtechnologies.com> wrote:
>
>> Hi Darshan,
>>
>> It looks like you are listing the instanceDir's name twice in the create
>> collection command, it should be
>>
>> $ solrctl --zk  host:2181/solr --solr host:8983/solr/ collection --create
>> Catalog_search_index -s 10 -c Catalog_search_index
>>
>> Without the extra ". Catalog_search_index" at the end. Also, because your
>> new collection's name is the same as the instanceDir's, you could just omit
>> that parameter and it should work ok.
>>
>> Try that and see if it works.
>>
>> Good luck,
>> Gonzalo
>>
>> -Original Message-
>> From: Darshan Pandya [mailto:darshanpan...@gmail.com]
>> Sent: Wednesday, September 7, 2016 12:02 PM
>> To: solr-user@lucene.apache.org
>> Subject: newbie question
>>
>> hello,
>>
>> I am using solr cloud with cloudera. When I try to create a collection, it
>> fails with the following error.
>> Any hints / answers will be helpful.
>>
>>
>> $ solrctl --zk  host:2181/solr instancedir --list
>>
>> Catalog_search_index
>>
>> $ solrctl --zk  host:2181/solr --solr host:8983/solr/ collection --create
>> Catalog_search_index -s 10 -c Catalog_search_index.Catalog_search_index
>>
>> Error: A call to SolrCloud WEB APIs failed: HTTP/1.1 200 OK
>>
>> Server: Apache-Coyote/1.1
>>
>> Content-Type: application/xml;charset=UTF-8
>>
>> Transfer-Encoding: chunked
>>
>> Date: Wed, 07 Sep 2016 17:58:13 GMT
>>
>> <response>
>> <lst name="responseHeader">
>> <int name="status">0</int>
>> <int name="QTime">1165</int>
>> </lst>
>> <lst name="failure">
>> <str>
>> org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error
>> CREATEing SolrCore 'Catalog_search_index_shard1_replica1': Unable to
>> create core [Catalog_search_index_shard1_replica1] Caused by: Specified
>> config does not exist in ZooKeeper:Catalog_search_
>> index.dataCatalog_search_index
>>
>>
>> --
>> Sincerely,
>> Darshan
>>
>
>
>
> --
> Sincerely,
> Darshan


Re: newbie question

2016-09-07 Thread Darshan Pandya
Gonzalo,
Thanks for responding,
executed the parameters you suggested, it still shows me the same error.
Sincerely,
darshan

On Wed, Sep 7, 2016 at 1:13 PM, Gonzalo Rodriguez <
grodrig...@searchtechnologies.com> wrote:

> Hi Darshan,
>
> It looks like you are listing the instanceDir's name twice in the create
> collection command, it should be
>
> $ solrctl --zk  host:2181/solr --solr host:8983/solr/ collection --create
> Catalog_search_index -s 10 -c Catalog_search_index
>
> Without the extra ". Catalog_search_index" at the end. Also, because your
> new collection's name is the same as the instanceDir's, you could just omit
> that parameter and it should work ok.
>
> Try that and see if it works.
>
> Good luck,
> Gonzalo
>
> -Original Message-
> From: Darshan Pandya [mailto:darshanpan...@gmail.com]
> Sent: Wednesday, September 7, 2016 12:02 PM
> To: solr-user@lucene.apache.org
> Subject: newbie question
>
> hello,
>
> I am using solr cloud with cloudera. When I try to create a collection, it
> fails with the following error.
> Any hints / answers will be helpful.
>
>
> $ solrctl --zk  host:2181/solr instancedir --list
>
> Catalog_search_index
>
> $ solrctl --zk  host:2181/solr --solr host:8983/solr/ collection --create
> Catalog_search_index -s 10 -c Catalog_search_index.Catalog_search_index
>
> Error: A call to SolrCloud WEB APIs failed: HTTP/1.1 200 OK
>
> Server: Apache-Coyote/1.1
>
> Content-Type: application/xml;charset=UTF-8
>
> Transfer-Encoding: chunked
>
> Date: Wed, 07 Sep 2016 17:58:13 GMT
>
> <response>
> <lst name="responseHeader">
> <int name="status">0</int>
> <int name="QTime">1165</int>
> </lst>
> <lst name="failure">
> <str>
> org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error
> CREATEing SolrCore 'Catalog_search_index_shard1_replica1': Unable to
> create core [Catalog_search_index_shard1_replica1] Caused by: Specified
> config does not exist in ZooKeeper:Catalog_search_
> index.dataCatalog_search_index
>
>
> --
> Sincerely,
> Darshan
>



-- 
Sincerely,
Darshan


RE: newbie question

2016-09-07 Thread Gonzalo Rodriguez
Hi Darshan,

It looks like you are listing the instanceDir's name twice in the create 
collection command, it should be

$ solrctl --zk  host:2181/solr --solr host:8983/solr/ collection --create 
Catalog_search_index -s 10 -c Catalog_search_index

Without the extra ". Catalog_search_index" at the end. Also, because your new 
collection's name is the same as the instanceDir's, you could just omit that 
parameter and it should work ok.

Try that and see if it works.

Good luck,
Gonzalo

-Original Message-
From: Darshan Pandya [mailto:darshanpan...@gmail.com] 
Sent: Wednesday, September 7, 2016 12:02 PM
To: solr-user@lucene.apache.org
Subject: newbie question

hello,

I am using solr cloud with cloudera. When I try to create a collection, it 
fails with the following error.
Any hints / answers will be helpful.


$ solrctl --zk  host:2181/solr instancedir --list

Catalog_search_index

$ solrctl --zk  host:2181/solr --solr host:8983/solr/ collection --create
Catalog_search_index -s 10 -c Catalog_search_index.Catalog_search_index

Error: A call to SolrCloud WEB APIs failed: HTTP/1.1 200 OK

Server: Apache-Coyote/1.1

Content-Type: application/xml;charset=UTF-8

Transfer-Encoding: chunked

Date: Wed, 07 Sep 2016 17:58:13 GMT

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1165</int>
</lst>
<lst name="failure">
<str>
org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error
CREATEing SolrCore 'Catalog_search_index_shard1_replica1': Unable to create 
core [Catalog_search_index_shard1_replica1] Caused by: Specified config does 
not exist in ZooKeeper:Catalog_search_index.dataCatalog_search_index


--
Sincerely,
Darshan


newbie question

2016-09-07 Thread Darshan Pandya
hello,

I am using solr cloud with cloudera. When I try to create a collection, it
fails with the following error.
Any hints / answers will be helpful.


$ solrctl --zk  host:2181/solr instancedir --list

Catalog_search_index

$ solrctl --zk  host:2181/solr --solr host:8983/solr/ collection --create
Catalog_search_index -s 10 -c Catalog_search_index.Catalog_search_index

Error: A call to SolrCloud WEB APIs failed: HTTP/1.1 200 OK

Server: Apache-Coyote/1.1

Content-Type: application/xml;charset=UTF-8

Transfer-Encoding: chunked

Date: Wed, 07 Sep 2016 17:58:13 GMT

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1165</int>
</lst>
<lst name="failure">
<str>
org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error
CREATEing SolrCore 'Catalog_search_index_shard1_replica1': Unable to create
core [Catalog_search_index_shard1_replica1] Caused by: Specified config
does not exist in
ZooKeeper:Catalog_search_index.dataCatalog_search_index


-- 
Sincerely,
Darshan


Re: [Newbie question] in SOLR 5, would I have a "master-to-slave" relationship for two servers?

2015-11-05 Thread Erick Erickson
To pile on to Chris' comment. In the M/S situation
you describe, all the query traffic goes to the slave.

True, this relieves the slave from doing the work of
indexing, but it _also_ prevents the master from
answering queries. So going to SolrCloud trades
doing indexing on _both_ machines for the ability
to query on _both_ machines.

And this doesn't even take into account the issues
involved in recovering if one or the other (especially
the master) goes down, which is automatically
handled in SolrCloud.

Add to that the fact that memory management is
_very_ significantly improved starting with Solr
4x (see: 
https://lucidworks.com/blog/2012/04/06/memory-comparisons-between-solr-3x-and-trunk/)
and my claim is that you are _far_ better off
using SolrCloud than M/S in 5x.

As always, YMMV of course.

Best,
Erick


On Thu, Nov 5, 2015 at 1:12 PM, Chris Hostetter
 wrote:
>
> : The database of server 2 is considered the "master" and it is replicated
> : regularly to server 1, the "slave".
> :
> : The advantage is the responsiveness of server 1 is not impacted with server
> : 2 gets busy with lots of indexing.
> :
> : QUESTION: When deploying a SOLR 5 setup, do I set things up the same way?
> : Or do I cluster both servers together into one "cloud"?   That is, in
> : SOLR 5, how do I ensure the indexing process will not impact the
> : performance of the web app?
>
> There is nothing preventing you from using a master slave setup with Solr
> 5...
>
> https://cwiki.apache.org/confluence/display/solr/Index+Replication
>
> ...however if you do so you have to take responsibility for the same
> risks/tradeoffs that existed with this type of setup in Solr 3...
>
> 1) if the "query slave" goes down, you can't serve queries w/o manually
> redirecting traffic to your "indexing master"
>
> 2) if the "indexing master" goes down you can't accept index updates w/o
> manually redirecting updates to your "query slave" -- and manually
> rectifying the discrepancies if/when your master comes back online.
>
>
> When using a cloud based setup these types of problems go away because
> there is no single "master", clients can send updates/queries to any node
> (and if you use SolrJ your clients will be "ZK aware" and know
> automatically if/when a node is down or new nodes are added) ...
> many people concerned about performance/reliability consider these
> benefits more important then the risks/tradeoffs of performance impacts of
> indexing directy to nodes that are serving queries -- especially with
> other NRT (Near Real Time) related improvements to Solr over the years
> (Soft Commits, DocValues instead of FieldCache, etc...)
>
>
> -Hoss
> http://www.lucidworks.com/


Re: [Newbie question] in SOLR 5, would I have a "master-to-slave" relationship for two servers?

2015-11-05 Thread Chris Hostetter

: The database of server 2 is considered the "master" and it is replicated
: regularly to server 1, the "slave".
: 
: The advantage is the responsiveness of server 1 is not impacted with server
: 2 gets busy with lots of indexing.
: 
: QUESTION: When deploying a SOLR 5 setup, do I set things up the same way?
: Or do I cluster both servers together into one "cloud"?   That is, in
: SOLR 5, how do I ensure the indexing process will not impact the
: performance of the web app?

There is nothing preventing you from using a master slave setup with Solr 
5...

https://cwiki.apache.org/confluence/display/solr/Index+Replication

...however if you do so you have to take responsibility for the same 
risks/tradeoffs that existed with this type of setup in Solr 3...

1) if the "query slave" goes down, you can't serve queries w/o manually
redirecting traffic to your "indexing master"

2) if the "indexing master" goes down you can't accept index updates w/o
manually redirecting updates to your "query slave" -- and manually
rectifying the discrepancies if/when your master comes back online.


When using a cloud based setup these types of problems go away because 
there is no single "master", clients can send updates/queries to any node 
(and if you use SolrJ your clients will be "ZK aware" and know 
automatically if/when a node is down or new nodes are added) ... 
many people concerned about performance/reliability consider these
benefits more important than the risks/tradeoffs of performance impacts of
indexing directly to nodes that are serving queries -- especially with
other NRT (Near Real Time) related improvements to Solr over the years 
(Soft Commits, DocValues instead of FieldCache, etc...)
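
(For reference, the legacy side of that tradeoff is just solr.ReplicationHandler
configured on both sides -- a minimal sketch, hostname/core name hypothetical:

on the master, in solrconfig.xml:

  <requestHandler name="/replication" class="solr.ReplicationHandler">
    <lst name="master">
      <str name="replicateAfter">commit</str>
      <str name="confFiles">schema.xml,stopwords.txt</str>
    </lst>
  </requestHandler>

and on the slave:

  <requestHandler name="/replication" class="solr.ReplicationHandler">
    <lst name="slave">
      <str name="masterUrl">http://master-host:8983/solr/core1/replication</str>
      <str name="pollInterval">00:01:00</str>
    </lst>
  </requestHandler>
)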


-Hoss
http://www.lucidworks.com/


Re: [Newbie question] what is a "core" and are they different from 3.x to 5.x ?

2015-11-05 Thread Chris Hostetter

: I can see there is something called a "core" ... it appears there can be
: many cores for a single SOLR server.
: 
: Can someone "explain like I'm five" -- what is a core?

https://cwiki.apache.org/confluence/display/solr/Solr+Cores+and+solr.xml

"In Solr, the term core is used to refer to a single index and associated 
transaction log and configuration files (including schema.xml and 
solrconfig.xml, among others). Your Solr installation can have multiple 
cores if needed, which allows you to index data with different structures 
in the same server, and maintain more control over how your data is 
presented to different audiences."

: And how do "cores" differ from 3.x to 5.x.


The only fundamental differences between "cores" in Solr 3.x vs 5.x are:

1) in 3.x there was a concept known as the "default core" (if you didn't 
explicitly use multiple cores); with 5.x every request (updates or 
queries) must be to an explicit core (or collection)

2) when using SolrCloud in 5.x, you should think (logically) in terms of 
the higher level concept of "collections" which (depending on the settings 
when the collection is created) may be *implemented* by multiple cores 
that are managed under the covers for you...

https://cwiki.apache.org/confluence/display/solr/SolrCloud
https://cwiki.apache.org/confluence/display/solr/Nodes%2C+Cores%2C+Clusters+and+Leaders
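
(On disk, a 5.x core is just a directory that Solr auto-discovers via a
core.properties file -- schematically, with hypothetical names:

  <solr home>/
    mycore/
      core.properties    e.g. just "name=mycore"
      conf/
        solrconfig.xml
        schema.xml
      data/              the Lucene index and transaction log
)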


-Hoss
http://www.lucidworks.com/


[Newbie question] what is a "core" and are they different from 3.x to 5.x ?

2015-11-05 Thread Robert Hume
Trying to learn about SOLR.

I can see there is something called a "core" ... it appears there can be
many cores for a single SOLR server.

Can someone "explain like I'm five" -- what is a core?

And how do "cores" differ from 3.x to 5.x.

Any pointers in the right direction are helpful!

Thanks!
Rob


[Newbie question] in SOLR 5, would I have a "master-to-slave" relationship for two servers?

2015-11-05 Thread Robert Hume
Hi,

In my SOLR 3 deployment (inherited it), I have (1) one SOLR server that is
used by my web application, and (2) a second SOLR server that is used to
index documents via a customer datasource.

The database of server 2 is considered the "master" and it is replicated
regularly to server 1, the "slave".

The advantage is the responsiveness of server 1 is not impacted with server
2 gets busy with lots of indexing.

QUESTION: When deploying a SOLR 5 setup, do I set things up the same way?
Or do I cluster both servers together into one "cloud"?   That is, in
SOLR 5, how do I ensure the indexing process will not impact the
performance of the web app?

Any help is greatly appreciated!!

Rob


Re: Newbie Question: Master Index or 100s Small Index

2014-03-19 Thread Shawn Heisey

On 3/19/2014 4:55 AM, Colin R wrote:

My question is an architecture one.

These photos are currently indexed and searched in three ways.

1: The 14M pictures from above are split into a few hundred indexes that
feed a single website. This means index sizes of between 100 and 500,000
entries each.

2: 95% of these same photos are also wanted for searching on a global site.
Index size of 12M plus.

3: 80% of these same photos are also required for smaller group sites. Index
sizes of between 400K and 4M.

We currently make changes to the single indexes and then merge into groups and
global. Due to the size of the numbers, is it worth changing or not?

Is it quicker/better to just have one big 14M index and filter the
complexities for each website or is it better to still maintain hundreds of
indexes so we are searching smaller ones. Bear in mind, we get thousands of
changes a day PLUS very busy search servers.


My primary use for Solr is an archive of 92 million documents, most of 
which are photos.  We have thousands of new photos every day.  I haven't 
been cleared to mention what company it's for.


This screenshot of my status servlet page answers tons of questions 
about my index, but if you have additional questions, ask:


https://www.dropbox.com/s/6p1puq1gq3j8nln/solr-status-servlet.png

Here are some details about each host that you cannot see in the 
screenshot: 6 SATA disks in RAID10 with 3TB of usable space.  64GB of 
RAM.  Dual quad-core Intel E54xx series CPUs.  Chain A is running Solr 
4.2.1 on Java 6, chain B is running Solr 4.6.1 on Java 7, with some 
additional plugin software that increases the index size.  There is one 
Solr process per host, with a 6GB heap.


As long as you index fields that can be used to filter searches 
according to what a user is allowed to see, I don't see any problem with 
putting all of your data into one index.  The main thing you'll want to be 
sure of is that you have enough RAM to effectively cache your index.  
Because you have SSD, you probably don't need to have enough RAM to 
cache ALL of the index data, but it wouldn't hurt.  With 36GB of RAM per 
machine, you will probably have enough.


Thanks,
Shawn



Re: Newbie Question: Master Index or 100s Small Index

2014-03-19 Thread Erick Erickson
Oh my. 2.(something) is ancient; I second your move
to scrap the current situation and start over. I'm
really curious what the _reasons_ for such a complex
setup are/were.

I second Toke's comments. This is actually
quite small by modern Solr/Lucene standards.

Personally I would index them all to a single index,
include something like a 'source' field that allowed
one to restrict the returned documents by a filter
query (fq) clause.
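
E.g. (field and value names made up):

q=caption:regatta&fq=source:site_42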

Toke makes the point that you will get subtly different
search results because the tf/idf calculations are
slightly different across your entire corpus than
within various sub-sections, but I suspect that you
won't notice it. Test and see, you can change later.

One thing to look at is the new hard/soft commit
distinction, see:
http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

The short form is: you want to define your hard
autocommit to be fairly short (maybe 1 minute?)
with openSearcher=false for durability, and your
soft commit at whatever latency you need for being
able to search the newly-added docs.
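
In solrconfig.xml terms that's something like (numbers illustrative only):

<autoCommit>
  <maxTime>60000</maxTime>            <!-- hard commit every minute -->
  <openSearcher>false</openSearcher>  <!-- durability only, no new searcher -->
</autoCommit>
<autoSoftCommit>
  <maxTime>5000</maxTime>             <!-- docs searchable within ~5 seconds -->
</autoSoftCommit>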

I don't know how you're feeding docs to Solr, but
if you're using the ExtractingRequestHandler,
you are
1> transmitting the entire document over the wire,
only to throw most of it away. I'm guessing your 1.5K
of data is just a few percent of the total file size.
2> you're putting the extraction work on the same
box running Solr.

If that machine is overloaded, consider moving the Tika
processing over to one or more clients and only
sending the data you actually want to index over to Solr,
See:
http://searchhub.org/2012/02/14/indexing-with-solrj/
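
The gist of that post, stripped way down (URL and field names made up, and
the Tika parsing step omitted):

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/photos");
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "IMG_0001");                     // your unique key
doc.addField("caption", "Finish line, spring regatta");
doc.addField("shot_date", "2014-03-19T10:00:00Z");  // for date-range queries
server.add(doc);
server.commit();  // in production, lean on autoCommit instead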

Best,
Erick

On Wed, Mar 19, 2014 at 7:02 AM, Colin R  wrote:
> Hi Toke
>
> Our current configuration is Lucene 2.(something) with a RAILO/CFML app server.
>
> 10K drives, Quad Core, 16GB, two servers. But the indexing and searching are
> starting to fail, and our developer is no longer with us, so it is quicker to
> rebuild than to fix all the code.
>
> Our existing config is lots of indexes with merges into the larger ones.
>
> They are still running very fast but indexing is causing us issues.
>
> Thanks
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Newbie-Question-Master-Index-or-100s-Small-Index-tp4125407p4125447.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Newbie Question: Master Index or 100s Small Index

2014-03-19 Thread Colin R
Hi Toke

Our current configuration is Lucene 2.(something) with a RAILO/CFML app server.

10K drives, Quad Core, 16GB, two servers. But the indexing and searching are
starting to fail, and our developer is no longer with us, so it is quicker to
rebuild than to fix all the code.

Our existing config is lots of indexes with merges into the larger ones.

They are still running very fast but indexing is causing us issues.

Thanks



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Newbie-Question-Master-Index-or-100s-Small-Index-tp4125407p4125447.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Newbie Question: Master Index or 100s Small Index

2014-03-19 Thread Toke Eskildsen
On Wed, 2014-03-19 at 13:28 +0100, Colin R wrote:
> My question is really regarding index architecture. One big or many small
> (with merged big ones)

One difference is that having a single index/collection gives you better
ranked searches within each collection. If you only use date/filename
sorting, that is of course irrelevant.

> In terms of bytes, each photo has up to 1.5KB of data.

So about 20GB for the full index?

> Special requirements are search by date range, text, date range and text.
> Plus some boolean filtering. All results can be sorted by date or filename.

With no faceting, grouping or similar aggregating processing,
(re)opening of an index searcher should be very fast. The only thing
that takes a moment is the initial date or filename sorting. Asking for
minute-level data updates is thus very modest. With the information you
have given, you could aim for a few seconds.

None of the things you have said gives any cause for concern about
performance, and even though you have an existing system running and are
upgrading to a presumably faster one, you sound concerned. Do you
currently have performance problems, and if so, what is your current
hardware?

- Toke Eskildsen, State and University Library, Denmark




Re: Newbie Question: Master Index or 100s Small Index

2014-03-19 Thread Colin R
Hi Toke

Thanks for replying.

My question is really regarding index architecture. One big or many small
(with merged big ones)

We probably get 5-10K photos added each day. Others are updated, some are
deleted.

Updates need to happen quite fast (e.g. within minutes of our Databases
receiving them).

In terms of bytes, each photo has up to 1.5KB of data.

Special requirements are search by date range, text, date range and text.
Plus some boolean filtering. All results can be sorted by date or filename.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Newbie-Question-Master-Index-or-100s-Small-Index-tp4125407p4125429.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Newbie Question: Master Index or 100s Small Index

2014-03-19 Thread Toke Eskildsen
On Wed, 2014-03-19 at 11:55 +0100, Colin R wrote:
> We run a central database of 14M (and growing) photos with dates, captions,
> keywords, etc. 
> 
> We are currently upgrading from old Lucene servers to the latest Solr, running with a
> couple of dedicated servers (6 core, 36GB, 500SSD). Planning on using Solr
> Cloud.

What hardware are your past experiences based on? If they have less
cores, lower memory and spinning drives, I foresee that your question
can be reduced to which architecture you prefer from a logistic point of
view, rather than performance.

> We take in thousands of changes each day (big and small) so indexing may be
> a bigger problem than searching.

Thousands of updates in a day is a very low number. Do you have hard
requirements for update time, perform heavy faceting or do anything
special for this to be a cause of concern?

> Is it quicker/better to just have one big 14M index and filter the
> complexities for each website or is it better to still maintain hundreds of
> indexes so we are searching smaller ones.

All else being equal, a search in a specific small index will be faster
than filtering on the large one. But as we know, all else is never
equal. A 14M document index in itself is not really a challenge for
Lucene/Solr, but this depends a lot on your specific setup. How large is
the 14M index in terms of bytes?

> Bear in mind, we get thousands of changes a day PLUS very busy search servers.

How many queries/second are we talking about here? What is a typical
query (faceting, grouping, special processing...)?

Regards,
Toke Eskildsen, State and University Library, Denmark




Newbie Question: Master Index or 100s Small Index

2014-03-19 Thread Colin R
We run a central database of 14M (and growing) photos with dates, captions,
keywords, etc. 

We are currently upgrading from old Lucene servers to the latest Solr, running with a
couple of dedicated servers (6 core, 36GB, 500SSD). Planning on using Solr
Cloud.

We take in thousands of changes each day (big and small) so indexing may be
a bigger problem than searching.

My question is an architecture one.

These photos are currently indexed and searched in three ways.

1: The 14M pictures from above are split into a few hundred indexes that
feed a single website. This means index sizes of between 100 and 500,000
entries each.

2: 95% of these same photos are also wanted for searching on a global site.
Index size of 12M plus.

3: 80% of these same photos are also required for smaller group sites. Index
sizes of between 400K and 4M.

We currently make changes to the single indexes and then merge into groups and
global. Due to the size of the numbers, is it worth changing or not?

Is it quicker/better to just have one big 14M index and filter the
complexities for each website or is it better to still maintain hundreds of
indexes so we are searching smaller ones. Bear in mind, we get thousands of
changes a day PLUS very busy search servers.

Thanks

Col



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Newbie-Question-Master-Index-or-100s-Small-Index-tp4125407.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Newbie question on Deduplication overWriteDupes flag

2014-02-06 Thread Alexandre Rafalovitch
A follow up question on this (as it is kind of new functionality).

What happens if several documents are submitted and one of them fails
due to that? Do they get rolled back or only one?

Regards,
   Alex.
Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)


On Thu, Feb 6, 2014 at 11:17 PM, Chris Hostetter
 wrote:
>
> : How do I achieve "add if not there, fail if duplicate is found"? I thought
>
> You can use the optimistic concurrency features to do this, by including a
> _version_=-1 field value in the document.
>
> this will instruct solr that the update should only be processed if the
> document does not already exist...
>
> https://cwiki.apache.org/confluence/display/solr/Updating+Parts+of+Documents
>
>
>
>
> -Hoss
> http://www.lucidworks.com/


Re: Newbie question on Deduplication overWriteDupes flag

2014-02-06 Thread Chris Hostetter

: How do I achieve "add if not there, fail if duplicate is found"? I thought

You can use the optimistic concurrency features to do this, by including a 
_version_=-1 field value in the document.

this will instruct solr that the update should only be processed if the 
document does not already exist...

https://cwiki.apache.org/confluence/display/solr/Updating+Parts+of+Documents
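
e.g., with the XML update format (field names made up; this also requires
the updateLog to be enabled in solrconfig.xml):

<add>
  <doc>
    <field name="id">42</field>
    <field name="_version_">-1</field>
    <field name="title">some title</field>
  </doc>
</add>

If a document with id 42 already exists, Solr rejects the add with a
version conflict (HTTP 409) instead of overwriting it.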




-Hoss
http://www.lucidworks.com/


Newbie question on Deduplication overWriteDupes flag

2014-02-04 Thread aagrawal75
I had a configuration where I had "overwriteDupes"=false. Result: I got
duplicate documents in the index.

When I changed to "overwriteDupes"=true, the duplicate documents started
overwriting the older documents.

How do I achieve "add if not there, fail if duplicate is found"? I thought
that "overwriteDupes"=false would do that.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Newbie-question-on-Deduplication-overWriteDupes-flag-tp4115212.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Newbie question on recurring theme: Dynamic Fields

2013-02-20 Thread Erik Dybdahl
Excellent, works like a charm!
Though embarrassing, it's still a good thing the only problem was me being
blind :-)

Thank you, Toke and Erik.


On Wed, Feb 20, 2013 at 11:47 AM, Toke Eskildsen 
wrote:

> On Wed, 2013-02-20 at 10:06 +0100, Erik Dybdahl wrote:
> > However, after defining
> > <field name="customerField_*" indexed="true" stored="true" multiValued="true"/>
>
> Seems like a typo to me: You need to write "<dynamicField ...>"
> Regards,
> Toke Eskildsen
>
>


Re: Newbie question on recurring theme: Dynamic Fields

2013-02-20 Thread Toke Eskildsen
On Wed, 2013-02-20 at 10:06 +0100, Erik Dybdahl wrote:
> However, after defining
> <field name="customerField_*" indexed="true" stored="true" multiValued="true"/>

Seems like a typo to me: You need to write "<dynamicField ...>"

Regards,
Toke Eskildsen

Re: Newbie question on recurring theme: Dynamic Fields

2013-02-20 Thread Erik Hatcher
You need to use <dynamicField> not <field>, that's all :)
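
i.e., something like:

<dynamicField name="customerField_*" type="string" indexed="true" stored="true" multiValued="true"/>

(the type attribute above is a guess -- keep whatever type your original
declaration used)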

 Erik

On Feb 20, 2013, at 4:06, Erik Dybdahl  wrote:

> Hi,
> I'm currently assessing lucene/solr as a search front end for documents
> currently stored in an rdbms.
> The data has been made searchable to clients, in a way so that each
> client/customer may define what elements of their documents are searchable,
> by defining a field name, and a reference from that name into the data
> which is subsequently used to extract the value from the client documents
> and for each document associate it with the customer chosen field name.
> The concept of dynamic fields seemed to be exactly what I was looking for.
> Under Solr Features, Detailed Features, lucene.apache.org/solr says
> "Dynamic Fields enables on-the-fly addition of new fields".
> And, in more depth, at wiki.apache.org/solr/SchemaXml:
> "One of the powerful features of Lucene is that you don't have to
> pre-define every field when you first create your index.
> :
> For example the following dynamic field declaration tells Solr that
> whenever it sees a field name ending in "_i" which is not an explicitly
> defined field, then it should dynamically create an integer field with that
> name...
> <dynamicField name="*_i" type="integer" indexed="true" stored="true"/>"
> Cool.
> However, after defining
> <field name="customerField_*" indexed="true" stored="true" multiValued="true"/>
> then trying to add the dynamic fields from the values in the db using solrj
> thus
> 
> solrInputDocument.addField("customerField_"+searchFieldResultSet.getString("name"),
> searchFieldResultSet.getString("value"));
> (i.e. attempting to create a dynamic field for e.g. the customer field
> FirstName) yields
> 
> org.apache.solr.common.SolrException: ERROR: [doc=62485318] unknown field
> 'customerField_FirstName'
>at
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:404)
> 
> How come? The documentation states clearly that solr should dynamically
> create a field in such cases.
> I am aware that similar problems have been discussed in several threads
> before, but the appearant complexity of the issue confuses me.
> 
> Is what is stated in the documentation correct?
> In that case, what am I doing wrong, and what is the correct way?
> If the Dynamic Field concept is just a way of specifying common
> characteristics of multiple defined fields, then I think the documentation
> ought to be changed.
> 
> I was able to solve the problem by following a suggestion in one of the
> threads, namely to create just one generic field, and then concatenate name
> and value of the customer field into it, like this:
> <field name="customerField_s" type="string" indexed="true" stored="true" multiValued="true"/>
> and
>solrInputDocument.addField("customerField_s",
> searchFieldResultSet.getString("name")+"$"+searchFieldResultSet.getString("value"));
> which enables queries like:
>  customerField_s:LastName$PETTERSON
> but that's not a very elegant solution.
> Also, attempting to use customerField_* in the query does not work, and
> from reading the documentation I do not understand why.
> 
> Anyway, the response time when searching for these values is superb, when
> compared to doing the search directly in the database :-)
> 
> Regards
> Erik


AW: newbie question

2012-07-30 Thread Markus Klose
You can add the parameter -Djetty.port=8984 to start your second solr on port 
8984.

java -Djetty.port=8984  -jar start.jar


Best regards from Augsburg

Markus Klose
SHI Elektronische Medien GmbH 
 

-Original Message-
From: Kate deBethune [mailto:kdebeth...@gmail.com]
Sent: Monday, July 30, 2012 22:43
To: solr-user@lucene.apache.org
Subject: newbie question

Hi,

I have been able to set up the SOLR demo environment as described in SOLR
3.6.1 tutorial:
http://lucene.apache.org/solr/api-3_6_1/doc-files/tutorial.html.

Actually, I set it up while it was still SOLR 3.6.0.

The developer I am working with has created a custom SOLR instance using
3.6.1 and has packaged it up in the same manner as the demo. However, when I 
run the java -jar start.jar command in the example directory of my SOLR
3.6.1 instance and I open the admin interface on my local host:
http://localhost:8983/solr/admin/, the admin webpage points to the 3.6.0 
instance.  The log says something like the JVM is already in use for port 8983.

How can I open my 3.6.1 instance?

I hope this question is not too elementary.

Many thanks in advance for any help you can provide.

Thanks,
Kate


newbie question

2012-07-30 Thread Kate deBethune
Hi,

I have been able to set up the SOLR demo environment as described in SOLR
3.6.1 tutorial:
http://lucene.apache.org/solr/api-3_6_1/doc-files/tutorial.html.

Actually, I set it up while it was still SOLR 3.6.0.

The developer I am working with has created a custom SOLR instance using
3.6.1 and has packaged it up in the same manner as the demo. However, when
I run the java -jar start.jar command in the example directory of my SOLR
3.6.1 instance and I open the admin interface on my local host:
http://localhost:8983/solr/admin/, the admin webpage points to the 3.6.0
instance.  The log says something like the JVM is already in use for port
8983.

How can I open my 3.6.1 instance?

I hope this question is not too elementary.

Many thanks in advance for any help you can provide.

Thanks,
Kate


Re: Newbie question on sorting

2012-05-02 Thread Jacek
Erick, I'll do that. Thank you very much.

Regards,
Jacek

On Tue, May 1, 2012 at 7:19 AM, Erick Erickson wrote:

> The easiest way is to do that in the app. That is, return the top
> 10 to the app (by score) then re-order them there. There's nothing
> in Solr that I know of that does what you want out of the box.
>
> Best
> Erick
>
> On Mon, Apr 30, 2012 at 11:10 AM, Jacek  wrote:
> > Hello all,
> >
> > I'm facing this simple problem, yet impossible to resolve for me (I'm a
> > newbie in Solr).
> > I need to sort the results by score (it is simple, of course), but then
> > what I need is to take top 10 results, and re-order it (only those top 10
> > results) by a date field.
> > It's not the same as sort=score,creationdate
> >
> > Any suggestions will be greatly appreciated!
>


Re: Newbie question on sorting

2012-05-01 Thread Erick Erickson
The easiest way is to do that in the app. That is, return the top
10 to the app (by score) then re-order them there. There's nothing
in Solr that I know of that does what you want out of the box.
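
A sketch of that client-side re-sort with SolrJ ("creationdate" is from
your example; the URL and query are made up):

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.Date;
import java.util.List;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;

HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr");
SolrQuery q = new SolrQuery("your query here");
q.setRows(10);                               // top 10 by score (default sort)
SolrDocumentList top10 = server.query(q).getResults();
List<SolrDocument> docs = new ArrayList<SolrDocument>(top10);
Collections.sort(docs, new Comparator<SolrDocument>() {
  public int compare(SolrDocument a, SolrDocument b) {
    Date da = (Date) a.getFieldValue("creationdate");
    Date db = (Date) b.getFieldValue("creationdate");
    return da.compareTo(db);                 // oldest first; swap for newest
  }
});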

Best
Erick

On Mon, Apr 30, 2012 at 11:10 AM, Jacek  wrote:
> Hello all,
>
> I'm facing this simple problem, yet impossible to resolve for me (I'm a
> newbie in Solr).
> I need to sort the results by score (it is simple, of course), but then
> what I need is to take top 10 results, and re-order it (only those top 10
> results) by a date field.
> It's not the same as sort=score,creationdate
>
> Any suggestions will be greatly appreciated!


Newbie question on sorting

2012-04-30 Thread Jacek
Hello all,

I'm facing this simple problem, yet impossible to resolve for me (I'm a
newbie in Solr).
I need to sort the results by score (it is simple, of course), but then
what I need is to take top 10 results, and re-order it (only those top 10
results) by a date field.
It's not the same as sort=score,creationdate

Any suggestions will be greatly appreciated!


Re: how to transform a URL (newbie question)

2011-11-20 Thread Erick Erickson
I think you're confusing Solr with a web app 

Solr itself has nothing to do whatsoever with presenting
things to the user. It just returns, as you have seen,
XML (or JSON or ) formatted replies. It's up to the
application layer to do something intelligent with those.

That said, the /browse request handler that ships with the
example code uses something
called the VelocityResponseWriter to render pages, where
the VelocityResponseWriter interacts with the templates
Erik Hatcher mentioned to show you pages. So think of
all the Velocity stuff as your app engine for demo purposes.

Erik is directing you at that code if you want to hack the
Solr example to display stuff.

Hope that helps
Erick (not Hatcher )

On Sun, Nov 20, 2011 at 2:15 PM, Bent Jensen  wrote:
> Erik,
> OK, I will look at that. Basically, what I am trying to do is to index a
> document with lots of URLs. I also index the url and give it a field type.
> Don't know much about solr yet, but thought maybe I can transform the url to
> an active link, i.e. '<a href="...">...</a>'. I tried putting the href into the xml
> document, but it just prints out as text in html. I also could not find any
> xslt transform or schema.
>
> thanks
> Ben
>
> -Original Message-
> From: Erik Hatcher [mailto:erik.hatc...@gmail.com]
> Sent: Sunday, November 20, 2011 9:05 AM
> To: solr-user@lucene.apache.org
> Subject: Re: how to transform a URL (newbie question)
>
> Ben,
>
> Not quite sure how to interpret what you're asking here.  Are you speaking
> of the /browse view?  If so, you can tweak the templates under conf/velocity
> to make links out of things.
>
> But generally, it's the end application that would take the results from
> Solr and render links as appropriate.
>
>        Erik
>
> On Nov 20, 2011, at 11:53 , Bent Jensen wrote:
>
>> I am a beginner to solr and need to ask the following:
>> Using the apache-solr example, how can I display an url in the xml
> document
>> as an active link/url in http? Do i need to add some special transform in
>> the example.xslt file?
>>
>> thanks
>> Ben
>
>


RE: how to transform a URL (newbie question)

2011-11-20 Thread Bent Jensen
Erik,
OK, I will look at that. Basically, what I am trying to do is to index a
document with lots of URLs. I also index the url and give it a field type.
Don't know much about solr yet, but thought maybe I can transform the url to
an active link, i.e. '<a href="...">...</a>'. I tried putting the href into the xml
document, but it just prints out as text in html. I also could not find any
xslt transform or schema.

thanks
Ben

-Original Message-
From: Erik Hatcher [mailto:erik.hatc...@gmail.com] 
Sent: Sunday, November 20, 2011 9:05 AM
To: solr-user@lucene.apache.org
Subject: Re: how to transform a URL (newbie question)

Ben, 

Not quite sure how to interpret what you're asking here.  Are you speaking
of the /browse view?  If so, you can tweak the templates under conf/velocity
to make links out of things.

But generally, it's the end application that would take the results from
Solr and render links as appropriate.

Erik

On Nov 20, 2011, at 11:53 , Bent Jensen wrote:

> I am a beginner to solr and need to ask the following:
> Using the apache-solr example, how can I display an url in the xml
document
> as an active link/url in http? Do i need to add some special transform in
> the example.xslt file? 
> 
> thanks
> Ben
> 




Re: how to transform a URL (newbie question)

2011-11-20 Thread Erik Hatcher
Ben, 

Not quite sure how to interpret what you're asking here.  Are you speaking of 
the /browse view?  If so, you can tweak the templates under conf/velocity to 
make links out of things.
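
e.g., in hit.vm, something along these lines (assuming the field holding
the link is named "url"):

#if($doc.getFieldValue('url'))
  <a href="$doc.getFieldValue('url')">$doc.getFieldValue('url')</a>
#end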

But generally, it's the end application that would take the results from Solr 
and render links as appropriate.

Erik

On Nov 20, 2011, at 11:53 , Bent Jensen wrote:

> I am a beginner to solr and need to ask the following:
> Using the apache-solr example, how can I display an url in the xml document
> as an active link/url in http? Do i need to add some special transform in
> the example.xslt file? 
> 
> thanks
> Ben



how to transform a URL (newbie question)

2011-11-20 Thread Bent Jensen
I am a beginner to solr and need to ask the following:
Using the apache-solr example, how can I display a url in the xml document
as an active link/url in http? Do I need to add some special transform in
the example.xslt file?

thanks
Ben



a newbie question regarding keyword count in each document

2011-11-11 Thread zeek
Hi All,

I am relatively new to Solr/Lucene and need some help.

- I am basically storing documents where each document represents an Entity
(a thing, a place etc)
- each Entity has some unique features that I need to store in a field(s)
- also, I need to store the mention of those features (based on information
extracted from some other sources)
- when I query these documents, I need to be able to retrieve the x most
talked about features of that entity. 

Can I do such a thing in Solr/Lucene?  I was thinking that I create a
multi-valued field where I add the feature every time it was mentioned in my
sources.  But how would I get the most mentioned features (based on the count)
for that particular entity?  If not possible in Solr, I was thinking of
storing that information in a database, but I really want to avoid such an
option.

Any help would be greatly appreciated.

Thanks,

Zeek
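
(One common way to get counts like this is to index one small document per
mention, the entity id plus the feature, and then facet on the feature field;
the facet counts are then mention counts. A sketch, with the core URL and
field names invented:

  http://localhost:8983/solr/select?q=entity_id:42&rows=0&facet=true&facet.field=feature&facet.limit=10

Facet values are sorted by count by default, so this returns the 10 most
mentioned features for that entity.)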



Re: Newbie question

2011-11-01 Thread Chris Hostetter

: If using CommonsHttpSolrServer query() method with parameter wt=json, when
: retrieving QueryResponse, how do I get the JSON result output stream?

when you are using the CommonsHttpSolrServer level of API, the client 
takes care of parsing the response (which is typically in an efficient 
binary representation) into the basic data structures.

if you just want the raw response (in json or xml, or whatever) as a java 
String, then it may not be the API you really want: just use a basic 
HttpClient.


Alternately: you could consider writing your own subclass of 
ResponseParser that just slurps the InputStream that CommonsHttpSolrServer 
fetches for you.


-Hoss
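
(A minimal sketch of the plain-HTTP route Hoss suggests, using only the JDK;
the host, core and query are assumptions:)

  import java.io.BufferedReader;
  import java.io.InputStreamReader;
  import java.net.URL;

  public class RawJsonQuery {
      public static void main(String[] args) throws Exception {
          // Ask the select handler for JSON directly, bypassing SolrJ's parsing.
          URL url = new URL("http://localhost:8983/solr/select?q=*:*&wt=json");
          BufferedReader in = new BufferedReader(
                  new InputStreamReader(url.openStream(), "UTF-8"));
          String line;
          while ((line = in.readLine()) != null) {
              System.out.println(line); // the raw JSON, exactly as Solr sent it
          }
          in.close();
      }
  }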


Newbie question

2011-10-11 Thread darul
If using CommonsHttpSolrServer query() method with parameter wt=json, when
retrieving QueryResponse, how do I get the JSON result output stream?

I do not understand; I can get response.getResults() etc., but no way to
find just the JSON output stream.

Thanks,

Jul



RE: Newbie question, ant target for packaging source files from local copy?

2011-08-29 Thread syyang
Hi Steve,

I've filed a new JIRA issue along with the patch, which can be found at
<https://issues.apache.org/jira/browse/LUCENE-3406>.

Please let me know if you see any problem.

Thanks!
-Sid



Re: Newbie Question, can I store structured sub elements?

2011-08-25 Thread Paul Libbrecht
Whether multi-valued or token-streams, the question is search, not 
(de)serialization: that's opaque to Solr which will take and give it to you as 
needed.

paul


On 25 Aug 2011, at 21:24, Zac Tolley wrote:

> My search is very simple, mainly on titles, actors, show times and channels.
> Having multiple lists of values is probably better for that, and as the
> order is kept the same it's relatively simple to map the response back onto
> pojos for my presentation layer.



Re: Newbie Question, can I store structured sub elements?

2011-08-25 Thread Zac Tolley
My search is very simple, mainly on titles, actors, show times and channels.
Having multiple lists of values is probably better for that, and as the
order is kept the same it's relatively simple to map the response back onto
pojos for my presentation layer.

On Thu, Aug 25, 2011 at 8:18 PM, Paul Libbrecht  wrote:

> Delimited text is the baby form of lists.
> Text can be made very very structured (think XML, ontologies...).
> I think the crux is your search needs.
>
> For example, with Lucene, I made a search for formulæ (including sub-terms)
> by converting the OpenMath-encoded terms into rows of tokens and querying
> with SpanQueries. Quite structured to my taste.
>
> What you don't have is the freedom of joins which brings a very flexible
> query mechanism almost independent of the schema... but this often can be
> circumvented by the flat solr and lucene storage whose performance is really
> amazing.
>
> paul
>
>
> On 25 Aug 2011, at 21:07, Zac Tolley wrote:
>
> > have come to that conclusion so had to choose between multiple fields with
> > multiple values or a field with delimited text, gone for the former
> >
> > On Thu, Aug 25, 2011 at 7:58 PM, Erick Erickson wrote:
> >
> >> nope, it's not easy. Solr docs are flat, flat, flat with the tiny
> >> exception that multiValued fields are returned as lists.
> >>
> >> However, you can count on multi-valued fields being returned
> >> in the order they were added, so it might work out for you to
> >> treat these as parallel arrays in Solr documents.
> >>
> >> Best
> >> Erick
> >>
> >> On Thu, Aug 25, 2011 at 3:10 AM, Zac Tolley  wrote:
> >>> I know I can have multi value on them but that doesn't let me see that
> >>> a showing instance happens at a particular time on a particular
> >>> channel, just that it shows on a range of channels at a range of times
> >>>
> >>> Starting to think I will have to either store a formatted string that
> >>> combines them or keep it flat just for indexing, retrieve ids and use
> >>> them to get data out of the RDBMS
> >>>
> >>>
> >>> On 24 Aug 2011, at 23:09, dan whelan  wrote:
> >>>
>  You could change starttime and channelname to multiValued=true and use
> >> these fields to store all the values for those fields.
> 
>  showing.movie_id and showing.id probably isn't needed in a solr
> record.
> 
> 
> 
>  On 8/24/11 7:53 AM, Zac Tolley wrote:
> > I have a scenario in which I have a film and showings, each film has
> > multiple showings at set times on set channels, so I have:
> >
> > Movie
> > -
> > id
> > title
> > description
> > duration
> >
> >
> > Showing
> > -
> > id
> > movie_id
> > starttime
> > channelname
> >
> >
> >
> > I want to know can I store this in solr so that I keep this structure?
> >
> > I did try to do an initial import with the DIH using this config:
> >
> > [data-config XML lost in the archive]
> >
> > I was hoping, for each movie to get a sub entity with the showing like:
> >
> > [example response XML lost in the archive]
> >
> >
> > but instead all the fields are flattened down to the top level.
> >
> > I know this must be easy, what am I missing... ?
> >
> 
> >>>
> >>
>
>
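
(The config XML in this thread did not survive the archive; for reference,
here is a sketch of the kind of nested-entity data-config being discussed,
reconstructed from the field names above. The driver, connection URL and
exact SQL are assumptions.)

  <dataConfig>
    <dataSource driver="com.mysql.jdbc.Driver"
                url="jdbc:mysql://localhost/tv" user="user" password="pass"/>
    <document>
      <entity name="movie"
              query="select id, title, description, duration from movie">
        <!-- child entity: one row per showing of the current movie -->
        <entity name="showing"
                query="select starttime, channelname from showing
                       where movie_id = '${movie.id}'"/>
      </entity>
    </document>
  </dataConfig>

(As the replies explain, even with nested entities the showing columns come
back flattened onto the movie document as multiValued fields; the nesting
only controls how the rows are fetched.)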


Re: Newbie Question, can I store structured sub elements?

2011-08-25 Thread Paul Libbrecht
Delimited text is the baby form of lists.
Text can be made very very structured (think XML, ontologies...).
I think the crux is your search needs.

For example, with Lucene, I made a search for formulæ (including sub-terms) by 
converting the OpenMath-encoded terms into rows of tokens and querying with 
SpanQueries. Quite structured to my taste.

What you don't have is the freedom of joins which brings a very flexible query 
mechanism almost independent of the schema... but this often can be 
circumvented by the flat solr and lucene storage whose performance is really 
amazing.

paul


On 25 Aug 2011, at 21:07, Zac Tolley wrote:

> have come to that conclusion so had to choose between multiple fields with
> multiple values or a field with delimited text, gone for the former
> 
> On Thu, Aug 25, 2011 at 7:58 PM, Erick Erickson 
> wrote:
> 
>> nope, it's not easy. Solr docs are flat, flat, flat with the tiny
>> exception that multiValued fields are returned as lists.
>> 
>> However, you can count on multi-valued fields being returned
>> in the order they were added, so it might work out for you to
>> treat these as parallel arrays in Solr documents.
>> 
>> Best
>> Erick
>> 
>> On Thu, Aug 25, 2011 at 3:10 AM, Zac Tolley  wrote:
>>> I know I can have multi value on them but that doesn't let me see that
>>> a showing instance happens at a particular time on a particular
>>> channel, just that it shows on a range of channels at a range of times
>>> 
>>> Starting to think I will have to either store a formatted string that
>>> combines them or keep it flat just for indexing, retrieve ids and use
>>> them to get data out of the RDBMS
>>> 
>>> 
>>> On 24 Aug 2011, at 23:09, dan whelan  wrote:
>>> 
 You could change starttime and channelname to multiValued=true and use
>> these fields to store all the values for those fields.
 
 showing.movie_id and showing.id probably isn't needed in a solr record.
 
 
 
 On 8/24/11 7:53 AM, Zac Tolley wrote:
> I have a scenario in which I have a film and showings, each film has
> multiple showings at set times on set channels, so I have:
> 
> Movie
> -
> id
> title
> description
> duration
> 
> 
> Showing
> -
> id
> movie_id
> starttime
> channelname
> 
> 
> 
> I want to know can I store this in solr so that I keep this structure?
> 
> I did try to do an initial import with the DIH using this config:
> 
> [data-config XML lost in the archive]
> 
> I was hoping, for each movie to get a sub entity with the showing like:
> 
> [example response XML lost in the archive]
> 
> but instead all the fields are flattened down to the top level.
> 
> I know this must be easy, what am I missing... ?
> 
 
>>> 
>> 



Re: Newbie Question, can I store structured sub elements?

2011-08-25 Thread Zac Tolley
have come to that conclusion so had to choose between multiple fields with
multiple values or a field with delimited text, gone for the former

On Thu, Aug 25, 2011 at 7:58 PM, Erick Erickson wrote:

> nope, it's not easy. Solr docs are flat, flat, flat with the tiny
> exception that multiValued fields are returned as lists.
>
> However, you can count on multi-valued fields being returned
> in the order they were added, so it might work out for you to
> treat these as parallel arrays in Solr documents.
>
> Best
> Erick
>
> On Thu, Aug 25, 2011 at 3:10 AM, Zac Tolley  wrote:
> > I know I can have multi value on them but that doesn't let me see that
> > a showing instance happens at a particular time on a particular
> > channel, just that it shows on a range of channels at a range of times
> >
> > Starting to think I will have to either store a formatted string that
> > combines them or keep it flat just for indexing, retrieve ids and use
> > them to get data out of the RDBMS
> >
> >
> > On 24 Aug 2011, at 23:09, dan whelan  wrote:
> >
> >> You could change starttime and channelname to multiValued=true and use
> these fields to store all the values for those fields.
> >>
> >> showing.movie_id and showing.id probably isn't needed in a solr record.
> >>
> >>
> >>
> >> On 8/24/11 7:53 AM, Zac Tolley wrote:
> >>> I have a scenario in which I have a film and showings, each film has
> >>> multiple showings at set times on set channels, so I have:
> >>>
> >>> Movie
> >>> -
> >>> id
> >>> title
> >>> description
> >>> duration
> >>>
> >>>
> >>> Showing
> >>> -
> >>> id
> >>> movie_id
> >>> starttime
> >>> channelname
> >>>
> >>>
> >>>
> >>> I want to know can I store this in solr so that I keep this structure?
> >>>
> >>> I did try to do an initial import with the DIH using this config:
> >>>
> >>> [data-config XML lost in the archive]
> >>>
> >>> I was hoping, for each movie to get a sub entity with the showing like:
> >>>
> >>> [example response XML lost in the archive]
> >>>
> >>>
> >>> but instead all the fields are flattened down to the top level.
> >>>
> >>> I know this must be easy, what am I missing... ?
> >>>
> >>
> >
>


Re: Newbie Question, can I store structured sub elements?

2011-08-25 Thread Erick Erickson
nope, it's not easy. Solr docs are flat, flat, flat with the tiny
exception that multiValued fields are returned as lists.

However, you can count on multi-valued fields being returned
in the order they were added, so it might work out for you to
treat these as parallel arrays in Solr documents.

Best
Erick
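
(A sketch of the parallel-arrays idea in SolrJ; the field names come from
this thread, and it assumes both multiValued fields were added in the same
order for each movie:)

  import java.util.Iterator;
  import org.apache.solr.common.SolrDocument;

  public class ShowingsReader {
      // Pair up the parallel multiValued fields on one movie document.
      static void printShowings(SolrDocument movie) {
          Iterator<Object> times = movie.getFieldValues("starttime").iterator();
          Iterator<Object> channels = movie.getFieldValues("channelname").iterator();
          while (times.hasNext() && channels.hasNext()) {
              System.out.println(movie.getFieldValue("title")
                      + " shows at " + times.next() + " on " + channels.next());
          }
      }
  }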

On Thu, Aug 25, 2011 at 3:10 AM, Zac Tolley  wrote:
> I know I can have multi value on them but that doesn't let me see that
> a showing instance happens at a particular time on a particular
> channel, just that it shows on a range of channels at a range of times
>
> Starting to think I will have to either store a formatted string that
> combines them or keep it flat just for indexing, retrieve ids and use
> them to get data out of the RDBMS
>
>
> On 24 Aug 2011, at 23:09, dan whelan  wrote:
>
>> You could change starttime and channelname to multiValued=true and use these 
>> fields to store all the values for those fields.
>>
>> showing.movie_id and showing.id probably isn't needed in a solr record.
>>
>>
>>
>> On 8/24/11 7:53 AM, Zac Tolley wrote:
>>> I have a scenario in which I have a film and showings, each film has
>>> multiple showings at set times on set channels, so I have:
>>>
>>> Movie
>>> -
>>> id
>>> title
>>> description
>>> duration
>>>
>>>
>>> Showing
>>> -
>>> id
>>> movie_id
>>> starttime
>>> channelname
>>>
>>>
>>>
>>> I want to know can I store this in solr so that I keep this structure?
>>>
>>> I did try to do an initial import with the DIH using this config:
>>>
>>> [data-config XML lost in the archive]
>>>
>>> I was hoping, for each movie to get a sub entity with the showing like:
>>>
>>> [example response XML lost in the archive]
>>>
>>>
>>> but instead all the fields are flattened down to the top level.
>>>
>>> I know this must be easy, what am I missing... ?
>>>
>>
>


RE: Newbie question, ant target for packaging source files from local copy?

2011-08-25 Thread Steven A Rowe
Hi Sid,

The current source packaging scheme aims to *avoid* including local changes :), 
so yes, there is no support currently for what you want to do.

Prior to <https://issues.apache.org/jira/browse/LUCENE-2973>, the source 
packaging scheme used the current sources rather than pulling from Subversion.  
If you check out trunk revision 1083212 or earlier, or branch_3x revision 
1083234 or earlier, you can see how it used to be done.

If you want to resurrect the previous source packaging scheme as a new Ant 
target (maybe named "package-local-src-tgz"?), make a new JIRA issue and post 
a patch, and I'll help you get it committed (assuming nobody objects). 
 If you haven't seen the Solr Wiki HowToContribute page 
<http://wiki.apache.org/solr/HowToContribute>, it may be of use to you for this.

Steve
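
(A rough sketch of what such a target might look like, using Ant's built-in
tar task; the property names and excludes are assumptions, not the actual
Lucene/Solr build properties:)

  <target name="package-local-src-tgz"
          description="Package the local working copy, including uncommitted changes">
    <tar destfile="${dist.dir}/solr-local-src.tgz" compression="gzip" longfile="gnu">
      <tarfileset dir="${basedir}" prefix="solr-local-src"
                  excludes="**/build/** **/dist/** **/*.class"/>
    </tar>
  </target>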

> -Original Message-
> From: syyang [mailto:syyan...@gmail.com]
> Sent: Wednesday, August 24, 2011 10:07 PM
> To: solr-user@lucene.apache.org
> Subject: Newbie question, ant target for packaging source files from
> local copy?
> 
> Hi all,
> 
> I am trying to package source files containing local changes. While
> running
> ant dist creates a war file containing the local changes, running ant
> package-src-tgz exports files straight from svn repository, and does not
> pick up any of the local changes.
> 
> Is there an ant target that I can use to package local copy of the source
> files? Or are are we expected to just write our own?
> 
> Thanks,
> -Sid
> 


Re: Newbie Question, can I store structured sub elements?

2011-08-25 Thread Zac Tolley
I know I can have multi value on them but that doesn't let me see that
a showing instance happens at a particular time on a particular
channel, just that it shows on a range of channels at a range of times

Starting to think I will have to either store a formatted string that
combines them or keep it flat just for indexing, retrieve ids and use
them to get data out of the RDBMS


On 24 Aug 2011, at 23:09, dan whelan  wrote:

> You could change starttime and channelname to multiValued=true and use these 
> fields to store all the values for those fields.
>
> showing.movie_id and showing.id probably isn't needed in a solr record.
>
>
>
> On 8/24/11 7:53 AM, Zac Tolley wrote:
>> I have a scenario in which I have a film and showings, each film has
>> multiple showings at set times on set channels, so I have:
>>
>> Movie
>> -
>> id
>> title
>> description
>> duration
>>
>>
>> Showing
>> -
>> id
>> movie_id
>> starttime
>> channelname
>>
>>
>>
>> I want to know can I store this in solr so that I keep this structure?
>>
>> I did try to do an initial import with the DIH using this config:
>>
>> [data-config XML lost in the archive]
>>
>> I was hoping, for each movie to get a sub entity with the showing like:
>>
>> [example response XML lost in the archive]
>>
>> but instead all the fields are flattened down to the top level.
>>
>> I know this must be easy, what am I missing... ?
>>
>


Re: Newbie Question, can I store structured sub elements?

2011-08-24 Thread dan whelan
You could change starttime and channelname to multiValued=true and use 
these fields to store all the values for those fields.


showing.movie_id and showing.id probably isn't needed in a solr record.



On 8/24/11 7:53 AM, Zac Tolley wrote:

I have a scenario in which I have a film and showings, each film has
multiple showings at set times on set channels, so I have:

Movie
-
id
title
description
duration


Showing
-
id
movie_id
starttime
channelname



I want to know can I store this in solr so that I keep this structure?

I did try to do an initial import with the DIH using this config:

[data-config XML lost in the archive]

I was hoping, for each movie to get a sub entity with the showing like:

[example response XML lost in the archive]


Newbie Question, can I store structured sub elements?

2011-08-24 Thread Zac Tolley
I have a scenario in which I have a film and showings, each film has
multiple showings at set times on set channels, so I have:

Movie
-
id
title
description
duration


Showing
-
id
movie_id
starttime
channelname



I want to know can I store this in solr so that I keep this structure?

I did try to do an initial import with the DIH using this config:

[data-config XML lost in the archive]

I was hoping, for each movie to get a sub entity with the showing like:

[example response XML lost in the archive]

Re: Newbie question: how to deal with different # of search results per page due to pagination then grouping

2011-06-02 Thread Michael Sokolov
Just keep one extra facet value hidden; i.e. request one more than you 
need to show the current page.  If you get it, there are more (show the 
next button), otherwise there aren't.  You can't page arbitrarily deep 
like this, but you can have a next button reliably enabled or disabled.
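
(As a sketch, for pages of 25 authors that looks something like the following;
the field and query values are illustrative:

  q=keywords&rows=0&facet=true&facet.field=author&facet.limit=26&facet.offset=0&facet.mincount=1

Render the first 25 values; if a 26th came back, enable the next button and
request the next page with facet.offset=25.)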


On 6/1/2011 5:57 PM, Robert Petersen wrote:

Yes that is exactly the issue... we're thinking just maybe always have a
next button and if you go too far you just get zero results.  User gets
what the user asks for, and so user could simply back up if desired to
where the facet still has values.  Could also detect an empty facet
results on the front end.  You can also expand just one facet, to allow
paging only the facet pane and not the whole page, using an ajax call.



-Original Message-
From: Jonathan Rochkind [mailto:rochk...@jhu.edu]
Sent: Wednesday, June 01, 2011 2:30 PM
To: solr-user@lucene.apache.org
Cc: Robert Petersen
Subject: Re: Newbie question: how to deal with different # of search
results per page due to pagination then grouping

How do you know whether to provide a 'next' button, or whether you are at
the end of your facet list?

On 6/1/2011 4:47 PM, Robert Petersen wrote:

I think facet.offset allows facet paging nicely by letting you index
into the list of facet values.  It is working for me...

http://wiki.apache.org/solr/SimpleFacetParameters#facet.offset


-Original Message-
From: Jonathan Rochkind [mailto:rochk...@jhu.edu]
Sent: Wednesday, June 01, 2011 12:41 PM
To: solr-user@lucene.apache.org
Subject: Re: Newbie question: how to deal with different # of search
results per page due to pagination then grouping

There's no great way to do that.

One approach would be using facets, but that will just get you the
author names (as stored in fields), and not the documents under it. If
you really only want to show the author names, facets could work. One
issue with facets though is Solr won't tell you the total number of
facet values for your query, so it's tricky to provide next/prev paging
through them.

There is also a 'field collapsing' feature that I think is not in a
released Solr, but may be in the Solr repo. I'm not sure it will quite
do what you want either though, although it's related and worth a look.

http://wiki.apache.org/solr/FieldCollapsing

Another vaguely related thing that is also not yet in a released Solr,
is a 'join' function. That could possibly be used to do what you want,
although it'd be tricky too.
https://issues.apache.org/jira/browse/SOLR-2272

Jonathan

On 6/1/2011 2:56 PM, beccax wrote:

Apologize if this question has already been raised.  I tried searching but
couldn't find the relevant posts.

We've indexed a bunch of documents by different authors.  Then for search
results, we'd like to show the authors that have 1 or more documents
matching the search keywords.

The problem is right now our solr search method first paginates results to
100 documents per page, then we take the results and group by authors.  This
results in different number of authors per page.  (Some authors may only
have one matching document and others 5 or 10.)

How do we change it to somehow show the same number of authors (say 25) per
page?

I mean alternatively we could just show all the documents themselves ordered
by author, but it's not the user experience we're looking for.

Thanks so much.  And please let me know if you need more details not
provided here.
B






RE: Newbie question: how to deal with different # of search results per page due to pagination then grouping

2011-06-01 Thread Robert Petersen
Yes that is exactly the issue... we're thinking just maybe always have a
next button and if you go too far you just get zero results.  User gets
what the user asks for, and so user could simply back up if desired to
where the facet still has values.  Could also detect an empty facet
results on the front end.  You can also expand just one facet, to allow
paging only the facet pane and not the whole page, using an ajax call.



-Original Message-
From: Jonathan Rochkind [mailto:rochk...@jhu.edu] 
Sent: Wednesday, June 01, 2011 2:30 PM
To: solr-user@lucene.apache.org
Cc: Robert Petersen
Subject: Re: Newbie question: how to deal with different # of search
results per page due to pagination then grouping

How do you know whether to provide a 'next' button, or whether you are at 
the end of your facet list?

On 6/1/2011 4:47 PM, Robert Petersen wrote:
> I think facet.offset allows facet paging nicely by letting you index
> into the list of facet values.  It is working for me...
>
> http://wiki.apache.org/solr/SimpleFacetParameters#facet.offset
>
>
> -Original Message-
> From: Jonathan Rochkind [mailto:rochk...@jhu.edu]
> Sent: Wednesday, June 01, 2011 12:41 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Newbie question: how to deal with different # of search
> results per page due to pagination then grouping
>
> There's no great way to do that.
>
> One approach would be using facets, but that will just get you the
> author names (as stored in fields), and not the documents under it. If
> you really only want to show the author names, facets could work. One
> issue with facets though is Solr won't tell you the total number of
> facet values for your query, so it's tricky to provide next/prev paging
> through them.
>
> There is also a 'field collapsing' feature that I think is not in a
> released Solr, but may be in the Solr repo. I'm not sure it will quite
> do what you want either though, although it's related and worth a look.
> http://wiki.apache.org/solr/FieldCollapsing
>
> Another vaguely related thing that is also not yet in a released Solr,
> is a 'join' function. That could possibly be used to do what you want,
> although it'd be tricky too.
> https://issues.apache.org/jira/browse/SOLR-2272
>
> Jonathan
>
> On 6/1/2011 2:56 PM, beccax wrote:
>> Apologize if this question has already been raised.  I tried searching but
>> couldn't find the relevant posts.
>>
>> We've indexed a bunch of documents by different authors.  Then for search
>> results, we'd like to show the authors that have 1 or more documents
>> matching the search keywords.
>>
>> The problem is right now our solr search method first paginates results to
>> 100 documents per page, then we take the results and group by authors.  This
>> results in different number of authors per page.  (Some authors may only
>> have one matching document and others 5 or 10.)
>>
>> How do we change it to somehow show the same number of authors (say 25) per
>> page?
>>
>> I mean alternatively we could just show all the documents themselves ordered
>> by author, but it's not the user experience we're looking for.
>>
>> Thanks so much.  And please let me know if you need more details not
>> provided here.
>> B
>>


Re: Newbie question: how to deal with different # of search results per page due to pagination then grouping

2011-06-01 Thread Jonathan Rochkind
How do you know whether to provide a 'next' button, or whether you are at 
the end of your facet list?


On 6/1/2011 4:47 PM, Robert Petersen wrote:

I think facet.offset allows facet paging nicely by letting you index
into the list of facet values.  It is working for me...

http://wiki.apache.org/solr/SimpleFacetParameters#facet.offset


-Original Message-
From: Jonathan Rochkind [mailto:rochk...@jhu.edu]
Sent: Wednesday, June 01, 2011 12:41 PM
To: solr-user@lucene.apache.org
Subject: Re: Newbie question: how to deal with different # of search
results per page due to pagination then grouping

There's no great way to do that.

One approach would be using facets, but that will just get you the
author names (as stored in fields), and not the documents under it. If
you really only want to show the author names, facets could work. One
issue with facets though is Solr won't tell you the total number of
facet values for your query, so it's tricky to provide next/prev paging
through them.

There is also a 'field collapsing' feature that I think is not in a
released Solr, but may be in the Solr repo. I'm not sure it will quite
do what you want either though, although it's related and worth a look.
http://wiki.apache.org/solr/FieldCollapsing

Another vaguely related thing that is also not yet in a released Solr,
is a 'join' function. That could possibly be used to do what you want,
although it'd be tricky too.
https://issues.apache.org/jira/browse/SOLR-2272

Jonathan

On 6/1/2011 2:56 PM, beccax wrote:

Apologize if this question has already been raised.  I tried searching but
couldn't find the relevant posts.

We've indexed a bunch of documents by different authors.  Then for search
results, we'd like to show the authors that have 1 or more documents
matching the search keywords.

The problem is right now our solr search method first paginates results to
100 documents per page, then we take the results and group by authors.  This
results in different number of authors per page.  (Some authors may only
have one matching document and others 5 or 10.)

How do we change it to somehow show the same number of authors (say 25) per
page?

I mean alternatively we could just show all the documents themselves ordered
by author, but it's not the user experience we're looking for.

Thanks so much.  And please let me know if you need more details not
provided here.
B




RE: Newbie question: how to deal with different # of search results per page due to pagination then grouping

2011-06-01 Thread Robert Petersen
I think facet.offset allows facet paging nicely by letting you index
into the list of facet values.  It is working for me...

http://wiki.apache.org/solr/SimpleFacetParameters#facet.offset


-Original Message-
From: Jonathan Rochkind [mailto:rochk...@jhu.edu] 
Sent: Wednesday, June 01, 2011 12:41 PM
To: solr-user@lucene.apache.org
Subject: Re: Newbie question: how to deal with different # of search
results per page due to pagination then grouping

There's no great way to do that.

One approach would be using facets, but that will just get you the 
author names (as stored in fields), and not the documents under it. If 
you really only want to show the author names, facets could work. One 
issue with facets though is Solr won't tell you the total number of 
facet values for your query, so it's tricky to provide next/prev paging 
through them.

There is also a 'field collapsing' feature that I think is not in a 
released Solr, but may be in the Solr repo. I'm not sure it will quite 
do what you want either though, although it's related and worth a look. 
http://wiki.apache.org/solr/FieldCollapsing

Another vaguely related thing that is also not yet in a released Solr, 
is a 'join' function. That could possibly be used to do what you want, 
although it'd be tricky too.
https://issues.apache.org/jira/browse/SOLR-2272

Jonathan

On 6/1/2011 2:56 PM, beccax wrote:
> Apologize if this question has already been raised.  I tried searching but
> couldn't find the relevant posts.
>
> We've indexed a bunch of documents by different authors.  Then for search
> results, we'd like to show the authors that have 1 or more documents
> matching the search keywords.
>
> The problem is right now our solr search method first paginates results to
> 100 documents per page, then we take the results and group by authors.  This
> results in different number of authors per page.  (Some authors may only
> have one matching document and others 5 or 10.)
>
> How do we change it to somehow show the same number of authors (say 25) per
> page?
>
> I mean alternatively we could just show all the documents themselves ordered
> by author, but it's not the user experience we're looking for.
>
> Thanks so much.  And please let me know if you need more details not
> provided here.
> B
>
>


RE: Newbie question: how to deal with different # of search results per page due to pagination then grouping

2011-06-01 Thread Robert Petersen
Don't manually group by author from your results, the list will always
be incomplete...  use faceting instead to show the authors of the books
you have found in your search.

http://wiki.apache.org/solr/SolrFacetingOverview

-Original Message-
From: beccax [mailto:bec...@gmail.com] 
Sent: Wednesday, June 01, 2011 11:56 AM
To: solr-user@lucene.apache.org
Subject: Newbie question: how to deal with different # of search results
per page due to pagination then grouping

Apologize if this question has already been raised.  I tried searching
but
couldn't find the relevant posts.

We've indexed a bunch of documents by different authors.  Then for
search
results, we'd like to show the authors that have 1 or more documents
matching the search keywords.  

The problem is right now our solr search method first paginates results
to
100 documents per page, then we take the results and group by authors.
This
results in different number of authors per page.  (Some authors may only
have one matching document and others 5 or 10.)

How do we change it to somehow show the same number of authors (say 25)
per
page?

I mean alternatively we could just show all the documents themselves
ordered
by author, but it's not the user experience we're looking for.

Thanks so much.  And please let me know if you need more details not
provided here.
B



Re: Newbie question: how to deal with different # of search results per page due to pagination then grouping

2011-06-01 Thread Jonathan Rochkind

There's no great way to do that.

One approach would be using facets, but that will just get you the 
author names (as stored in fields), and not the documents under it. If 
you really only want to show the author names, facets could work. One 
issue with facets though is Solr won't tell you the total number of 
facet values for your query, so it's tricky to provide next/prev paging 
through them.


There is also a 'field collapsing' feature that I think is not in a 
released Solr, but may be in the Solr repo. I'm not sure it will quite 
do what you want either though, although it's related and worth a look. 
http://wiki.apache.org/solr/FieldCollapsing


Another vaguely related thing that is also not yet in a released Solr, 
is a 'join' function. That could possibly be used to do what you want, 
although it'd be tricky too. https://issues.apache.org/jira/browse/SOLR-2272


Jonathan

On 6/1/2011 2:56 PM, beccax wrote:

Apologize if this question has already been raised.  I tried searching but
couldn't find the relevant posts.

We've indexed a bunch of documents by different authors.  Then for search
results, we'd like to show the authors that have 1 or more documents
matching the search keywords.

The problem is right now our solr search method first paginates results to
100 documents per page, then we take the results and group by authors.  This
results in different number of authors per page.  (Some authors may only
have one matching document and others 5 or 10.)

How do we change it to somehow show the same number of authors (say 25) per
page?

I mean alternatively we could just show all the documents themselves ordered
by author, but it's not the user experience we're looking for.

Thanks so much.  And please let me know if you need more details not
provided here.
B




Newbie question: how to deal with different # of search results per page due to pagination then grouping

2011-06-01 Thread beccax
Apologize if this question has already been raised.  I tried searching but
couldn't find the relevant posts.

We've indexed a bunch of documents by different authors.  Then for search
results, we'd like to show the authors that have 1 or more documents
matching the search keywords.  

The problem is right now our solr search method first paginates results to
100 documents per page, then we take the results and group by authors.  This
results in different number of authors per page.  (Some authors may only
have one matching document and others 5 or 10.)

How do we change it to somehow show the same number of authors (say 25) per
page?

I mean alternatively we could just show all the documents themselves ordered
by author, but it's not the user experience we're looking for.

Thanks so much.  And please let me know if you need more details not
provided here.
B



RE: newbie question for DataImportHandler

2011-05-31 Thread Kevin Bootz
In the OP it's stated that the index was deleted. I'm guessing that means the 
physical files, /data/  
quote
populate the table 
> with another million rows of data.
> I remove the index that solr previously created. I restart solr and go to
> the data import handler development console and do the full import again.
endquote

Is there a separate cache that could be causing the issue? I'm a newbie as well 
and it seems that if I delete the index there shouldn't be any vestige info 
left anywhere

Thanks

-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Sunday, May 29, 2011 9:00 PM
To: solr-user@lucene.apache.org
Subject: Re: newbie question for DataImportHandler

This trips up a lot of folks. Solr just marks docs as deleted, the terms etc 
are left in the index until an optimize is performed, or the segments are 
merged. This latter isn't very predictable, so just do an optimize.

The docs aren't returned as results though.

Best
Erick
On May 24, 2011 10:22 PM, "antoniosi"  wrote:
> Hi,
>
> I am new to Solr; apologize in advance if this is a stupid question.
>
> I have created a simple database, with only 1 table with 3 columns, 
> id, name, and last_update fields.
>
> I populate the database with 1 million test rows.
> I run solr, go to the data import handler development console and do a
full
> import. I use the "Luke" tool to look at the content of the lucene index.
>
> This all works fine so far.
>
> I remove all the 1 million rows from my table and populate the table 
> with another million rows of data.
> I remove the index that solr previously created. I restart solr and go to
> the data import handler development console and do the full import again.
>
> I use the "Luke" tool to look at the content of the lucene index. 
> However,
I
> am seeing the old data in my new index.
>
> Does Solr keep a cached copy of the index somewhere?
>
> I hope I have described my problem clearly.
>
> Thanks in advance.
>


Re: newbie question for DataImportHandler

2011-05-29 Thread Erick Erickson
This trips up a lot of folks. Solr just marks docs as deleted, the terms etc
are left in the index until an optimize is performed, or the segments are
merged. This latter isn't very predictable, so just do an optimize.

The docs aren't returned as results though.

Best
Erick
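
(A sketch of doing the wipe over HTTP instead of deleting index files by
hand; the core URL is an assumption:)

  curl 'http://localhost:8983/solr/update?commit=true' -H 'Content-Type: text/xml' \
       --data-binary '<delete><query>*:*</query></delete>'
  curl 'http://localhost:8983/solr/update' -H 'Content-Type: text/xml' \
       --data-binary '<optimize/>'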
On May 24, 2011 10:22 PM, "antoniosi"  wrote:
> Hi,
>
> I am new to Solr; apologize in advance if this is a stupid question.
>
> I have created a simple database, with only 1 table with 3 columns, id,
> name, and last_update fields.
>
> I populate the database with 1 million test rows.
> I run solr, go to the data import handler development console and do a
full
> import. I use the "Luke" tool to look at the content of the lucene index.
>
> This all works fine so far.
>
> I remove all the 1 million rows from my table and populate the table with
> another million rows of data.
> I remove the index that solr previously created. I restart solr and go to
> the data import handler development console and do the full import again.
>
> I use the "Luke" tool to look at the content of the lucene index. However,
> I am seeing the old data in my new index.
>
> Does Solr keep a cached copy of the index somewhere?
>
> I hope I have described my problem clearly.
>
> Thanks in advance.
>


RE: newbie question for DataImportHandler

2011-05-24 Thread Zac Smith
Sounds like you might not be committing the delete. How are you deleting it?
If you run the data import handler with clean=true (which is the default) it 
will delete the data for you anyway so you don't need to delete it yourself.

Hope that helps.
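
(i.e. something like the following, assuming the handler is registered at
/dataimport; clean and commit both default to true on a full-import:)

  http://localhost:8983/solr/dataimport?command=full-import&clean=true&commit=true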

-Original Message-
From: antoniosi [mailto:antonio...@gmail.com] 
Sent: Tuesday, May 24, 2011 4:43 PM
To: solr-user@lucene.apache.org
Subject: newbie question for DataImportHandler

Hi,

I am new to Solr; apologize in advance if this is a stupid question.

I have created a simple database, with only 1 table with 3 columns, id, name, 
and last_update fields.

I populate the database with 1 million test rows.
I run solr, go to the data import handler development console and do a full 
import. I use the "Luke" tool to look at the content of the lucene index.

This all works fine so far.

I remove all the 1 million rows from my table and populate the table with 
another million rows of data.
I remove the index that solr previously created. I restart solr and go to the 
data import handler development console and do the full import again.

I use the "Luke" tool to look at the content of the lucene index. However, I am 
seeing the old data in my new index.

Does Solr keep a cached copy of the index somewhere?

I hope I have described my problem clearly.

Thanks in advance.



newbie question for DataImportHandler

2011-05-24 Thread antoniosi
Hi,

I am new to Solr; apologize in advance if this is a stupid question.

I have created a simple database, with only 1 table with 3 columns, id,
name, and last_update fields.

I populate the database with 1 million test rows.
I run solr, go to the data import handler development console and do a full
import. I use the "Luke" tool to look at the content of the lucene index.

This all works fine so far.

I remove all the 1 million rows from my table and populate the table with
another million rows of data.
I remove the index that solr previously created. I restart solr and go to the
data import handler development console and do the full import again.

I use the "Luke" tool to look at the content of the lucene index. However, I
am seeing the old data in my new index.

Does Solr keep a cached copy of the index somewhere?

I hope I have described my problem clearly.

Thanks in advance.



Re: A Newbie Question

2010-11-15 Thread Lance Norskog
"There is no current feature" is what I meant. Yes, it would be very 
handy to do this.


I handled this problem in the DIH by creating two documents, both with 
the same unique ID. The first doc just had the metadata. The second 
document parsed the input with Tika, but had 'skip doc on error' set. 
So, if the parsing worked, the parsed document overwrote the first 
document. If parsing failed, the metadata-only document went in.


Works quite well!
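
(A rough sketch of that two-pass trick in data-config form. The processors
and the onError="skip" attribute are real DIH features, but the paths, field
names and rootEntity wiring here are assumptions:)

  <document>
    <!-- pass 1: metadata only, never fails -->
    <entity name="meta" processor="FileListEntityProcessor"
            baseDir="/data/docs" fileName=".*" recursive="true">
      <field column="fileAbsolutePath" name="id"/>
      <field column="fileSize" name="size"/>
    </entity>
    <!-- pass 2: same ids, parsed with Tika; a failed parse skips the row,
         leaving the metadata-only document from pass 1 in place -->
    <entity name="files" processor="FileListEntityProcessor"
            baseDir="/data/docs" fileName=".*" recursive="true" rootEntity="false">
      <field column="fileAbsolutePath" name="id"/>
      <entity name="doc" processor="TikaEntityProcessor"
              url="${files.fileAbsolutePath}" format="text" onError="skip">
        <field column="text" name="content"/>
      </entity>
    </entity>
  </document>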

Ken Krugler wrote:


On Nov 14, 2010, at 3:02pm, Lance Norskog wrote:


Yes, the ExtractingRequestHandler uses Tika to parse many file formats.

Solr 1.4.1 uses a previous version of Tika (0.6 or 0.7).

Here's the problem with Tika and extraction utilities in general: 
they are not perfect. They will fail on some files. In the 
ExtractingRequestHandler's case, there is no way to let it fail in 
parsing but save the document's metadata anyway with a notation: 
"sorry not parsed".


By "there is no way" do you mean in configuring the current 
ExtractingRequestHandler? Or is there some fundamental issue with how 
Solr uses Tika that prevents ExtractingRequestHandler from being 
modified to work this way (which seems like a useful configuration 
settings)?


Regards,

-- Ken

I would rather have the unix 'strings' command parse my documents 
(thanks to a co-worker for this).


K. Seshadri Iyer wrote:

Thanks for all the responses.

Govind: To answer your question, yes, all I want to search is plain 
text
files. They are located in NFS directories across multiple 
Solaris/Linux

storage boxes. The total storage is in hundreds of terabytes.

I have just got started with Solr and my understanding is that I will
somehow need Tika to help stream/upload files to Solr. I don't know 
anything
about Java programming, being a system admin. So far, I have read 
that the
autodetect parser in Tika will somehow detect the file type and I 
can use
the stream to populate Solr. How, that is still a mystery to me - 
working on

it. Any tips appreciated; thanks in advance.

Sesh



On 13 November 2010 15:24, Govind Kanshi  
wrote:



Another pov you might want to think about - what kind of search you 
want.

Just plain - full text search or there is something more to those text
files. Are they grouped in folders? Do the folders imply certain 
kind of

grouping/hierarchy/tagging?

I recently was trying to help somebody who had files across lot of 
places
grouped by date/subject/author - he wanted to ensure these are 
"fields"

which too can act as filters/navigators.

Just an input - ignore it if you just want plain full text search.

On Sat, Nov 13, 2010 at 11:25 AM, Lance Norskog  
wrote:



About web servers: Solr is a servlet war file and needs a Java web server
"container" to run. The example/ folder in the Solr distribution uses
'Jetty', and this is fine for small production-quality projects.  You can
just copy the example/ directory somewhere to set up your own running Solr;
that's what I always do.

About indexing programs: if you know Unix scripting, it may be easiest to
walk the file system yourself with the 'find' program and create Solr input
XML files.

But yes, you definitely want the Solr 1.4 Enterprise manual. I spent months
learning this stuff very slowly, and the book would have been great back
then.

Lance


Erick Erickson wrote:



Think of the data import handler (DIH) as Solr pulling data to index
from some source based on configuration. So, once you set up
your DIH config to point to your file system, you issue a command
to solr like "OK, do your data import thing". See the
FileListEntityProcessor.
http://wiki.apache.org/solr/DataImportHandler

SolrJ is a client library
you'd use to push data to Solr. Basically, you
write a Java program that uses SolrJ to walk the file system, find
documents, create a Solr document and send that to Solr. It's not
nearly as complex as it sounds. See:
http://wiki.apache.org/solr/Solrj

It's probably worth your while to get a
copy of "Solr 1.4, Enterprise Search Server"
by Eric Pugh and David Smiley.

Best
Erick

On Fri, Nov 12, 2010 at 8:37 AM, K. Seshadri Iyer wrote:


Hi Lance,

Thank you very much for responding (not sure how I reply to the group,
so, writing to you).

Can you please expand on your suggestion? I am not a web guy and so,
don't know where to start.

What is the difference between SolrJ and DataImportHandler? Do I need
to set up web servers on all my storage boxes?

Apologies for the basic level of questions, but hope I can get started
and implement this before the year end (you know why :o)

Thanks,

Sesh

On 12 November 2010 13:31, Lance Norskog wrote:

Using 'curl' is fine. There is a library called SolrJ for Java and
other libraries for other scripting languages that let you upload with
more control. There is a thing in Solr called the DataImportHandler
that lets you script walking a file system.

Re: A Newbie Question

2010-11-14 Thread Ken Krugler


On Nov 14, 2010, at 3:02pm, Lance Norskog wrote:

Yes, the ExtractingRequestHandler uses Tika to parse many file  
formats.


Solr 1.4.1 uses a previous version of Tika (0.6 or 0.7).

Here's the problem with Tika and extraction utilities in general:  
they are not perfect. They will fail on some files. In the  
ExtractingRequestHandler's case, there is no way to let it fail in  
parsing but save the document's metadata anyway with a notation:  
"sorry not parsed".


By "there is no way" do you mean in configuring the current  
ExtractingRequestHandler? Or is there some fundamental issue with how  
Solr uses Tika that prevents ExtractingRequestHandler from being  
modified to work this way (which seems like a useful configuration  
settings)?


Regards,

-- Ken

I would rather have the unix 'strings' command parse my documents  
(thanks to a co-worker for this).


K. Seshadri Iyer wrote:

Thanks for all the responses.

Govind: To answer your question, yes, all I want to search is plain  
text
files. They are located in NFS directories across multiple Solaris/ 
Linux

storage boxes. The total storage is in hundreds of terabytes.

I have just got started with Solr and my understanding is that I will
somehow need Tika to help stream/upload files to Solr. I don't know  
anything
about Java programming, being a system admin. So far, I have read  
that the
autodetect parser in Tika will somehow detect the file type and I  
can use
the stream to populate Solr. How, that is still a mystery to me -  
working on

it. Any tips appreciated; thanks in advance.

Sesh



On 13 November 2010 15:24, Govind Kanshi   
wrote:



Another pov you might want to think about - what kind of search  
you want.
Just plain - full text search or there is something more to those  
text
files. Are they grouped in folders? Do the folders imply certain  
kind of

grouping/hierarchy/tagging?

I recently was trying to help somebody who had files across lot of  
places
grouped by date/subject/author - he wanted to ensure these are  
"fields"

which too can act as filters/navigators.

Just an input - ignore it if you just want plain full text search.

On Sat, Nov 13, 2010 at 11:25 AM, Lance  
Norskog  wrote:



About web servers: Solr is a servlet war file and needs a Java web server
"container" to run. The example/ folder in the Solr distribution uses
'Jetty', and this is fine for small production-quality projects.  You can
just copy the example/ directory somewhere to set up your own running Solr;
that's what I always do.

About indexing programs: if you know Unix scripting, it may be easiest to
walk the file system yourself with the 'find' program and create Solr input
XML files.

But yes, you definitely want the Solr 1.4 Enterprise manual. I spent months
learning this stuff very slowly, and the book would have been great back
then.

Lance


Erick Erickson wrote:


Think of the data import handler (DIH) as Solr pulling data to index
from some source based on configuration. So, once you set up
your DIH config to point to your file system, you issue a command
to solr like "OK, do your data import thing". See the
FileListEntityProcessor.
http://wiki.apache.org/solr/DataImportHandler

SolrJ is a client library
you'd use to push data to Solr. Basically, you
write a Java program that uses SolrJ to walk the file system, find
documents, create a Solr document and send that to Solr. It's not
nearly as complex as it sounds. See:
http://wiki.apache.org/solr/Solrj

It's probably worth your while to get a
copy of "Solr 1.4, Enterprise Search Server"
by Eric Pugh and David Smiley.

Best
Erick

On Fri, Nov 12, 2010 at 8:37 AM, K. Seshadri Iyer
wrote:






Hi Lance,

Thank you very much for responding (not sure how I reply to the group,
so, writing to you).

Can you please expand on your suggestion? I am not a web guy and so,
don't know where to start.

What is the difference between SolrJ and DataImportHandler? Do I need
to set up web servers on all my storage boxes?

Apologies for the basic level of questions, but hope I can get started
and implement this before the year end (you know why :o)

Thanks,

Sesh

On 12 November 2010 13:31, Lance Norskog wrote:

Using 'curl' is fine. There is a library called SolrJ for Java and
other libraries for other scripting languages that let you upload with
more control. There is a thing in Solr called the DataImportHandler
that lets you script walking a file system.

On Thu, Nov 11, 2010 at 8:38 PM, K. Seshadri Iyer <seshadri...@gmail.com> wrote:

Hi,

Pardon me if this sounds very elementary, but I have a very basic
question regarding Solr search. I have about 10 storage devices running
Solaris with hundreds of thousands of text files (there are other files,
as well, but my target is these text files). The directories on the
Solaris boxes are exported and are available as NFS mounts.

Re: A Newbie Question

2010-11-14 Thread Lance Norskog

Yes, the ExtractingRequestHandler uses Tika to parse many file formats.

Solr 1.4.1 uses a previous version of Tika (0.6 or 0.7).

Here's the problem with Tika and extraction utilities in general: they 
are not perfect. They will fail on some files. In the 
ExtractingRequestHandler's case, there is no way to let it fail in 
parsing but save the document's metadata anyway with a notation: "sorry 
not parsed".  I would rather have the unix 'strings' command parse my 
documents (thanks to a co-worker for this).
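
(Since the question keeps coming back to "how do I get plain-text files into
Solr", a minimal SolrJ sketch of the walk-and-index approach Erick describes
below; the server URL and the schema field names are assumptions:)

  import java.io.DataInputStream;
  import java.io.File;
  import java.io.FileInputStream;
  import org.apache.solr.client.solrj.SolrServer;
  import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
  import org.apache.solr.common.SolrInputDocument;

  public class TextFileIndexer {
      public static void main(String[] args) throws Exception {
          SolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");
          index(solr, new File(args[0])); // e.g. the NFS mount point
          solr.commit();
      }

      static void index(SolrServer solr, File f) throws Exception {
          if (f.isDirectory()) {
              for (File child : f.listFiles()) {
                  index(solr, child);
              }
          } else if (f.getName().endsWith(".txt")) {
              SolrInputDocument doc = new SolrInputDocument();
              doc.addField("id", f.getAbsolutePath()); // the path doubles as unique key
              doc.addField("text", slurp(f));          // assumes a "text" field in schema.xml
              solr.add(doc);
          }
      }

      static String slurp(File f) throws Exception {
          byte[] buf = new byte[(int) f.length()];
          DataInputStream in = new DataInputStream(new FileInputStream(f));
          try { in.readFully(buf); } finally { in.close(); }
          return new String(buf, "UTF-8");
      }
  }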


K. Seshadri Iyer wrote:

Thanks for all the responses.

Govind: To answer your question, yes, all I want to search is plain text
files. They are located in NFS directories across multiple Solaris/Linux
storage boxes. The total storage is in hundreds of terabytes.

I have just got started with Solr and my understanding is that I will
somehow need Tika to help stream/upload files to Solr. I don't know anything
about Java programming, being a system admin. So far, I have read that the
autodetect parser in Tika will somehow detect the file type and I can use
the stream to populate Solr. How, that is still a mystery to me - working on
it. Any tips appreciated; thanks in advance.

Sesh



On 13 November 2010 15:24, Govind Kanshi  wrote:

Another pov you might want to think about - what kind of search you want.
Just plain - full text search or there is something more to those text
files. Are they grouped in folders? Do the folders imply certain kind of
grouping/hierarchy/tagging?

I recently was trying to help somebody who had files across lot of places
grouped by date/subject/author - he wanted to ensure these are "fields"
which too can act as filters/navigators.

Just an input - ignore it if you just want plain full text search.

On Sat, Nov 13, 2010 at 11:25 AM, Lance Norskog  wrote:

About web servers: Solr is a servlet war file and needs a Java web server
"container" to run. The example/ folder in the Solr distribution uses
'Jetty', and this is fine for small production-quality projects.  You can
just copy the example/ directory somewhere to set up your own running Solr;
that's what I always do.

About indexing programs: if you know Unix scripting, it may be easiest to
walk the file system yourself with the 'find' program and create Solr input
XML files.

But yes, you definitely want the Solr 1.4 Enterprise manual. I spent months
learning this stuff very slowly, and the book would have been great back
then.

Lance

Erick Erickson wrote:

Think of the data import handler (DIH) as Solr pulling data to index
from some source based on configuration. So, once you set up
your DIH config to point to your file system, you issue a command
to solr like "OK, do your data import thing". See the
FileListEntityProcessor.
http://wiki.apache.org/solr/DataImportHandler

SolrJ is a client library you'd use to push data to Solr. Basically, you
write a Java program that uses SolrJ to walk the file system, find
documents, create a Solr document and send that to Solr. It's not
nearly as complex as it sounds. See:
http://wiki.apache.org/solr/Solrj

It's probably worth your while to get a copy of "Solr 1.4, Enterprise
Search Server" by Eric Pugh and David Smiley.

Best
Erick

On Fri, Nov 12, 2010 at 8:37 AM, K. Seshadri Iyer wrote:

Hi Lance,

Thank you very much for responding (not sure how I reply to the group,
so, writing to you).

Can you please expand on your suggestion? I am not a web guy and so,
don't know where to start.

What is the difference between SolrJ and DataImportHandler? Do I need
to set up web servers on all my storage boxes?

Apologies for the basic level of questions, but hope I can get started
and implement this before the year end (you know why :o)

Thanks,

Sesh

On 12 November 2010 13:31, Lance Norskog  wrote:

Using 'curl' is fine. There is a library called SolrJ for Java and
other libraries for other scripting languages that let you upload with
more control. There is a thing in Solr called the DataImportHandler
that lets you script walking a file system.

On Thu, Nov 11, 2010 at 8:38 PM, K. Seshadri Iyer <seshadri...@gmail.com> wrote:

Hi,

Pardon me if this sounds very elementary, but I have a very basic
question regarding Solr search. I have about 10 storage devices running
Solaris with hundreds of thousands of text files (there are other files,
as well, but my target is these text files). The directories on the
Solaris boxes are exported and are available as NFS mounts.

I have installed Solr 1.4 on a Linux box and have tested the


   

installation,


  

Re: A Newbie Question

2010-11-14 Thread K. Seshadri Iyer
Thanks for all the responses.

Govind: To answer your question, yes, all I want to search is plain text
files. They are located in NFS directories across multiple Solaris/Linux
storage boxes. The total storage is in hundreds of terabytes.

I have just got started with Solr and my understanding is that I will
somehow need Tika to help stream/upload files to Solr. I don't know anything
about Java programming, being a system admin. So far, I have read that the
autodetect parser in Tika will somehow detect the file type and I can use
the stream to populate Solr. How, that is still a mystery to me - working on
it. Any tips appreciated; thanks in advance.

Sesh



On 13 November 2010 15:24, Govind Kanshi wrote:

> Another pov you might want to think about - what kind of search you want.
> Just plain - full text search or there is something more to those text
> files. Are they grouped in folders? Do the folders imply certain kind of
> grouping/hierarchy/tagging?
>
> I recently was trying to help somebody who had files across lot of places
> grouped by date/subject/author - he wanted to ensure these are "fields"
> which too can act as filters/navigators.
>
> Just an input - ignore it if you just want plain full text search.
>
> On Sat, Nov 13, 2010 at 11:25 AM, Lance Norskog wrote:
>> [...]

Re: A Newbie Question

2010-11-13 Thread Govind Kanshi
Another pov you might want to think about - what kind of search you want.
Just plain - full text search or there is something more to those text
files. Are they grouped in folders? Do the folders imply certain kind of
grouping/hierarchy/tagging?

I recently was trying to help somebody who had files across lot of places
grouped by date/subject/author - he wanted to ensure these are "fields"
which too can act as filters/navigators.

Just an input - ignore it if you just want plain full text search.

On Sat, Nov 13, 2010 at 11:25 AM, Lance Norskog wrote:

> About web servers: Solr is a servlet war file and needs a Java web server
> "container" to run. The example/ folder in the Solr distribution uses
> 'Jetty', and this is fine for small production-quality projects. You can
> just copy the example/ directory somewhere to set up your own running Solr;
> that's what I always do.
>
> About indexing programs: if you know Unix scripting, it may be easiest to
> walk the file system yourself with the 'find' program and create Solr input
> XML files.
>
> But yes, you definitely want the Solr 1.4 Enterprise manual. I spent months
> learning this stuff very slowly, and the book would have been great back
> then.
>
> Lance
>
> Erick Erickson wrote:
>> [...]


Re: A Newbie Question

2010-11-12 Thread Lance Norskog
About web servers: Solr is a servlet war file and needs a Java web 
server "container" to run. The example/ folder in the Solr disribution 
uses 'Jetty', and this is fine for small production-quality projects.  
You can just copy the example/ directory somewhere to set up your own 
running Solr; that's what I always do.


About indexing programs: if you know Unix scripting, it may be easiest 
to walk the file system yourself with the 'find' program and create Solr 
input XML files.
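
For reference, a Solr input XML file is just an <add> element wrapping one or
more documents, posted to the /update handler. A hedged sketch, assuming the
schema has "id" and "text" fields (real file contents would also need
XML-escaping):

    <add>
      <doc>
        <field name="id">/mnt/box1/docs/readme.txt</field>
        <field name="text">...the file contents go here...</field>
      </doc>
    </add>

Such a file can be posted with the same curl the thread already uses, e.g.
curl "http://localhost:8983/solr/update?commit=true" -H "Content-Type:
text/xml" --data-binary @input.xml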


But yes, you definitely want the Solr 1.4 Enterprise manual. I spent 
months learning this stuff very slowly, and the book would have been 
great back then.


Lance

Erick Erickson wrote:

> Think of the data import handler (DIH) as Solr pulling data to index
> from some source based on configuration. So, once you set up
> your DIH config to point to your file system, you issue a command
> to solr like "OK, do your data import thing". See the
> FileListEntityProcessor.
> http://wiki.apache.org/solr/DataImportHandler
>
> SolrJ is a client library you'd use to push data to Solr. Basically, you
> write a Java program that uses SolrJ to walk the file system, find
> documents, create a Solr document and send that to Solr. It's not
> nearly as complex as it sounds. See:
> http://wiki.apache.org/solr/Solrj
>
> It's probably worth your while to get a
> copy of "Solr 1.4, Enterprise Search Server"
> by Eric Pugh and David Smiley.
>
> Best
> Erick
>
> On Fri, Nov 12, 2010 at 8:37 AM, K. Seshadri Iyer wrote:
>> [...]

Re: A Newbie Question

2010-11-12 Thread Erick Erickson
Think of the data import handler (DIH) as Solr pulling data to index
from some source based on configuration. So, once you set up
your DIH config to point to your file system, you issue a command
to solr like "OK, do your data import thing". See the
FileListEntityProcessor.
http://wiki.apache.org/solr/DataImportHandler
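
A minimal DIH config sketch for the NFS setup described in this thread; the
baseDir path and the "id"/"text" field names are illustrative rather than
from the thread, and both processors shipped with Solr 1.4:

    <dataConfig>
      <dataSource type="FileDataSource" encoding="UTF-8"/>
      <document>
        <!-- Walk the mounted directory tree and list the *.txt files. -->
        <entity name="files" processor="FileListEntityProcessor"
                baseDir="/mnt/nfs/box1" fileName=".*\.txt"
                recursive="true" rootEntity="false">
          <field column="fileAbsolutePath" name="id"/>
          <!-- Read each listed file's contents into the "text" field. -->
          <entity name="contents" processor="PlainTextEntityProcessor"
                  url="${files.fileAbsolutePath}">
            <field column="plainText" name="text"/>
          </entity>
        </entity>
      </document>
    </dataConfig>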

SolrJ is a client library
you'd use to push data to Solr. Basically, you
write a Java program that uses SolrJ to walk the file system, find
documents, create a Solr document and send that to Solr. It's not
nearly as complex as it sounds. See:
http://wiki.apache.org/solr/Solrj
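
And a rough SolrJ walker in the same spirit; a sketch only, assuming the
Solr 1.4-era CommonsHttpSolrServer client, a schema with "id" and "text"
fields, and a newer JDK's file reading for brevity:

    import java.io.File;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class WalkAndIndex {
        public static void main(String[] args) throws Exception {
            CommonsHttpSolrServer server =
                new CommonsHttpSolrServer("http://localhost:8983/solr");
            index(server, new File(args[0]));
            server.commit();
        }

        static void index(CommonsHttpSolrServer server, File f) throws Exception {
            File[] children = f.listFiles();
            if (children != null) {                    // a directory: recurse
                for (File child : children) index(server, child);
            } else if (f.getName().endsWith(".txt")) { // a text file: index it
                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", f.getAbsolutePath());
                doc.addField("text", new String(
                    java.nio.file.Files.readAllBytes(f.toPath()), "UTF-8"));
                server.add(doc);
            }
        }
    }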

It's probably worth your while to get a
copy of "Solr 1.4, Enterprise Search Server"
by Eric Pugh and David Smiley.

Best
Erick

On Fri, Nov 12, 2010 at 8:37 AM, K. Seshadri Iyer wrote:

> Hi Lance,
>
> Thank you very much for responding (not sure how I reply to the group, so,
> writing to you).
>
> Can you please expand on your suggestion? I am not a web guy and so, don't
> know where to start.
>
> What is the difference between SolrJ and DataImportHandler? Do I need to
> set up web servers on all my storage boxes?
>
> Apologies for the basic level of questions, but hope I can get started and
> implement this before the year end (you know why :o)
>
> Thanks,
>
> Sesh
>
> On 12 November 2010 13:31, Lance Norskog wrote:
>
>> Using 'curl' is fine. There is a library called SolrJ for Java and
>> other libraries for other scripting languages that let you upload with
>> more control. There is a thing in Solr called the DataImportHandler
>> that lets you script walking a file system.
>>
>> On Thu, Nov 11, 2010 at 8:38 PM, K. Seshadri Iyer wrote:
>>> [...]
>>
>> --
>> Lance Norskog
>> goks...@gmail.com


Re: A Newbie Question

2010-11-12 Thread K. Seshadri Iyer
Hi Lance,

Thank you very much for responding (not sure how I reply to the group, so,
writing to you).

Can you please expand on your suggestion? I am not a web guy and so, don't
know where to start.

What is the difference between SolrJ and DataImportHandler? Do I need to set
up web servers on all my storage boxes?

Apologies for the basic level of questions, but hope I can get started and
implement this before the year end (you know why :o)

Thanks,

Sesh

On 12 November 2010 13:31, Lance Norskog wrote:

> Using 'curl' is fine. There is a library called SolrJ for Java and
> other libraries for other scripting languages that let you upload with
> more control. There is a thing in Solr called the DataImportHandler
> that lets you script walking a file system.
>
> On Thu, Nov 11, 2010 at 8:38 PM, K. Seshadri Iyer wrote:
>> [...]
>
> --
> Lance Norskog
> goks...@gmail.com


Re: A Newbie Question

2010-11-12 Thread Lance Norskog
Using 'curl' is fine. There is a library called SolrJ for Java and
other libraries for other scripting languages that let you upload with
more control. There is a thing in Solr called the DataImportHandler
that lets you script walking a file system.

On Thu, Nov 11, 2010 at 8:38 PM, K. Seshadri Iyer  wrote:
> Hi,
>
> Pardon me if this sounds very elementary, but I have a very basic question
> regarding Solr search. I have about 10 storage devices running Solaris with
> hundreds of thousands of text files (there are other files, as well, but my
> target is these text files). The directories on the Solaris boxes are
> exported and are available as NFS mounts.
>
> I have installed Solr 1.4 on a Linux box and have tested the installation,
> using curl to post  documents. However, the manual says that curl is not the
> recommended way of posting documents to Solr. Could someone please tell me
> what is the preferred approach in such an environment? I am not a programmer
> and would appreciate some hand-holding here :o)
>
> Thanks in advance,
>
> Sesh
>



-- 
Lance Norskog
goks...@gmail.com


A Newbie Question

2010-11-11 Thread K. Seshadri Iyer
Hi,

Pardon me if this sounds very elementary, but I have a very basic question
regarding Solr search. I have about 10 storage devices running Solaris with
hundreds of thousands of text files (there are other files, as well, but my
target is these text files). The directories on the Solaris boxes are
exported and are available as NFS mounts.

I have installed Solr 1.4 on a Linux box and have tested the installation,
using curl to post  documents. However, the manual says that curl is not the
recommended way of posting documents to Solr. Could someone please tell me
what is the preferred approach in such an environment? I am not a programmer
and would appreciate some hand-holding here :o)

Thanks in advance,

Sesh


Re: Newbie question: no search results

2010-09-05 Thread BobG

Hi Lance and Gora,

Thanks for your support!

I have changed

   <field name="Artikel" type="string" ... />

into

   <field name="Artikel" type="text" ... />

in the schema.xml, then restarted Tomcat.
Also I used quotation marks for my search string and it works fine now!

Problem solved!

Best regards,
Bob

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Newbie-question-no-search-results-tp1416482p1422211.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Newbie question: no search results

2010-09-04 Thread Lance Norskog
More directly: if the 'Artikel' field is a "string", only the whole
string will match:

Artikel:"Kerstman baardstel"

Or you can use a wildcard: Kerstman* or just Kerst*

If it is a "text" field, it is chopped into words and
q=Artikel:Kerstman would work.
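
In schema.xml terms the difference is just the field's type. A sketch against
the stock Solr 1.4 example schema, where "text" is a tokenized field type and
"string" is not (the indexed/stored attributes are illustrative):

    <!-- "string": indexed as a single token; only the exact value matches. -->
    <field name="Artikel" type="string" indexed="true" stored="true"/>

    <!-- "text": tokenized; individual words such as Kerstman will match. -->
    <field name="Artikel" type="text" indexed="true" stored="true"/>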


Gora Mohanty wrote:

> On Sat, 4 Sep 2010 01:15:11 -0700 (PDT) BobG wrote:
>
>> Hi,
>> I am trying to set up a new SOLR search engine on a windows
>> platform. It seems like I managed to fill an index with the
>> contents of my SQL server table.
>>
>> When I use the default *:* query I get a nice result:
> [...]
>> However when I try to query the table with a search term I get no
>> results.
> [...]
>> What am I doing wrong here?
> [...]
>
> Could you post your schema.xml. In particular, how is the "Artikel"
> field being processed at index/query time?
>
> Regards,
> Gora


Re: Newbie question: no search results

2010-09-04 Thread Gora Mohanty
On Sat, 4 Sep 2010 01:15:11 -0700 (PDT)
BobG  wrote:

> 
> Hi,
> I am trying to set up a new SOLR search engine on a windows
> platform. It seems like I managed to fill an index with the
> contents of my SQL server table.
> 
> When I use the default *:* query I get a nice result:
[...]
> However when I try to query the table with a search term I get no
> results.
[...]

> What am I doing wrong here?
[...]

Could you post your schema.xml. In particular, how is the "Artikel"
field being processed at index/query time?

Regards,
Gora


Newbie question: no search results

2010-09-04 Thread BobG

Hi,
I am trying to set up a new SOLR search engine on a windows platform. It
seems like I managed to fill an index with the contents of my SQL server
table.

When I use the default *:* query I get a nice result (the response XML was
stripped by the list archive; the recoverable values are):

   responseHeader: status 0, QTime 0, q=*:*

   doc: Kerstman baardstel
        Baardstel kerstman. Pruik en baard met snor.
        26DF49BF-A3AE-11D5-AAE9-C50EF305
   doc: Kerstman baardstel D
        Baardstel D kerstman. Pruik en baard met snor.
        09D8A643-6714-11D5-AAD5-0050DADF6D86
   doc: Kerstman baardstel Santa
        Baardstel kerstman Santa. Pruik en baard met snor.
        09D8A649-6714-11D5-AAD5-0050DADF6D86
   --more results

However when I try to query the table with a search term I get no results
(again with the XML stripped; the recoverable values are):

   responseHeader: status 0, QTime 0, q=Artikel:Kerstman
   result: (no documents)

What am I doing wrong here?
Thanks for your help.

Bob
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Newbie-question-no-search-results-tp1416482p1416482.html
Sent from the Solr - User mailing list archive at Nabble.com.

