Re: Delete in Solr based on foreign key (like SQL delete from … where id in (select id from…)

2014-10-09 Thread Matthew Nigl
I was going to say that the below should do what you are asking:

{!join from=docid_s to=foreign_key_docid_s}(message_state_ts:[* TO 2014-10-05T00:00:00Z}
AND message_state_ts:{2014-10-01T00:00:00Z TO *])

But I get the same response as in
https://issues.apache.org/jira/browse/SOLR-6357

I can't think of any other queries at the moment. You might consider using
the above query (which should work as a normal select query) to get the
IDs, then delete them in a separate query.
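A minimal sketch of that two-step approach with curl (the collection URL and rows value
are assumptions; the delete must use the collection's uniqueKey values):

  # 1. Run the join as a normal select to collect the ids of the derived documents
  curl 'http://localhost:8983/solr/collection1/select' \
    --data-urlencode 'q={!join from=docid_s to=foreign_key_docid_s}message_state_ts:{2014-10-01T00:00:00Z TO 2014-10-05T00:00:00Z}' \
    --data-urlencode 'fl=docid_s' --data-urlencode 'rows=1000' --data-urlencode 'wt=json'

  # 2. Delete the collected ids in a separate update request
  curl 'http://localhost:8983/solr/collection1/update?commit=true' \
    -H 'Content-Type: text/xml' --data-binary '<delete><id>u42xyz1cz0i7sx87</id></delete>'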


On 10 October 2014 07:31, Luis Festas Matos  wrote:

> Given the following Solr data:
>
> 
> <!-- field names inferred from the query below; the original markup was lost -->
> <doc>
>   <str name="docid_s">1008rs1cz0icl2pk</str>
>   <date name="message_state_ts">2014-10-07T14:18:29.784Z</date>
>   <str name="foreign_key_docid_s">h60fmtybz0i7sx87</str>
>   <long name="_version_">1481314421768716288</long>
> </doc>
> <doc>
>   <str name="docid_s">u42xyz1cz0i7sx87</str>
>   <str name="foreign_key_docid_s">h60fmtybz0i7sx87</str>
>   <long name="_version_">1481314421768716288</long>
> </doc>
> <doc>
>   <str name="docid_s">u42xyz1cz0i7sx87</str>
>   <str name="foreign_key_docid_s">h60fmtybz0i7sx87</str>
>   <long name="_version_">1481314421448900608</long>
> </doc>
>
> I would like to know how to *DELETE documents* above on the Solr console or
> using a script that achieves the same result as issuing the following
> statement in SQL (assuming all of these columns existed in a table called x
> ):
>
> DELETE FROM x WHERE foreign_key_docid_s in (select docid_s from x
> where message_state_ts < '2014-10-05' and message_state_ts >
> '2014-10-01')
>
> Basically, delete all derived documents whose foreign key is the same as
> the primary key where the primary key is selected between 2 dates.
>
> Question originally posted on stackoverflow.com ->
>
> http://stackoverflow.com/questions/26248372/delete-in-solr-based-on-foreign-key-like-sql-delete-from-where-id-in-selec
>


Re: Is it possible to replicate just the solrconfig.xml file

2014-10-09 Thread Erick Erickson
You can set up a config files section in the replication handler in
solrconfig.xml on the master, something like:

  <str name="confFiles">schema.xml,stopwords.txt</str>

I'm not totally sure whether this only replicates the files if they've
changed, but even if not, it's not so much network traffic that I'd
worry about it.
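
For context, that element sits in the master section of the replication handler
in solrconfig.xml; a minimal sketch:

  <requestHandler name="/replication" class="solr.ReplicationHandler">
    <lst name="master">
      <str name="replicateAfter">commit</str>
      <str name="confFiles">schema.xml,stopwords.txt</str>
    </lst>
  </requestHandler>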

bq: ... schema.xml, solrconfig.xml, etc. I have this set up and it works well...

OK, how is this set up? With the confFiles as above? If so, what
evidence do you have that this doesn't do exactly what you want, i.e.
just replicate the changed files?

Best,
Erick

On Thu, Oct 9, 2014 at 4:31 PM, Tang, Rebecca  wrote:
> I have a master-slave setup.  Most of the time when I replicate, I want to
> replicate the index as well as some of the config files like schema.xml,
> solrconfig.xml, etc.
> I have this set up and it works well.
>
> But sometimes, I make a small tweak to solrconfig.xml and deploy it to the
> master.  After I test it, I want to replicate just the solrconfig.xml over to
> the slave and nothing else.  Is it possible to do this?
>
> Thanks,
> Rebecca Tang
> Applications Developer, UCSF CKM
> Legacy Tobacco Document Library
> E: rebecca.t...@ucsf.edu


Is it possible to replicate just the solrconfig.xml file

2014-10-09 Thread Tang, Rebecca
I have a master-slave setup.  Most of the time when I replicate, I want to
replicate the index as well as some of the config files like schema.xml,
solrconfig.xml, etc.
I have this set up and it works well.

But sometimes, I make a small tweak to solrconfig.xml and deploy it to the
master.  After I test it, I want to replicate just the solrconfig.xml over to
the slave and nothing else.  Is it possible to do this?

Thanks,
Rebecca Tang
Applications Developer, UCSF CKM
Legacy Tobacco Document Library
E: rebecca.t...@ucsf.edu


Re: Data Import Handler for CSV file

2014-10-09 Thread Ahmet Arslan
Hi,

I think you can define the field names in the first line of the CSV. Why don't
you use curl to index the CSV?
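
A typical invocation of the CSV update handler with curl might look like this
(a sketch; header=true tells Solr to read the field names from the first line):

  curl 'http://localhost:8983/solr/update/csv?commit=true&header=true' \
    -H 'Content-Type: text/csv; charset=utf-8' --data-binary @data.csv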

I don't have a full working example with DIH, but I have the following example
that indexes every line as a separate Solr document.

You need to add a transformer that splits each line on commas. A sketch of such
a data-config.xml (the original markup was lost; the field names in groupNames
are placeholders):

  <dataConfig>
    <dataSource type="FileDataSource" name="fds" encoding="UTF-8"/>
    <document>
      <!-- LineEntityProcessor reads the file line by line into "rawLine";
           RegexTransformer then splits each line into separate fields -->
      <entity name="lines" processor="LineEntityProcessor"
              url="/path/to/data.csv" rootEntity="true"
              dataSource="fds" transformer="RegexTransformer">
        <field column="rawLine" regex="^(.*?),(.*?),(.*)$"
               groupNames="id,name,description"/>
      </entity>
    </document>
  </dataConfig>

On Friday, October 10, 2014 12:26 AM, nabil Kouici  wrote:
Hi Ahmet,

Thank you for this reply. I agree with you that the CSV update handler is fast,
but we always need to specify the columns in the HTTP request. In addition, I
can't find documentation on how to use the CSV update handler from SolrJ.

Could you please send me an example of a DIH configuration that loads a CSV file?

Regards,
Nabil.





On Thursday 9 October 2014 21:05, Ahmet Arslan wrote:



Hi Nabil,

What's wrong with the CSV update handler? It is quite fast.

By the way, DIH has a LineEntityProcessor, so yes, it is doable with existing
DIH components.

Ahmet



On Thursday, October 9, 2014 9:58 PM, nabil Kouici  wrote:





Hi All,

Is it possible to have a DIH configuration in Solr that loads from a CSV file?
I'm currently using the update/csv handler, but it doesn't meet my needs.

Regards,
NKI.


Re: Data Import Handler for CSV file

2014-10-09 Thread Alexandre Rafalovitch
You could always define the parameters in solrconfig.xml on a custom
handler. Then you don't have to pass the same values over and over again.

Regards,
 Alex
On 09/10/2014 5:26 pm, "nabil Kouici"  wrote:

> Hi Ahmet,
>
> Thank you for this reply. I agree with you that the CSV update handler is fast,
> but we always need to specify the columns in the HTTP request. In addition, I
> can't find documentation on how to use the CSV update handler from SolrJ.
>
> Could you please send me an example of a DIH configuration that loads a CSV file?
>
> Regards,
> Nabil.
>
>
> On Thursday 9 October 2014 21:05, Ahmet Arslan wrote:
>
>
>
> Hi Nabil,
>
> What's wrong with the CSV update handler? It is quite fast.
>
> By the way, DIH has a LineEntityProcessor, so yes, it is doable with existing
> DIH components.
>
> Ahmet
>
>
>
> On Thursday, October 9, 2014 9:58 PM, nabil Kouici 
> wrote:
>
>
>
>
>
> Hi All,
>
> Is it possible to have a DIH configuration in Solr that loads from a CSV file?
> I'm currently using the update/csv handler, but it doesn't meet my needs.
>
> Regards,
> NKI.


Re: Stripping html from text before indexing to solr

2014-10-09 Thread Ahmet Arslan
Yes, your plain string queries will automatically match against the index.
That is always true.

If you don't strip HTML, the HTML tags are considered part of the document and
will cause false matches, e.g. for q=bold, q=code, q=class, etc.



On Friday, October 10, 2014 12:35 AM, Vishal Sharma  
wrote:
I don't think I got you completely. I am really sorry for asking this again;
I'm new to the Solr world :)

Are you saying that if I don't strip HTML, my plain string queries will
automatically match in the index?

*Vishal Sharma* | TL, Grazitti Interactive | T: +1 650 641 1754
E: vish...@grazitti.com | www.grazitti.com










On Thu, Oct 9, 2014 at 2:05 PM, Ahmet Arslan 
wrote:

> It depends on you: if you strip HTML using a char filter, queries won't match
> HTML tags. But the original document, when requested using the fl= parameter,
> will still be HTML.
>
> If you do not strip HTML at all, q=html will return all documents.
>
> Ahmet
>
>
>
> On Friday, October 10, 2014 12:01 AM, Vishal Sharma 
> wrote:
> Ahmet,
>
> So if it's not necessary to strip HTML, are you saying that plain-text query
> strings will automatically match the HTML content indexed into Solr?
>
> *Vishal Sharma* | TL, Grazitti Interactive | T: +1 650 641 1754
> E: vish...@grazitti.com | www.grazitti.com
>
>
>
>
>
>
>
>
>
> On Thu, Oct 9, 2014 at 1:55 PM, Ahmet Arslan 
> wrote:
>
> > Hi Vishal,
> >
> > Stripping HTML is not mandatory. Solr indexes it just like any other text.
> >
> > By the way, there are two places where you can strip HTML:
> > i) at analysis: a char filter
> > ii) before analysis: an update processor, or the HTML strip transformer
> >
> > Ahmet
> >
> >
> > On Thursday, October 9, 2014 11:50 PM, Vishal Sharma <
> vish...@grazitti.com>
> > wrote:
> > Is stripping HTML always required before sending content to Solr, or does
> > it accept HTML-based data as well?
> >
> > If it does, how does the match happen in that scenario?
> >
> > I'm looking for a foolproof way of indexing HTML data into Solr fields so
> > that it is always ready to match against a query string.
> >
> >
> >
> >
> >
> > *Vishal Sharma* | TL, Grazitti Interactive | T: +1 650 641 1754
> > E: vish...@grazitti.com | www.grazitti.com
> >
>


Re: Stripping html from text before indexing to solr

2014-10-09 Thread Vishal Sharma
I don't think I got you completely. I am really sorry for asking this again;
I'm new to the Solr world :)

Are you saying that if I don't strip HTML, my plain string queries will
automatically match in the index?

*Vishal Sharma* | TL, Grazitti Interactive | T: +1 650 641 1754
E: vish...@grazitti.com | www.grazitti.com







On Thu, Oct 9, 2014 at 2:05 PM, Ahmet Arslan 
wrote:

> It depends on you: if you strip HTML using a char filter, queries won't match
> HTML tags. But the original document, when requested using the fl= parameter,
> will still be HTML.
>
> If you do not strip HTML at all, q=html will return all documents.
>
> Ahmet
>
>
>
> On Friday, October 10, 2014 12:01 AM, Vishal Sharma 
> wrote:
> Ahmet,
>
> So if it's not necessary to strip HTML, are you saying that plain-text query
> strings will automatically match the HTML content indexed into Solr?
>
> *Vishal Sharma* | TL, Grazitti Interactive | T: +1 650 641 1754
> E: vish...@grazitti.com | www.grazitti.com
>
>
>
>
>
>
>
>
>
> On Thu, Oct 9, 2014 at 1:55 PM, Ahmet Arslan 
> wrote:
>
> > Hi Vishal,
> >
> > Stripping HTML is not mandatory. Solr indexes it just like any other text.
> >
> > By the way, there are two places where you can strip HTML:
> > i) at analysis: a char filter
> > ii) before analysis: an update processor, or the HTML strip transformer
> >
> > Ahmet
> >
> >
> > On Thursday, October 9, 2014 11:50 PM, Vishal Sharma <
> vish...@grazitti.com>
> > wrote:
> > Is stripping HTML always required before sending content to Solr, or does
> > it accept HTML-based data as well?
> >
> > If it does, how does the match happen in that scenario?
> >
> > I'm looking for a foolproof way of indexing HTML data into Solr fields so
> > that it is always ready to match against a query string.
> >
> >
> >
> >
> >
> > *Vishal Sharma* | TL, Grazitti Interactive | T: +1 650 641 1754
> > E: vish...@grazitti.com | www.grazitti.com
> >
>


Re: Data Import Handler for CSV file

2014-10-09 Thread nabil Kouici
Hi Ahmet,
 
Thank you for this reply. I agree with you that the CSV update handler is fast,
but we always need to specify the columns in the HTTP request. In addition, I
can't find documentation on how to use the CSV update handler from SolrJ.

Could you please send me an example of a DIH configuration that loads a CSV file?

Regards,
Nabil.


On Thursday 9 October 2014 21:05, Ahmet Arslan wrote:
 


Hi Nabil,

What's wrong with the CSV update handler? It is quite fast.

By the way, DIH has a LineEntityProcessor, so yes, it is doable with existing
DIH components.

Ahmet



On Thursday, October 9, 2014 9:58 PM, nabil Kouici  wrote:





Hi All,

Is it possible to have a DIH configuration in Solr that loads from a CSV file?
I'm currently using the update/csv handler, but it doesn't meet my needs.

Regards,
NKI. 

Re: Stripping html from text before indexing to solr

2014-10-09 Thread Ahmet Arslan
It depends on you: if you strip HTML using a char filter, queries won't match
HTML tags. But the original document, when requested using the fl= parameter,
will still be HTML.

If you do not strip HTML at all, q=html will return all documents.

Ahmet



On Friday, October 10, 2014 12:01 AM, Vishal Sharma  
wrote:
Ahmet,

So if it's not necessary to strip HTML, are you saying that plain-text query
strings will automatically match the HTML content indexed into Solr?

*Vishal Sharma* | TL, Grazitti Interactive | T: +1 650 641 1754
E: vish...@grazitti.com | www.grazitti.com










On Thu, Oct 9, 2014 at 1:55 PM, Ahmet Arslan 
wrote:

> Hi Vishal,
>
> Stripping HTML is not mandatory. Solr indexes it just like any other text.
>
> By the way, there are two places where you can strip HTML:
> i) at analysis: a char filter
> ii) before analysis: an update processor, or the HTML strip transformer
>
> Ahmet
>
>
> On Thursday, October 9, 2014 11:50 PM, Vishal Sharma 
> wrote:
> Is stripping HTML always required before sending content to Solr, or does it
> accept HTML-based data as well?
>
> If it does, how does the match happen in that scenario?
>
> I'm looking for a foolproof way of indexing HTML data into Solr fields so
> that it is always ready to match against a query string.
>
>
>
>
>
> *Vishal Sharma* | TL, Grazitti Interactive | T: +1 650 641 1754
> E: vish...@grazitti.com | www.grazitti.com
>


Re: Stripping html from text before indexing to solr

2014-10-09 Thread Vishal Sharma
Ahmet,

So if it's not necessary to strip HTML, are you saying that plain-text query
strings will automatically match the HTML content indexed into Solr?

*Vishal Sharma* | TL, Grazitti Interactive | T: +1 650 641 1754
E: vish...@grazitti.com | www.grazitti.com







On Thu, Oct 9, 2014 at 1:55 PM, Ahmet Arslan 
wrote:

> Hi Vishal,
>
> Stripping HTML is not mandatory. Solr indexes it just like any other text.
>
> By the way, there are two places where you can strip HTML:
> i) at analysis: a char filter
> ii) before analysis: an update processor, or the HTML strip transformer
>
> Ahmet
>
>
> On Thursday, October 9, 2014 11:50 PM, Vishal Sharma 
> wrote:
> Is stripping HTML always required before sending content to Solr, or does it
> accept HTML-based data as well?
>
> If it does, how does the match happen in that scenario?
>
> I'm looking for a foolproof way of indexing HTML data into Solr fields so
> that it is always ready to match against a query string.
>
>
>
>
>
> *Vishal Sharma* | TL, Grazitti Interactive | T: +1 650 641 1754
> E: vish...@grazitti.com | www.grazitti.com
>


Re: Stripping html from text before indexing to solr

2014-10-09 Thread Ahmet Arslan
Hi Vishal,

Stripping HTML is not mandatory. Solr indexes it just like any other text.

By the way, there are two places where you can strip HTML:
i) at analysis: a char filter
ii) before analysis: an update processor, or the HTML strip transformer

Ahmet
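
For the char-filter route, a minimal sketch of a schema.xml field type (the
type name is a placeholder):

  <fieldType name="text_html" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <!-- remove HTML markup before tokenization so tags never become searchable terms -->
      <charFilter class="solr.HTMLStripCharFilterFactory"/>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>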


On Thursday, October 9, 2014 11:50 PM, Vishal Sharma  
wrote:
Is stripping HTML always required before sending content to Solr, or does it
accept HTML-based data as well?

If it does, how does the match happen in that scenario?

I'm looking for a foolproof way of indexing HTML data into Solr fields so that
it is always ready to match against a query string.





*Vishal Sharma* | TL, Grazitti Interactive | T: +1 650 641 1754
E: vish...@grazitti.com | www.grazitti.com



Stripping html from text before indexing to solr

2014-10-09 Thread Vishal Sharma
Is stripping HTML always required before sending content to Solr, or does it
accept HTML-based data as well?

If it does, how does the match happen in that scenario?

I'm looking for a foolproof way of indexing HTML data into Solr fields so that
it is always ready to match against a query string.





*Vishal Sharma* | TL, Grazitti Interactive | T: +1 650 641 1754
E: vish...@grazitti.com | www.grazitti.com



Delete in Solr based on foreign key (like SQL delete from … where id in (select id from…)

2014-10-09 Thread Luis Festas Matos
Given the following Solr data:


<!-- field names inferred from the query below; the original markup was lost -->
<doc>
  <str name="docid_s">1008rs1cz0icl2pk</str>
  <date name="message_state_ts">2014-10-07T14:18:29.784Z</date>
  <str name="foreign_key_docid_s">h60fmtybz0i7sx87</str>
  <long name="_version_">1481314421768716288</long>
</doc>
<doc>
  <str name="docid_s">u42xyz1cz0i7sx87</str>
  <str name="foreign_key_docid_s">h60fmtybz0i7sx87</str>
  <long name="_version_">1481314421768716288</long>
</doc>
<doc>
  <str name="docid_s">u42xyz1cz0i7sx87</str>
  <str name="foreign_key_docid_s">h60fmtybz0i7sx87</str>
  <long name="_version_">1481314421448900608</long>
</doc>

I would like to know how to *DELETE documents* above on the Solr console or
using a script that achieves the same result as issuing the following
statement in SQL (assuming all of these columns existed in a table called x
):

DELETE FROM x WHERE foreign_key_docid_s in (select docid_s from x
where message_state_ts < '2014-10-05' and message_state_ts >
'2014-10-01')

Basically, delete all derived documents whose foreign key is the same as
the primary key where the primary key is selected between 2 dates.

Question originally posted on stackoverflow.com ->
http://stackoverflow.com/questions/26248372/delete-in-solr-based-on-foreign-key-like-sql-delete-from-where-id-in-selec


SuggestComponent in distributed (SolrCloud) environment

2014-10-09 Thread Frank Wesemann
Hi,
I'm about to integrate the SuggestComponent in our application and noticed
some behavior I didn't expect. My Solr version is 4.9.

1. The component returns common terms shards-n times.
2. Due to how the suggestions from each shard are collected, the
"exactMatchFirst" Parameter on the LookupImpl is practically ignored.

3. At least the Jaspell Lookup returns terms from deleted documents.

Is this expected behavior or am I missing something?
My config is quite "defaulty":

  <searchComponent name="suggest" class="solr.SuggestComponent">
    <!-- markup lost in the archive; element names are a best-effort
         reconstruction around the surviving values -->
    <lst name="suggester">
      <str name="name">fst_mit_threshold</str>
      <str name="lookupImpl">FSTLookupFactory</str>
      <str name="dictionaryImpl">HighFrequencyDictionaryFactory</str>
      <float name="threshold">0.007</float>
      <str name="storeDir">suggestions/</str>
      <str name="exactMatchFirst">true</str>
      <str name="field">suggest_context</str>
      <str name="suggestAnalyzerFieldType">suggestContextAnalyzer</str>
    </lst>
  </searchComponent>

  <requestHandler name="/suggest" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="suggest">true</str>
      <str name="suggest.dictionary">default</str>
      <str name="suggest.dictionary">fst_mit_threshold</str>
      <str name="suggest.count">20</str>
      <str name="wt">json</str>
    </lst>
    <arr name="components">
      <str>suggest</str>
    </arr>
  </requestHandler>

After a very short glimpse at the sources, I think the first two issues
should be resolvable by plugging another Queue implementation into
SuggestComponent's finishStage().

I am quite unsure about no. 3. After all, these are suggestions, so nobody
guarantees there will be results for the suggested terms, but it feels a little
strange from the user's point of view.

Any thoughts on this?
If anybody is interested, I can open an Issue in JIRA and work on 1 and 2.


-- 
-- 
with kind regards,

Frank Wesemann
Fotofinder GmbH         USt-IdNr. DE812854514
Software Development    Web: http://www.fotofinder.com/
Potsdamer Str. 96       Tel: +49 30 25 79 28 90
10785 Berlin            Fax: +49 30 25 79 28 999

Sitz: Berlin
Amtsgericht Berlin Charlottenburg (HRB 73099)
Managing Director: Ali Paczensky


Re: Data Import Handler for CSV file

2014-10-09 Thread Ahmet Arslan
Hi Nabil,

What's wrong with the CSV update handler? It is quite fast.

By the way, DIH has a LineEntityProcessor, so yes, it is doable with existing
DIH components.

Ahmet
 

On Thursday, October 9, 2014 9:58 PM, nabil Kouici  wrote:





Hi All,

Is it possible to have a DIH configuration in Solr that loads from a CSV file?
I'm currently using the update/csv handler, but it doesn't meet my needs.

Regards,
NKI. 


Data Import Handler for CSV file

2014-10-09 Thread nabil Kouici





Hi All,

Is it possible to have a DIH configuration in Solr that loads from a CSV file?
I'm currently using the update/csv handler, but it doesn't meet my needs.

Regards,
NKI.

Facets for Child Documents?

2014-10-09 Thread Edwards, Joshua
Is it possible to use a facet to filter parent documents based on a child
field?  For example, if I have Authors as my main record and Books as the
child record, would it be possible to have a facet that filters Authors by
Book publication date (with the publication date existing on the Book
document, but not directly on the Author document)?
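
For the filtering half of this, a block-join parent query can select Authors by
a Book field; a sketch (the content_type discriminator field and the date field
name are assumptions about the schema):

  q={!parent which="content_type:author"}publication_date:[2010-01-01T00:00:00Z TO 2011-01-01T00:00:00Z]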

Thanks,
Josh Edwards




Re: does one need to reindex when changing similarity class

2014-10-09 Thread Ahmet Arslan
How about SweetSpotSimilarity? The length norm is saved at index time, isn't it?



On Thursday, October 9, 2014 5:44 PM, Jack Krupansky  
wrote:
The similarity class is only invoked at query time, so it doesn't 
participate in indexing.

-- Jack Krupansky




-Original Message- 
From: Markus Jelsma
Sent: Thursday, October 9, 2014 6:59 AM
To: solr-user@lucene.apache.org
Subject: RE: does one need to reindex when changing similarity class

Hi - no, you don't have to, although maybe you do if you changed how norms are
encoded.
Markus



-Original message-
> From:elisabeth benoit 
> Sent: Thursday 9th October 2014 12:26
> To: solr-user@lucene.apache.org
> Subject: does one need to reindex when changing similarity class
>
> I've read somewhere that we do have to reindex when changing similarity
> class. Is that right?
>
> Thanks again,
> Elisabeth
> 


Re: SolrCloud - Cloud tab on admin dashboard not loading

2014-10-09 Thread Shawn Heisey
On 10/9/2014 9:35 AM, Erick Erickson wrote:
> Hmmm, works fine for me. But I'm a little puzzled where the /zookeeper
> is coming from in your URL, that isn't the URL sent by the admin API
> that I know of.
>
> Bottom line: It Works On My Machine.
>
> given that you do have 8080 in your URL I'm guessing you're on Tomcat
> or some such? Maybe there's some port confusion here

Assuming that the context path is /solr, the cloud tab on the UI will
hit /solr/zookeeper three times with various parameters to gather the
info it needs to display the graph.

If the context path is not /solr, the cluster must be informed of this
fact, either via solr.xml or with a system property on the startup
commandline.

http://wiki.apache.org/solr/SolrCloud#SolrCloud_Instance_Params
http://wiki.apache.org/solr/Solr.xml%204.4%20and%20beyond
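
For example, in the 4.4+ solr.xml format the context path is declared roughly
like this (a sketch, assuming a context path of /myapp):

  <solr>
    <solrcloud>
      <str name="host">${host:}</str>
      <int name="hostPort">${jetty.port:8080}</int>
      <!-- must match the webapp context path, or the UI's /zookeeper calls will 404 -->
      <str name="hostContext">${hostContext:myapp}</str>
    </solrcloud>
  </solr>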

Thanks,
Shawn



Re: [ANNOUNCE] Luke 4.10.1 released

2014-10-09 Thread Dmitry Kan
Hi Bernd,

Thanks for checking out these warnings. Would you like to create a pull
request on github? Or alternatively, create an issue there and describe
what you did to fix this.

Thanks,

Dmitry

On Thu, Oct 9, 2014 at 12:00 PM, Bernd Fehling <
bernd.fehl...@uni-bielefeld.de> wrote:

> Thanks for keeping this up to date.
>
> When starting luke-4.10.1.jar I get:
> SLF4J: Failed to load class “org.slf4j.impl.StaticLoggerBinder”
>
> May I suggest adding that class directly to luke as well?
>
>
> And another one, I get a warning that log4j.properties should not
> use "org.apache.hadoop.metrics.jvm.EventCounter" any more.
> Instead it should use "org.apache.hadoop.log.metrics.EventCounter".
> After changing this in log4j.properties the warning is gone
> and everything runs fine.
>
> Again, thanks for your work.
>
> Regards,
> Bernd
>
>
> Am 07.10.2014 um 21:51 schrieb Dmitry Kan:
> > Hello,
> >
> > Luke 4.10.1 has been released. Download it here:
> >
> > https://github.com/DmitryKey/luke/releases/tag/luke-4.10.1
> >
> > The release has been tested against the solr-4.10.1 based index.
> >
> > Changes:
> > https://github.com/DmitryKey/luke/issues/5
> > https://github.com/DmitryKey/luke/issues/6
> >
> > Remember to pass the following JVM parameter when starting luke:
> >
> > java -XX:MaxPermSize=512m -jar luke-with-deps.jar
> >
> > or alternatively, use luke.bat or luke.sh to launch luke from the command
> > line.
> >
> > Enjoy,
> >
> > Dmitry Kan
> >
>



-- 
Dmitry Kan
Blog: http://dmitrykan.blogspot.com
Twitter: http://twitter.com/dmitrykan
SemanticAnalyzer: www.semanticanalyzer.info


Re: Solr Cloud has lower performance with more servers

2014-10-09 Thread Erick Erickson
Just to check: your index is NOT sharded, correct?

Assuming not sharded, is it SolrCloud? If not SolrCloud, how are the
indexes kept in synch? Master/slave? Manual copy?

But for an unchanging index, this is definitely odd.

Best,
Erick

On Thu, Oct 9, 2014 at 7:40 AM, Walter Underwood  wrote:
> Is this a production log of queries, with lots of repeats? If so, you may be 
> seeing the normal effect of lower cache hit rates.
>
> Check the hit rate for the query result cache in the two setups. With a 
> single machine, the second occurrence of a query will be a cache hit. With 
> two machines, it will not be if the two queries are routed to different 
> machines.
>
> I was running some benchmarks here. With one machine, the query cache had a 
> 50% hit rate. With eight machines, it was 20%.
>
> You can address this with a reverse proxy HTTP cache in front of the cluster, 
> something like Varnish.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/
>
>
> On Oct 9, 2014, at 7:21 AM, Yannick  wrote:
>
>> Hi Toke,
>>
>> thanks for your suggestion - definitely an interesting idea. But 
>> unfortunately no, no indexing job is running; those are static indexes being 
>> queried. The execution time is also very consistent in each condition, I did 
>> quite a few tests.
>>
>> Yann
>>
>>
>> On Thursday, October 9, 2014 3:56 PM, Toke Eskildsen 
>>  wrote:
>>
>>
>>
>> On Thu, 2014-10-09 at 15:06 +0200, Yannick wrote:
>>
>>
>>> I created a group of 2 Solr servers with a load-balancer in front
>>> (Haproxy). I have a batch client that sends requests (read-only)
>>> continuously to the load-balancer. The problem is: the performance is
>>> slower with 2 servers than it is with a single server (still via the
>>> load-balancer, with the second server down, so it's not the
>>> load-balancer itself causing the slowdown).
>>
>> (speculating a lot here:)
>>
>> Is another job updating the indexes while you are batch-searching?
>> If so, the slowdown could be explained by the servers disk caches being
>> flushed by the indexing job. When a request arrives some cache is
>> reclaimed, but it will be a battle between the update and the search
>> jobs. With more machines, there will be fewer request/machine, so the
>> search-cache has a lower chance of being used again before it is
>> reclaimed by the updater.
>>
>> Still, worse performance for 2 machines sounds pretty bad.
>>
>> - Toke Eskildsen, State and University Library, Denmark
>


Re: SolrCloud - Cloud tab on admin dashboard not loading

2014-10-09 Thread Erick Erickson
Hmmm, works fine for me. But I'm a little puzzled where the /zookeeper
is coming from in your URL, that isn't the URL sent by the admin API
that I know of.

Bottom line: It Works On My Machine.

Given that you do have 8080 in your URL, I'm guessing you're on Tomcat
or some such? Maybe there's some port confusion here.

Best,
Erick

On Thu, Oct 9, 2014 at 7:34 AM, arild.nils...@gmail.com
 wrote:
> I'm trying to set up SolrCloud with embedded Zookeeper for Solr 4.10.1. The
> logs seem fine when starting up with a single Solr instance creating and
> using an embedded Zookeeper instance. I'm also able to create collections and
> query collections via curl. However, there is an HTTP 404 Not Found when
> loading the "Cloud" tab on the admin dashboard. Specifically, the following URL
> encounters a 404: http://localhost:8080/<solr root>/zookeeper?wt=json&_=1412863623868
>
> Here is the log:
> https://www.dropbox.com/s/emhqb88eb5jjsvp/solrcloud.txt?dl=0
>
> Everything else works as expected on the admin dashboard, including the data
> import handler.
>
> Anyone got ideas what might be wrong?
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/SolrCloud-Cloud-tab-on-admin-dashboard-not-loading-tp4163526.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr Index to Helio Search

2014-10-09 Thread Yonik Seeley
Hmmm, I imagine this is due to the lucene back compat bugs that were
in 4.10, and the fact that the last release of heliosearch was
branched off of the 4x branch.

I just tried moving an index back and forth between my local
heliosearch copy and solr 4.10.1 and things worked fine.

Here's the snapshot I just tested that you can use until the next
release comes out:
https://www.dropbox.com/s/x9rs5yfousvkrnj/solr-hs_0.08snapshot.tgz?dl=0

-Yonik
http://heliosearch.org - native code faceting, facet functions,
sub-facets, off-heap data


On Thu, Oct 9, 2014 at 1:51 AM, Norgorn  wrote:
> When I try to simply copy an index from native Solr to Heliosearch, I get an
> exception:
>
> Caused by: java.lang.IllegalArgumentException: A SPI class of type
> org.apache.lucene.codecs.Codec with name 'Lucene410' does not exist. You need
> to add the corresponding JAR file supporting this SPI to your classpath. The
> current classpath supports the following names: [Lucene40, Lucene3x, Lucene41,
> Lucene42, Lucene45, Lucene46, Lucene49]
>
> Is there any proper way to move an index from native Solr to Heliosearch?
>
> The problem with native Solr is that there are a lot of OOM exceptions
> (because of the large index).
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Solr-Index-to-Helio-Search-tp4163446.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: does one need to reindex when changing similarity class

2014-10-09 Thread Jack Krupansky
The similarity class is only invoked at query time, so it doesn't 
participate in indexing.


-- Jack Krupansky

-Original Message- 
From: Markus Jelsma

Sent: Thursday, October 9, 2014 6:59 AM
To: solr-user@lucene.apache.org
Subject: RE: does one need to reindex when changing similarity class

Hi - no, you don't have to, although maybe you do if you changed how norms are
encoded.

Markus



-Original message-

From:elisabeth benoit 
Sent: Thursday 9th October 2014 12:26
To: solr-user@lucene.apache.org
Subject: does one need to reindex when changing similarity class

I've read somewhere that we do have to reindex when changing similarity
class. Is that right?

Thanks again,
Elisabeth





Re: Solr Cloud has lower performance with more servers

2014-10-09 Thread Walter Underwood
Is this a production log of queries, with lots of repeats? If so, you may be 
seeing the normal effect of lower cache hit rates.

Check the hit rate for the query result cache in the two setups. With a single 
machine, the second occurrence of a query will be a cache hit. With two 
machines, it will not be if the two queries are routed to different machines.

I was running some benchmarks here. With one machine, the query cache had a 50% 
hit rate. With eight machines, it was 20%.

You can address this with a reverse proxy HTTP cache in front of the cluster, 
something like Varnish.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/
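
A minimal sketch of that idea in Varnish (VCL 4.0 syntax; the backend address,
URL pattern, and TTL are assumptions):

  vcl 4.0;

  backend solr {
      .host = "127.0.0.1";
      .port = "8983";
  }

  sub vcl_backend_response {
      # cache /select responses briefly so repeated identical queries hit Varnish
      if (bereq.url ~ "/select") {
          set beresp.ttl = 60s;
      }
  }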


On Oct 9, 2014, at 7:21 AM, Yannick  wrote:

> Hi Toke,
> 
> thanks for your suggestion - definitely an interesting idea. But 
> unfortunately no, no indexing job is running; those are static indexes being 
> queried. The execution time is also very consistent in each condition, I did 
> quite a few tests.
> 
> Yann 
> 
> 
> On Thursday, October 9, 2014 3:56 PM, Toke Eskildsen 
>  wrote:
> 
> 
> 
> On Thu, 2014-10-09 at 15:06 +0200, Yannick wrote:
> 
> 
>> I created a group of 2 Solr servers with a load-balancer in front
>> (Haproxy). I have a batch client that sends requests (read-only)
>> continuously to the load-balancer. The problem is: the performance is
>> slower with 2 servers than it is with a single server (still via the
>> load-balancer, with the second server down, so it's not the
>> load-balancer itself causing the slowdown).
> 
> (speculating a lot here:)
> 
> Is another job updating the indexes while you are batch-searching?
> If so, the slowdown could be explained by the servers disk caches being
> flushed by the indexing job. When a request arrives some cache is
> reclaimed, but it will be a battle between the update and the search
> jobs. With more machines, there will be fewer request/machine, so the
> search-cache has a lower chance of being used again before it is
> reclaimed by the updater.
> 
> Still, worse performance for 2 machines sounds pretty bad.
> 
> - Toke Eskildsen, State and University Library, Denmark



Re: Solr Cloud has lower performance with more servers

2014-10-09 Thread Charlie Hull

On 09/10/2014 14:06, Yannick wrote:

Hello good Solr people,

I have the following surprising situation.

I created a group of 2 Solr servers with a load-balancer in front
(Haproxy). I have a batch client that sends requests (read-only)
continuously to the load-balancer. The problem is: the performance is
slower with 2 servers than it is with a single server (still via the
load-balancer, with the second server down, so it's not the
load-balancer itself causing the slowdown). My batch execution time
is about 5 minutes with a single server, and more than 6 minutes with
two servers.


What sort of queries are you doing? We're seeing some interesting 
effects of distributed facet queries with a current client - it seems 
there are some unexpected uses of caches.


Charlie


Both servers are VMs, hosted on two different physical computers,
with no resource sharing. I tried throwing in a third server, the
performance was even lower.

I'm trying to find ideas on what could cause this and/or what else I
could try? My goal is to decrease the execution times of these
batches (which may last for many hours), but this clearly seems to be
going the wrong way.

Thanks in advance,

Yann




--
Charlie Hull
Flax - Open Source Enterprise Search

tel/fax: +44 (0)8700 118334
mobile:  +44 (0)7767 825828
web: www.flax.co.uk


SolrCloud - Cloud tab on admin dashboard not loading

2014-10-09 Thread arild.nils...@gmail.com
I'm trying to set up SolrCloud with embedded Zookeeper for Solr 4.10.1. The
logs seem fine when starting up with a single Solr instance creating and
using an embedded Zookeeper instance. I'm also able to create collections and
query collections via curl. However, there is an HTTP 404 Not Found when
loading the "Cloud" tab on the admin dashboard. Specifically, the following URL
encounters a 404: http://localhost:8080/<solr root>/zookeeper?wt=json&_=1412863623868

Here is the log:
https://www.dropbox.com/s/emhqb88eb5jjsvp/solrcloud.txt?dl=0

Everything else works as expected on the admin dashboard, including the data
import handler.

Anyone got ideas what might be wrong?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrCloud-Cloud-tab-on-admin-dashboard-not-loading-tp4163526.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr Cloud has lower performance with more servers

2014-10-09 Thread Yannick
Hi Toke,

thanks for your suggestion - definitely an interesting idea. But unfortunately 
no, no indexing job is running; those are static indexes being queried. The 
execution time is also very consistent in each condition, I did quite a few 
tests.

Yann 


On Thursday, October 9, 2014 3:56 PM, Toke Eskildsen  
wrote:
 


On Thu, 2014-10-09 at 15:06 +0200, Yannick wrote:


> I created a group of 2 Solr servers with a load-balancer in front
> (Haproxy). I have a batch client that sends requests (read-only)
> continuously to the load-balancer. The problem is: the performance is
> slower with 2 servers than it is with a single server (still via the
> load-balancer, with the second server down, so it's not the
> load-balancer itself causing the slowdown).

(speculating a lot here:)

Is another job updating the indexes while you are batch-searching?
If so, the slowdown could be explained by the servers disk caches being
flushed by the indexing job. When a request arrives some cache is
reclaimed, but is will be a battle between the update and the search
jobs. With more machines, there will be fewer request/machine, so the
search-cache has a lower chance of being used again before it is
reclaimed by the updater.

Still, worse performance for 2 machines sounds pretty bad.

- Toke Eskildsen, State and University Library, Denmark

Re: Solr Cloud has lower performance with more servers

2014-10-09 Thread Toke Eskildsen
On Thu, 2014-10-09 at 15:06 +0200, Yannick wrote:

> I created a group of 2 Solr servers with a load-balancer in front
> (Haproxy). I have a batch client that sends requests (read-only)
> continuously to the load-balancer. The problem is: the performance is
> slower with 2 servers than it is with a single server (still via the
> load-balancer, with the second server down, so it's not the
> load-balancer itself causing the slowdown).

(speculating a lot here:)

Is another job updating the indexes while you are batch-searching?
If so, the slowdown could be explained by the servers disk caches being
flushed by the indexing job. When a request arrives some cache is
reclaimed, but it will be a battle between the update and the search
jobs. With more machines, there will be fewer request/machine, so the
search-cache has a lower chance of being used again before it is
reclaimed by the updater.

Still, worse performance for 2 machines sounds pretty bad.

- Toke Eskildsen, State and University Library, Denmark




RE: Using Velocity with Child Documents?

2014-10-09 Thread Edwards, Joshua
I just realized that Solr supports returning child records with the parent
starting in version 4.9.  I was on 4.8, so I will be upgrading to the latest
before continuing on this.  I think it will then make it easier to show the
results in Velocity (in case anyone else needs to do this).

Thanks,
Josh Edwards

-Original Message-
From: Edwards, Joshua [mailto:joshua.edwa...@capitalone.com] 
Sent: Thursday, October 09, 2014 9:18 AM
To: solr-user@lucene.apache.org
Subject: RE: Using Velocity with Child Documents?

Hey, Erick -

Thanks for the response.  Yes, I've played around with Velocity before, and 
I've been able to get some good results.  However, with how Solr stores (and 
returns) child documents, I don't know of a way to get a response that is 
similar to the initial JSON going in - with each parent document having the
child documents underneath.  I believe that I have to get the parent 
information, and then run another query for each record to get the child 
records.  I didn't know if someone had done something similar.

Thanks,
Josh Edwards

-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Wednesday, October 08, 2014 3:10 PM
To: solr-user@lucene.apache.org
Subject: Re: Using Velocity with Child Documents?

Velocity is just taking the Solr response and displaying selected bits in HTML.
So assuming the information you want is in the response packet (which you can
tell just by doing the query from the browser), it's "just" a matter of pulling
it out of the response and displaying it.

Mostly, when I started down this path, I poked around the velocity directory;
it was just a bit of hunting to figure things out, with some help from the
Apache Velocity page.

Not much help, but the short form is that there isn't much of an example that I
know of for your specific problem.

Erick

On Wed, Oct 8, 2014 at 8:54 AM, Edwards, Joshua  
wrote:
> Hi -
>
> I am trying to index a collection that has child documents.  I have 
> successfully loaded the data into my index using SolrJ, and I have verified 
> that I can search correctly using the "child of" method in my fq variable.  
> Now, I would like to use Velocity (Solritas) to display the parent records 
> with some details of the child records underneath.  Is there an easy way to 
> do this?  Is there an example somewhere that I can look at?
>
> Thanks,
> Josh Edwards






Re: eDisMax parser and special characters

2014-10-09 Thread Lanke,Aniruddha
Is there a way to override this default behavior?

— Lanke
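
One workaround, per Michael Joyner's suggestion quoted below, is to escape the
hyphen so it is parsed as a literal token (which a text analyzer will then
typically discard) rather than as the prohibit operator, e.g.:

  q=red \- yellow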

On Oct 8, 2014, at 4:55 PM, Jack Krupansky  wrote:

> Hyphen is a "prefix operator" and is normally followed by a term to indicate 
> that the term "must not" be present. So, your query has a syntax error. The 
> two query parsers differ in how they handle various errors. In the case of 
> edismax, it quotes operators and then tries again, so the hyphen gets quoted, 
> and then analyzed to nothing for text fields but is still a string for string 
> fields.
> 
> -- Jack Krupansky
> 
> -Original Message- From: Lanke,Aniruddha
> Sent: Wednesday, October 8, 2014 4:38 PM
> To: solr-user@lucene.apache.org
> Subject: Re: eDisMax parser and special characters
> 
> Sorry for the delayed reply; here is more information -
> 
> Schema that we are using - http://pastebin.com/WQAJCCph
> Request Handler in config - http://pastebin.com/Y0kP40WF
> 
> Some analysis -
> 
> Search term: red -
> Parser eDismax
> No results show up
> (+((DisjunctionMaxQuery((name_starts_with:red^9.0 | 
> name_parts_starts_with:red^6.0 | s_detail:red | name:red^12.0 | 
> s_detail_starts_with:red^3.0 | s_detail_parts_starts_with:red^2.0)) 
> DisjunctionMaxQuery((name_starts_with:-^9.0 | 
> s_detail_starts_with:-^3.0)))~2))/no_coord
> 
> Search term: red -
> Parser dismax
> Results are returned
> (+DisjunctionMaxQuery((name_starts_with:red^9.0 | 
> name_parts_starts_with:red^6.0 | s_detail:red | name:red^12.0 | 
> s_detail_starts_with:red^3.0 | s_detail_parts_starts_with:red^2.0)) 
> ())/no_coord
> 
> Why do we see the variation in the results between dismax and eDismax?
> 
> 
> On Oct 8, 2014, at 8:59 AM, Erick Erickson 
> mailto:erickerick...@gmail.com>> wrote:
> 
> There's not much information here.
> What's the doc look like?
> What is the analyzer chain for it?
> What is the output when you add &debug=query?
> 
> Details matter. A lot ;)
> 
> Best,
> Erick
> 
> On Wed, Oct 8, 2014 at 6:26 AM, Michael Joyner 
> mailto:mich...@newsrx.com>> wrote:
> Try escaping special chars with a "\"
> 
> 
> On 10/08/2014 01:39 AM, Lanke,Aniruddha wrote:
> 
> We are using the eDisMax parser in our configuration. When we search using
> a query term that has a ‘-’ we don’t get any results back.
> 
> Search term: red - yellow
> This doesn’t return any data back but
> 
> 
> 
> 



Re: per field similarity not working with solr 4.2.1

2014-10-09 Thread elisabeth benoit
ok thanks.


I think something is not working here (I'm quite sure my similarity class
is not being used, because when I use
SchemaSimilarityFactory and a custom fieldtype similarity definition with
NoTFSimilarity, I don't get the same scoring as when I use NoTFSimilarity
as the global similarity; but I'll try to gather more evidence).

Thanks again,
Elisabeth
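
For reference, a "no TF" similarity like the one being tested usually just
neutralizes the term-frequency factor; a sketch against the Lucene 4.x API
(the actual com.company.lbs class is not shown in the thread):

  import org.apache.lucene.search.similarities.DefaultSimilarity;

  public class NoTFSimilarity extends DefaultSimilarity {
      @Override
      public float tf(float freq) {
          // count every matching term once, regardless of how often it occurs
          return freq > 0 ? 1.0f : 0.0f;
      }
  }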

2014-10-09 15:05 GMT+02:00 Markus Jelsma :

> Well, it is either the output of your calculation or writing something to
> System.out
> Markus
>
>
>
> -Original message-
> > From:elisabeth benoit 
> > Sent: Thursday 9th October 2014 13:31
> > To: solr-user@lucene.apache.org
> > Subject: Re: per field similarity not working with solr 4.2.1
> >
> > Thanks for the information!
> >
> > I've been struggling with that debug output. Any other way to know for
> sure
> > my similarity class is being used?
> >
> > Thanks again,
> > Elisabeth
> >
> > 2014-10-09 13:03 GMT+02:00 Markus Jelsma :
> >
> > > Hi - it should work; not seeing your implementation in the debug output
> > > is a known issue.
> > >
> > >
> > > -Original message-
> > > > From:elisabeth benoit 
> > > > Sent: Thursday 9th October 2014 12:22
> > > > To: solr-user@lucene.apache.org
> > > > Subject: per field similarity not working with solr 4.2.1
> > > >
> > > > Hello,
> > > >
> > > > I am using Solr 4.2.1 and I've tried to use a per-field similarity, as
> > > > described in
> > > >
> > > >
> > >
> https://apache.googlesource.com/lucene-solr/+/c5bb5cd921e1ce65e18eceb55e738f40591214f0/solr/core/src/test-files/solr/collection1/conf/schema-sim.xml
> > > >
> > > > so in my schema I have
> > > >
> > > > <similarity class="solr.SchemaSimilarityFactory"/>
> > > >
> > > > and a custom similarity in the fieldtype definition
> > > >
> > > > <fieldType name="..." positionIncrementGap="100">
> > > >   <similarity class="com.company.lbs.solr.search.similarity.NoTFSimilarity"/>
> > > >
> > > > ...
> > > >
> > > > but it is not working
> > > >
> > > > when I send a request with debugQuery=on, instead of [
> > > > NoTFSimilarity], I see []
> > > >
> > > > or to give an example, I have
> > > >
> > > >
> > > > weight(catchall:bretagn in 2575) []
> > > >
> > > > instead of weight(catchall:bretagn in 2575) [NoTFSimilarity]
> > > >
> > > > Anyone has a clue what I am doing wrong?
> > > >
> > > > Best regards,
> > > > Elisabeth
> > > >
> > >
> >
>


RE: Using Velocity with Child Documents?

2014-10-09 Thread Edwards, Joshua
Hey, Erick -

Thanks for the response.  Yes, I've played around with Velocity before, and 
I've been able to get some good results.  However, with how Solr stores (and 
returns) child documents, I don't know of a way to get a response that is 
similar to the initial JSON going in - with each parent document having the
child documents underneath.  I believe that I have to get the parent 
information, and then run another query for each record to get the child 
records.  I didn't know if someone had done something similar.

Thanks,
Josh Edwards

-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Wednesday, October 08, 2014 3:10 PM
To: solr-user@lucene.apache.org
Subject: Re: Using Velocity with Child Documents?

Velocity is just taking the Solr response and displaying selected bits in HTML.
So assuming the information you want is in the response packet (which you can
tell just by doing the query from the browser), it's "just" a matter of pulling
it out of the response and displaying it.

Mostly, when I started down this path, I poked around the velocity directory;
it was just a bit of hunting to figure things out, with some help from the
Apache Velocity page.

Not much help, but the short form is that there isn't much of an example that I
know of for your specific problem.

Erick

On Wed, Oct 8, 2014 at 8:54 AM, Edwards, Joshua  
wrote:
> Hi -
>
> I am trying to index a collection that has child documents.  I have 
> successfully loaded the data into my index using SolrJ, and I have verified 
> that I can search correctly using the "child of" method in my fq variable.  
> Now, I would like to use Velocity (Solritas) to display the parent records 
> with some details of the child records underneath.  Is there an easy way to 
> do this?  Is there an example somewhere that I can look at?
>
> Thanks,
> Josh Edwards




Solr Cloud has lower performance with more servers

2014-10-09 Thread Yannick
Hello good Solr people,

I have the following surprising situation. 

I created a group of 2 Solr servers with a load-balancer in front (Haproxy). I 
have a batch client that sends requests (read-only) continuously to the 
load-balancer. The problem is: the performance is slower with 2 servers than it 
is with a single server (still via the load-balancer, with the second server 
down, so it's not the load-balancer itself causing the slowdown). My batch 
execution time is about 5 minutes with a single server, and more than 6 minutes 
with two servers.

Both servers are VMs, hosted on two different physical computers, with no 
resource sharing. I tried throwing in a third server, the performance was even 
lower.

I'm trying to find ideas on what could cause this and/or what else I could try? 
My goal is to decrease the execution times of these batches (which may last for 
many hours), but this clearly seems to be going the wrong way.

Thanks in advance,

Yann

RE: per field similarity not working with solr 4.2.1

2014-10-09 Thread Markus Jelsma
Well, it is either the output of your calculation or writing something to 
System.out
Markus

 
 
-Original message-
> From:elisabeth benoit 
> Sent: Thursday 9th October 2014 13:31
> To: solr-user@lucene.apache.org
> Subject: Re: per field similarity not working with solr 4.2.1
> 
> Thanks for the information!
> 
> I've been struggling with that debug output. Any other way to know for sure
> my similarity class is being used?
> 
> Thanks again,
> Elisabeth
> 
> 2014-10-09 13:03 GMT+02:00 Markus Jelsma :
> 
> > Hi - it should work; not seeing your implementation in the debug output is
> > a known issue.
> >
> >
> > -Original message-
> > > From:elisabeth benoit 
> > > Sent: Thursday 9th October 2014 12:22
> > > To: solr-user@lucene.apache.org
> > > Subject: per field similarity not working with solr 4.2.1
> > >
> > > Hello,
> > >
> > > I am using Solr 4.2.1 and I've tried to use a per-field similarity, as
> > > described in
> > >
> > >
> > https://apache.googlesource.com/lucene-solr/+/c5bb5cd921e1ce65e18eceb55e738f40591214f0/solr/core/src/test-files/solr/collection1/conf/schema-sim.xml
> > >
> > > so in my schema I have
> > >
> > > <similarity class="solr.SchemaSimilarityFactory"/>
> > >
> > > and a custom similarity in the fieldtype definition
> > >
> > > <fieldType name="..." positionIncrementGap="100">
> > >   <similarity class="com.company.lbs.solr.search.similarity.NoTFSimilarity"/>
> > >
> > > ...
> > >
> > > but it is not working
> > >
> > > when I send a request with debugQuery=on, instead of [
> > > NoTFSimilarity], I see []
> > >
> > > or to give an example, I have
> > >
> > >
> > > weight(catchall:bretagn in 2575) []
> > >
> > > instead of weight(catchall:bretagn in 2575) [NoTFSimilarity]
> > >
> > > Anyone has a clue what I am doing wrong?
> > >
> > > Best regards,
> > > Elisabeth
> > >
> >
> 


Re: SolrCloud with client ssl

2014-10-09 Thread Sindre Fiskaa

This is the output from the overseer with severity set to INFO:

942420 [http-nio-443-exec-7] INFO
org.apache.solr.handler.admin.CollectionsHandler  ? Creating Collection :
numShards=3&createNodeSet=vt-searchln03:443_solr,vt-searchln04:443_solr,vt-
searchln01:443_solr,vt-searchln02:443_solr,vt-searchln05:443_solr,vt-search
ln06:443_solr&name=multisharding2&replicationFactor=2&action=CREATE
942472 [zkCallback-2-thread-3] INFO
org.apache.solr.common.cloud.ZkStateReader  ? A cluster state change:
WatchedEvent state:SyncConnected type:NodeDataChanged
path:/clusterstate.json, has occurred - updating... (live nodes size: 8)
942639 [zkCallback-2-thread-3] INFO
org.apache.solr.cloud.DistributedQueue  ? LatchChildWatcher fired on path:
/overseer/collection-queue-work/qnr-000290 state: SyncConnected type
NodeDataChanged
942644 [http-nio-443-exec-7] INFO
org.apache.solr.servlet.SolrDispatchFilter  ? [admin] webapp=null
path=/admin/collections
params={numShards=3&createNodeSet=vt-searchln03:443_solr,vt-searchln04:443_
solr,vt-searchln01:443_solr,vt-searchln02:443_solr,vt-searchln05:443_solr,v
t-searchln06:443_solr&name=multisharding2&replicationFactor=2&action=CREATE
} status=0 QTime=224




On 09.10.14 11:59, "Jan Høydahl"  wrote:

>We also have another bug here: the request responds with status=0,
>which means success, when only parts of the distributed request
>succeeded, but not all. That probably warrants its own JIRA issue.
>
>The logs you printed are from the client. Can you also dig up the
>corresponding logs from the Overseer node, we need to find what kind of
>IOException is happening and where.
>
>--
>Jan Høydahl, search solution architect
>Cominvent AS - www.cominvent.com
>
On 8 Oct 2014 at 16:08, Sindre Fiskaa wrote:
>
>> Yes, running SolrCloud without SSL it works fine with the createNodeSet
>> param. I run this with Tomcat application server and 443 enabled.
>> Although I receive this error message, the collection and the shards get
>> created and the clusterstate.json updated, but the cores are missing. I
>> manually add them one by one in the admin console so I get my cloud up and
>> running, and the solr-nodes are able to talk to each other - no certificate
>> issues or SSL handshake errors between the nodes.
>> 
>> curl -E solr-ssl.pem:secret12 -k
>> 'https://vt-searchln03:443/solr/admin/collections?action=CREATE&numShards=3&replicationFactor=2&name=multisharding&createNodeSet=vt-searchln03:443_solr,vt-searchln04:443_solr,vt-searchln01:443_solr,vt-searchln02:443_solr,vt-searchln05:443_solr,vt-searchln06:443_solr'
>> 
>> <?xml version="1.0" encoding="UTF-8"?>
>> <response>
>> <lst name="responseHeader"><int name="status">0</int><int name="QTime">206</int></lst>
>> <lst name="failure">
>> <str>org.apache.solr.client.solrj.SolrServerException:IOException occured when talking to server at: https://vt-searchln03:443/solr</str>
>> <str>org.apache.solr.client.solrj.SolrServerException:IOException occured when talking to server at: https://vt-searchln04:443/solr</str>
>> <str>org.apache.solr.client.solrj.SolrServerException:IOException occured when talking to server at: https://vt-searchln06:443/solr</str>
>> <str>org.apache.solr.client.solrj.SolrServerException:IOException occured when talking to server at: https://vt-searchln05:443/solr</str>
>> <str>org.apache.solr.client.solrj.SolrServerException:IOException occured when talking to server at: https://vt-searchln01:443/solr</str>
>> <str>org.apache.solr.client.solrj.SolrServerException:IOException occured when talking to server at: https://vt-searchln02:443/solr</str>
>> </lst>
>> </response>
>> 
>> 
>> 
>> -Sindre
>> 
>> On 08.10.14 15:14, "Jan Høydahl"  wrote:
>> 
>>> Hi,
>>> 
>>> I answered at https://issues.apache.org/jira/browse/SOLR-6595:
>>> 
>>> * Does it work with createNodeSet when using plain SolrCloud without
>>>SSL?
>>> * Please provide the exact CollectionApi request you used when it
>>>failed,
>>> so we can see if the syntax is correct. Also, is 443 your secure port
>>> number in Jetty/Tomcat?
>>> 
>>> ...but perhaps keep the conversation going here until it is a confirmed
>>> bug :)
>>> 
>>> --
>>> Jan Høydahl, search solution architect
>>> Cominvent AS - www.cominvent.com
>>> 
>>> On 7 Oct 2014 at 06:57, Sindre Fiskaa wrote:
>>> 
 Followed the description at
 https://cwiki.apache.org/confluence/display/solr/Enabling+SSL and
 generated a self-signed key pair. Configured a few solr-nodes and used
 the collection api to create a new collection. I get an error message when
 I specify the nodes with the createNodeSet param. When I don't use the
 createNodeSet param, the collection gets created without error on random
 nodes. Could this be a bug related to the createNodeSet param?
 
 
 
 <response>
 <lst name="responseHeader"><int name="status">0</int><int name="QTime">185</int></lst>
 <lst name="failure">
 <str>org.apache.solr.client.solrj.SolrServerException:IOException occured when talking to server at: https://vt-searchln04:443/solr</str>
 </lst>
 </response>

Re: Edismax parser and boosts

2014-10-09 Thread Pawel Rog
Hi,
Thank you for your response.
I checked it in Solr 4.8, but I think it has worked as I described for a very
long time. I'm not 100% sure whether it is really a bug or not. When I run a
phrase query like "foo^1.0 bar", it behaves very similarly to what happens in
edismax with the *pf* parameter set (the boost part is not removed).
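
A quick way to see this is to compare the parsed queries of the two forms with
debugQuery (host, collection name, and the pf field below are hypothetical -
adjust to your setup):

curl 'http://localhost:8983/solr/collection1/select?defType=edismax&pf=title&debugQuery=on&q=foo%5E1.0+AND+bar'
curl 'http://localhost:8983/solr/collection1/select?defType=edismax&pf=title&debugQuery=on&q=foo+AND+bar'

With pf set, the parsedquery of the first form contains the extra phrase
clause with the literal boost token ("foo 1.0 bar"); the second does not.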

--
Paweł Róg

On Thu, Oct 9, 2014 at 12:07 AM, Jack Krupansky 
wrote:

> Definitely sounds like a bug! File a Jira. Thanks for reporting this. What
> release of Solr?
>
>
>
> -- Jack Krupansky
> -Original Message- From: Pawel Rog
> Sent: Wednesday, October 8, 2014 3:57 PM
> To: solr-user@lucene.apache.org
> Subject: Edismax parser and boosts
>
>
> Hi,
> I use edismax query with q parameter set as below:
>
> q=foo^1.0+AND+bar
>
> For such a query I see a different (lower) score for the same document
> than for
>
> q=foo+AND+bar
>
> By default the boost of a term is 1 as far as I know, so why does the
> scoring differ?
>
> When I check the debugQuery output, the parsedQuery for "foo^1.0+AND+bar"
> contains a Boolean query, one of whose clauses is a phrase query "foo 1.0
> bar". It seems that the edismax parser takes the whole q parameter as a
> phrase, without removing the boost value, and adds it as a boolean clause.
> Is this a bug, or should it work like that?
>
> --
> Paweł Róg
>


Re: per field similarity not working with solr 4.2.1

2014-10-09 Thread elisabeth benoit
Thanks for the information!

I've been struggling with that debug output. Any other way to know for sure
my similarity class is being used?

Thanks again,
Elisabeth

2014-10-09 13:03 GMT+02:00 Markus Jelsma :

> Hi - it should work; not seeing your implementation in the debug output is
> a known issue.
>
>
> -Original message-
> > From:elisabeth benoit 
> > Sent: Thursday 9th October 2014 12:22
> > To: solr-user@lucene.apache.org
> > Subject: per field similarity not working with solr 4.2.1
> >
> > Hello,
> >
> > I am using Solr 4.2.1 and I've tried to use a per field similarity, as
> > described in
> >
> >
> https://apache.googlesource.com/lucene-solr/+/c5bb5cd921e1ce65e18eceb55e738f40591214f0/solr/core/src/test-files/solr/collection1/conf/schema-sim.xml
> >
> > so in my schema I have
> >
> > <similarity class="solr.SchemaSimilarityFactory"/>
> >
> > and a custom similarity in fieldtype definition
> >
> > <fieldtype name="..." class="solr.TextField" positionIncrementGap="100">
> >   <similarity class="com.company.lbs.solr.search.similarity.NoTFSimilarity"/>
> >
> > ...
> >
> > but it is not working
> >
> > when I send a request with debugQuery=on, instead of [
> > NoTFSimilarity], I see []
> >
> > or to give an example, I have
> >
> >
> > weight(catchall:bretagn in 2575) []
> >
> > instead of weight(catchall:bretagn in 2575) [NoTFSimilarity]
> >
> > Anyone has a clue what I am doing wrong?
> >
> > Best regards,
> > Elisabeth
> >
>


RE: per field similarity not working with solr 4.2.1

2014-10-09 Thread Markus Jelsma
Hi - it should work; not seeing your implementation in the debug output is a
known issue.
 
 
-Original message-
> From:elisabeth benoit 
> Sent: Thursday 9th October 2014 12:22
> To: solr-user@lucene.apache.org
> Subject: per field similarity not working with solr 4.2.1
> 
> Hello,
> 
> I am using Solr 4.2.1 and I've tried to use a per field similarity, as
> described in
> 
> https://apache.googlesource.com/lucene-solr/+/c5bb5cd921e1ce65e18eceb55e738f40591214f0/solr/core/src/test-files/solr/collection1/conf/schema-sim.xml
> 
> so in my schema I have
> 
> <similarity class="solr.SchemaSimilarityFactory"/>
> 
> and a custom similarity in fieldtype definition
> 
> <fieldtype name="..." class="solr.TextField" positionIncrementGap="100">
>   <similarity class="com.company.lbs.solr.search.similarity.NoTFSimilarity"/>
>
> ...
> 
> but it is not working
> 
> when I send a request with debugQuery=on, instead of [
> NoTFSimilarity], I see []
> 
> or to give an example, I have
> 
> 
> weight(catchall:bretagn in 2575) []
> 
> instead of weight(catchall:bretagn in 2575) [NoTFSimilarity]
> 
> Anyone has a clue what I am doing wrong?
> 
> Best regards,
> Elisabeth
> 


RE: does one need to reindex when changing similarity class

2014-10-09 Thread Markus Jelsma
Hi - no, you don't have to, although you might if you changed how norms are
encoded.
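
For contrast, a change that would call for reindexing is anything baked into
the norms at index time - a hypothetical sketch against the Lucene 4.x API:

import org.apache.lucene.index.FieldInvertState;
import org.apache.lucene.search.similarities.DefaultSimilarity;

public class NoLengthNormSimilarity extends DefaultSimilarity {
    // lengthNorm() runs while indexing and its result is encoded into the
    // norm of each field, so documents indexed before a change here keep
    // their old values until they are reindexed.
    @Override
    public float lengthNorm(FieldInvertState state) {
        return state.getBoost(); // drop the 1/sqrt(numTerms) length factor
    }
}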
Markus

 
 
-Original message-
> From:elisabeth benoit 
> Sent: Thursday 9th October 2014 12:26
> To: solr-user@lucene.apache.org
> Subject: does one need to reindex when changing similarity class
> 
> I've read somewhere that we do have to reindex when changing similarity
> class. Is that right?
> 
> Thanks again,
> Elisabeth
> 


does one need to reindex when changing similarity class

2014-10-09 Thread elisabeth benoit
I've read somewhere that we do have to reindex when changing similarity
class. Is that right?

Thanks again,
Elisabeth


per field similarity not working with solr 4.2.1

2014-10-09 Thread elisabeth benoit
Hello,

I am using Solr 4.2.1 and I've tried to use a per field similarity, as
described in

https://apache.googlesource.com/lucene-solr/+/c5bb5cd921e1ce65e18eceb55e738f40591214f0/solr/core/src/test-files/solr/collection1/conf/schema-sim.xml

so in my schema I have

<similarity class="solr.SchemaSimilarityFactory"/>

and a custom similarity in fieldtype definition

<fieldtype name="..." class="solr.TextField" positionIncrementGap="100">
  <similarity class="com.company.lbs.solr.search.similarity.NoTFSimilarity"/>
...

but it is not working

when I send a request with debugQuery=on, instead of [
NoTFSimilarity], I see []

or to give an example, I have


weight(catchall:bretagn in 2575) []

instead of weight(catchall:bretagn in 2575) [NoTFSimilarity]

Anyone has a clue what I am doing wrong?

Best regards,
Elisabeth
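
For reference, since it is easy to miss: in Solr 4.x the per-fieldtype
similarity above is only honored when the global similarity in schema.xml is
the SchemaSimilarityFactory, as in the linked schema-sim.xml - worth
double-checking in a setup like this one:

<!-- Global similarity in schema.xml; without this factory, per-fieldtype
     <similarity> elements are ignored and the default is used everywhere. -->
<similarity class="solr.SchemaSimilarityFactory"/>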


Re: SolrCloud with client ssl

2014-10-09 Thread Jan Høydahl
We also have another bug here: the request responds with status=0, which means
success, even though only part of the distributed request succeeded. That
probably warrants its own JIRA issue.

The logs you printed are from the client. Can you also dig up the corresponding 
logs from the Overseer node, we need to find what kind of IOException is 
happening and where.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 8 Oct 2014 at 16:08, Sindre Fiskaa wrote:

> Yes, running SolrCloud without SSL it works fine with the createNodeSet
> param. I run this with Tomcat application server and 443 enabled.
> Although I receive this error message, the collection and the shards get
> created and the clusterstate.json updated, but the cores are missing. I
> manually add them one by one in the admin console so I get my cloud up and
> running, and the solr-nodes are able to talk to each other - no certificate
> issues or SSL handshake errors between the nodes.
> 
> curl -E solr-ssl.pem:secret12 -k
> 'https://vt-searchln03:443/solr/admin/collections?action=CREATE&numShards=3&replicationFactor=2&name=multisharding&createNodeSet=vt-searchln03:443_solr,vt-searchln04:443_solr,vt-searchln01:443_solr,vt-searchln02:443_solr,vt-searchln05:443_solr,vt-searchln06:443_solr'
> 
> <?xml version="1.0" encoding="UTF-8"?>
> <response>
> <lst name="responseHeader"><int name="status">0</int><int name="QTime">206</int></lst>
> <lst name="failure">
> <str>org.apache.solr.client.solrj.SolrServerException:IOException occured when talking to server at: https://vt-searchln03:443/solr</str>
> <str>org.apache.solr.client.solrj.SolrServerException:IOException occured when talking to server at: https://vt-searchln04:443/solr</str>
> <str>org.apache.solr.client.solrj.SolrServerException:IOException occured when talking to server at: https://vt-searchln06:443/solr</str>
> <str>org.apache.solr.client.solrj.SolrServerException:IOException occured when talking to server at: https://vt-searchln05:443/solr</str>
> <str>org.apache.solr.client.solrj.SolrServerException:IOException occured when talking to server at: https://vt-searchln01:443/solr</str>
> <str>org.apache.solr.client.solrj.SolrServerException:IOException occured when talking to server at: https://vt-searchln02:443/solr</str>
> </lst>
> </response>
> 
> 
> 
> -Sindre
> 
> On 08.10.14 15:14, "Jan Høydahl"  wrote:
> 
>> Hi,
>> 
>> I answered at https://issues.apache.org/jira/browse/SOLR-6595:
>> 
>> * Does it work with createNodeSet when using plain SolrCloud without SSL?
>> * Please provide the exact CollectionApi request you used when it failed,
>> so we can see if the syntax is correct. Also, is 443 your secure port
>> number in Jetty/Tomcat?
>> 
>> ...but perhaps keep the conversation going here until it is a confirmed
>> bug :)
>> 
>> --
>> Jan Høydahl, search solution architect
>> Cominvent AS - www.cominvent.com
>> 
>> On 7 Oct 2014 at 06:57, Sindre Fiskaa wrote:
>> 
>>> Followed the description at
>>> https://cwiki.apache.org/confluence/display/solr/Enabling+SSL and
>>> generated a self-signed key pair. Configured a few solr-nodes and used
>>> the collection api to create a new collection. I get an error message when
>>> I specify the nodes with the createNodeSet param. When I don't use the
>>> createNodeSet param, the collection gets created without error on random
>>> nodes. Could this be a bug related to the createNodeSet param?
>>> 
>>> 
>>> 
>>> <response>
>>> <lst name="responseHeader"><int name="status">0</int><int name="QTime">185</int></lst>
>>> <lst name="failure">
>>> <str>org.apache.solr.client.solrj.SolrServerException:IOException occured when talking to server at: https://vt-searchln04:443/solr</str>
>>> </lst>
>>> </response>
>>> 
>> 
> 



Re: [ANNOUNCE] Luke 4.10.1 released

2014-10-09 Thread Bernd Fehling
Thanks for keeping this up to date.

When starting luke-4.10.1.jar I get:
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder"

May I suggest adding that class directly to luke as well?


And another one: I get a warning that log4j.properties should not
use "org.apache.hadoop.metrics.jvm.EventCounter" any more.
Instead it should use "org.apache.hadoop.log.metrics.EventCounter".
After changing this in log4j.properties the warning is gone
and everything runs fine.
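
For anyone hitting the same warning, the fix is a one-line change in
log4j.properties (the property key below is the usual Hadoop-style appender
entry and is assumed to match the one luke ships with):

# old entry, triggers the deprecation warning:
#log4j.appender.EventCounter=org.apache.hadoop.metrics.jvm.EventCounter
# replacement:
log4j.appender.EventCounter=org.apache.hadoop.log.metrics.EventCounter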

Again, thanks for your work.

Regards,
Bernd


On 07.10.2014 at 21:51, Dmitry Kan wrote:
> Hello,
> 
> Luke 4.10.1 has been released. Download it here:
> 
> https://github.com/DmitryKey/luke/releases/tag/luke-4.10.1
> 
> The release has been tested against the solr-4.10.1 based index.
> 
> Changes:
> https://github.com/DmitryKey/luke/issues/5
> https://github.com/DmitryKey/luke/issues/6
> 
> Remember to pass the following JVM parameter when starting luke:
> 
> java -XX:MaxPermSize=512m -jar luke-with-deps.jar
> 
> or alternatively, use luke.bat or luke.sh to launch luke from the command
> line.
> 
> Enjoy,
> 
> Dmitry Kan
> 


Re: Advise on an architecture with lot of cores

2014-10-09 Thread Aditya
Hi Manoj

There are advantages in both approaches. I recently read an article,
http://lucidworks.com/blog/podcast-solr-at-scale-at-aol/ - AOL uses Solr
and it uses one core per user.

Having one core per customer helps you:
1. Easily migrate / back up the index
2. Load a core only when required: when a user signs in, load their index;
otherwise you don't need to keep their data in memory (see the solr.xml
sketch below)
3. Rebuilding data for a particular user is easier

Cons:
1. If most users are actively signing in and you need to load most of the
cores all the time, searches will be slower.
2. Each core keeps its own set of files open, and you can end up with a "too
many open files" exception. (We faced this scenario.)
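
One way to get the load-on-demand behavior from point 2 is Solr 4.x's
transient core support - a hypothetical sketch in the legacy solr.xml format,
with illustrative names:

<solr persistent="true">
  <!-- Keep at most 50 transient cores loaded; the least recently used
       cores are closed automatically when the limit is exceeded. -->
  <cores adminPath="/admin/cores" transientCacheSize="50">
    <core name="customerA" instanceDir="customerA"
          loadOnStartup="false" transient="true"/>
  </cores>
</solr>

This also bounds the number of open index files, which helps with the "too
many open files" issue from con 2.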


Having a single core for all:
1. This reduces the headache of user-specific handling; you treat the DB /
index as a black box that you can query for everything
2. When the load grows, shard it

Cons:
1. Rebuilding the index will take more time

Regards
Aditya
www.findbestopensource.com






On Tue, Oct 7, 2014 at 8:01 PM, Manoj Bharadwaj 
wrote:

> Hi Toke,
>
> I don't think I answered your question properly.
>
> With the current 1 core/customer setup many cores are idle. The redesign we
> are working on will move most of our searches to being driven by SOLR vs
> database (current split is 90% database, 10% solr). With that change, all
> cores will see traffic.
>
> We have 25G data in the index (across all cores) and they are currently in
> a 2 core VM with 32G memory. We are making some changes to the schema and
> the analyzers and we see the index size growing by 25% or so due to this.
> And to support this we will be moving to a VM with 4 cores and 64G memory.
> Hardware as such isn't a constraint.
>
> Regards
> Manoj
>
> On Tue, Oct 7, 2014 at 8:47 AM, Toke Eskildsen 
> wrote:
>
> > On Tue, 2014-10-07 at 14:27 +0200, Manoj Bharadwaj wrote:
> > > My team inherited a SOLR setup with an architecture that has a core for
> > > every customer. We have a few different types of cores, say "A", "B", "C",
> > > and for each one of these there is a core per customer - namely "A1",
> > > "A2"..., "B1", "B2"... Overall we have over 600 cores. We don't know
> the
> > > history behind the current design - the exact reasons why it was done
> the
> > > way it was done - one probable consideration was to ensure a customer
> > data
> > > separate from other.
> >
> > It is not a bad reason. It ensures that ranked search is optimized
> > towards each customer's data and makes it easy to manage adding and
> > removing customers.
> >
> > > We want to go to a single core per type architecture, and move on to
> > > SolrCloud as well in the near future, to achieve sharding via the
> > > features the cloud mode provides.
> >
> > If the setup is heavily queried on most of the cores or there are
> > core-spanning searches, collapsing the user-specific cores into fewer
> > super-cores might lower hardware requirements a bit. On the other hand,
> > if most of the cores are idle most of the time, the 1 core/customer
> > setup would give better utilization of the hardware.
> >
> > Why do you want to collapse the cores?
> >
> > - Toke Eskildsen, State and University Library, Denmark
> >
> >
> >
>