Re: Dismax Question

2012-07-05 Thread Steve Fatula
It turns out that Solr 3.5.0 does not have the dismax issue, so we have 
reverted. Hopefully, the bug will be fixed.

Regression of JIRA 1826?

2012-07-05 Thread Jamie Johnson
I just upgraded to trunk to try to fix an issue I was having with the
highlighter described in JIRA 1826, but it appears that this issue
still exists on trunk.  I'm running the following query

subject:ztest*

subject is a text field (not multivalued) and the return in highlighting is

ZTestForZTestForJamie

the actual stored value is "ZTestForJamie".  Is anyone else experiencing this?


Re: Solr facet multiple constraint

2012-07-05 Thread davidbougearel
Well, thanks for your answer. In fact I've written out what the QueryResponse
returns as the Solr query; here is my actual Solr query before calling
executeQuery:

q=service%3A1+AND+publicationstatus%3ALIVE&sort=publishingdate+desc&fq=%7B%21ex%3Ddt%7D%28%28%28user%3A10%29%29%29&facet.field=%7B%21tag%3Ddt%7Duser&facet=true&facet.mincount=1

which is the same as my first post, but without the 'wt=javabin' and with '&'
instead of commas.

Could you please check whether anything looks wrong to you?

Best regards, 

David.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-facet-multiple-constraint-tp3992974p3993408.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: ExtendedDisMax Field Alias Question

2012-07-05 Thread Jamie Johnson
It's been some time since I've thought about this, but I wanted anyone
interested to know I created the following JIRA asking for this
feature.

https://issues.apache.org/jira/browse/SOLR-3598

On Sat, May 26, 2012 at 9:28 PM, Jamie Johnson  wrote:
> Yeah cycles in general I agree are bad, but perhaps an option to also
> include the original field or special handling of the aliased field to
> support this.
>
> On Sat, May 26, 2012 at 3:26 PM, Jack Krupansky  
> wrote:
>> That would create an alias "loop", which is not supported.
>>
>> For example,
>>
>> http://localhost:8983/solr/select/?debugQuery=true&defType=edismax&f.person_first_name.qf=genre_s&f.person_last_name.qf=id&f.name.qf=name+person_first_name+person_last_name&q=name:smith
>>
>> in Solr 3.6 generates a 400 response status code with this exception:
>>
>> org.apache.lucene.queryParser.ParseException: Cannot parse 'name:smith ':
>> Field aliases lead to a cycle
>>
>> Maybe what you would like is an enhancement to permit an explicit
>> reference to the underlying field rather than the alias in an alias
>> definition, like:
>>
>> &f.name.qf=field.name+person_first_name+person_last_name
>>
>> -- Jack Krupansky
>>
>> -Original Message- From: Jamie Johnson
>> Sent: Friday, May 25, 2012 8:37 PM
>> To: solr-user@lucene.apache.org
>> Subject: ExtendedDisMax Field Alias Question
>>
>>
>> I was wondering if someone could explain if the following is supported
>> with the current EDisMax Field Aliasing.
>>
>> I have a field like person_name which exists in Solr; we also have two
>> other fields named person_first_name and person_last_name.  I would
>> like to allow queries for person_name to be aliased as person_name,
>> person_first_name and person_last_name.  Is this allowed, or does the
>> alias need to not appear in the list of fields being aliased to? (I
>> remember seeing something about aliases to other aliases being allowed.)
>> I could obviously create a purely virtual field which aliases all 3
>> but it would be nice if the parser could support this case.
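For anyone landing on this thread, a hedged sketch of the "purely virtual field" workaround Jamie mentions (the field name person_name_search is hypothetical, not from the original schema):

```text
defType=edismax
&f.person_name_search.qf=person_name+person_first_name+person_last_name
&q=person_name_search:smith
```

Because person_name_search is not itself one of the fields it aliases, no cycle is created, so this should avoid the "Field aliases lead to a cycle" error shown above.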


Re: highlighting and wildcards

2012-07-05 Thread Jamie Johnson
Looks like this issue has already been fixed, sorry for the static.  I
will update and try again.

On Thu, Jul 5, 2012 at 5:52 PM, Jamie Johnson  wrote:
> I am executing a query on a multivalued text field that has the value
> of "ZtestForJamie".  When I execute this query
>
> http://localhost:8501/solr/select?q=ztest*&fl=subject&hl=true
>
> the hit that comes back is as follows
>
> ZtestForZtestForJamie
>
> I am running a solr 4.0 snapshot from 5/3/2012.


Re: deleteById commitWithin question

2012-07-05 Thread Jamie Johnson
Thanks Yonik, glad that it is not an issue on trunk.  I'll see if our
group is interested in updating now to take advantage of this.  In the
meantime we just issue a commit after we do deletes, and that seems to
be a sufficient workaround.

On Thu, Jul 5, 2012 at 6:05 PM, Yonik Seeley  wrote:
> On Thu, Jul 5, 2012 at 5:12 PM, Jamie Johnson  wrote:
>> Ok, so some more context, hopefully this is useful.
>>
>> I didn't think this was a SolrCloud issue but it appears to be.  I
>> have a simple 2 shard set up, I add 1 document which goes to shard 1.
>> I then issue a delete to shard 2.  The delete gets there and I see the
>> commit eventually being run in the log, but it doesn't appear to be
>> being distributed to shard 1.  If I manually call a server.commit()
>> the item is removed.
>
> OK, I just verified this for trunk / 4x.  It appears to work fine now,
> but I'm not exactly sure why it may not with the snapshot you have
> (much has changed  since).
>
> I followed the solr cloud wiki except that the path now needs to have
> collection1 in it, and I set the number of shards to 1 (with 2
> replicas).   I indexed the example docs, then queried both shards to
> verify the doc existed on both.  Then I deleted the doc through the
> non-leader via:
> curl "http://localhost:7574/solr/update?commitWithin=6" -H
> 'Content-type:application/json' -d '{"delete": { "id":"SOLR1000"}}'
>
> This was to test both the forwarding to the leader and the forwarding
> from the leader to the replica.
> After 60 seconds, the document was gone on both replicas.
>
> aside: If you use deleteByQuery, be aware that full distributed
> indexing support was just finished yesterday by SOLR-3559
>
> -Yonik
> http://lucidimagination.com
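Until the fix is picked up, the workaround Jamie describes (an explicit commit after deletes) can also be issued over HTTP; a hedged sketch against a local instance (port and document id are illustrative only):

```text
curl "http://localhost:8983/solr/update?commit=true" -H \
  'Content-type:application/json' -d '{"delete": {"id":"SOLR1000"}}'
```

This forces an immediate commit in the same request as the delete, rather than relying on commitWithin being honored.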


Re: Filtering a query by range returning unexpected results

2012-07-05 Thread Erick Erickson
For future reference, JIRA is here:
https://issues.apache.org/jira/browse/SOLR-3595

Erick

On Thu, Jul 5, 2012 at 4:19 PM, Andrew Meredith  wrote:
> Thanks! That worked. I re-built the index with my "prices" field being of
> the tfloat type, and I am now able to perform range queries. I appreciate
> your help, Erick.
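For anyone wanting to replicate the fix, a hedged sketch of the schema change (attribute values are guesses at Andrew's actual schema, not confirmed by the thread):

```xml
<fieldType name="tfloat" class="solr.TrieFloatField" precisionStep="8"
           positionIncrementGap="0"/>
<field name="prices" type="tfloat" indexed="true" stored="true"
       multiValued="true"/>
```

With a plain tfloat field, a range filter like prices:[5.00 TO 21.00] matches when *any* stored value falls in the range, which is the behavior Andrew wanted.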
>
> On Tue, Jul 3, 2012 at 9:35 AM, Erick Erickson wrote:
>
>> OK, this appears to be something with the "currency" type. It works
>> fine for regular float fields. I can't get the multiValued currency
>> types to work with range queries. Don't quite know what I was doing
>> when I thought they _did_ work.
>>
>> One work-around I think, if you are using a single currency USD might be
>> to copy your price to a simple float field and do your range queries on
>> that.
>>
>> I'm not at all sure that the currency type was ever intended to support
>> multiValued="true". I don't know enough about the internals to know if
>> it's even a good idea to try, but the current behavior could be improved
>> upon.
>>
>> But it seems to me that one of two things should happen:
>> 1> the startup should barf if a currency type is multiValued (fail early)
>> or
>> 2> currency should work when multiValued.
>>
>> Unfortunately, JIRA is down so I can't look to see if this is already a
>> known
>> issue or enter a JIRA if it isn't. I'll try to look later if it all
>> comes back up.
>>
>> Best
>> Erick
>>
>> On Mon, Jul 2, 2012 at 1:53 PM, Andrew Meredith 
>> wrote:
>> > Yep, that 15.00.00 was a typo.
>> >
>> > Here are the relevant portions of my schema.xml:
>> >
>> > 
>> > 
>> > > precisionStep="8"
>> > defaultCurrency="USD" currencyConfig="currency.xml" />
>> > 
>> > 
>> >
>> > 
>> > 
>> > > > multiValued="true" />
>> > 
>> > 
>> >
>> > And here is the output of a sample query with &debugQuery=on appended:
>> >
>> > 
>> > Furtick
>> > Furtick
>> > 
>> > +DisjunctionMaxQuery((subtitle:furtick | frontlist_flapcopy:furtick^0.5 |
>> > frontlist_ean:furtick^6.0 | author:furtick^3.0 | series:furtick^1.5 |
>> > title:furtick^2.0)) ()
>> > 
>> > 
>> > +(subtitle:furtick | frontlist_flapcopy:furtick^0.5 |
>> > frontlist_ean:furtick^6.0 | author:furtick^3.0 | series:furtick^1.5 |
>> > title:furtick^2.0) ()
>> > 
>> > 
>> > ExtendedDismaxQParser
>> > 
>> > 
>> > 
>> > prices:[5.00 TO 21.00]
>> > forsaleinusa:true
>> > 
>> > 
>> > 
>> > ConstantScore(frange(currency(prices)):[500 TO 2100])
>> > 
>> > forsaleinusa:true
>> > 
>> > 
>> > 3.0
>> > 
>> > 2.0
>> > 
>> > 2.0
>> > 
>> > 
>> > 0.0
>> > 
>> > 
>> > 0.0
>> > 
>> > 
>> > 0.0
>> > 
>> > 
>> > 0.0
>> > 
>> > 
>> > 0.0
>> > 
>> > 
>> > 0.0
>> > 
>> > 
>> > 
>> > 1.0
>> > 
>> > 1.0
>> > 
>> > 
>> > 0.0
>> > 
>> > 
>> > 0.0
>> > 
>> > 
>> > 0.0
>> > 
>> > 
>> > 0.0
>> > 
>> > 
>> > 0.0
>> > 
>> > 
>> > 0.0
>> > 
>> > 
>> > 
>> > 
>> >
>> >
>> > If I run this same query with the filter, prices:[5.00 TO 99.00], then I
>> > get a result that includes the following field:
>> >
>> > 
>> > 12.99,USD
>> > 14.99,USD
>> > 15.00,USD
>> > 25.00,USD
>> > 
>> >
>> >
>> > I can't figure out why this is not being returned with the first query.
>> > I'll try re-building the index with the "prices" field type set to float
>> > and see if that changes the behaviour.
>> >
>> > On Sat, Jun 30, 2012 at 6:49 PM, Erick Erickson wrote:
>> >
>> >> This works fine for me with 3.6, float fields and even on a currency
>> type.
>> >>
>> >> I'm assuming a typo for 15.00.00 BTW.
>> >>
>> >> I admit I'm not all that familiar with the "currency" type, which I
>> infer
>> >> you're
>> >> using given the "USD" bits. But I ran a quick test with currency types
>> and
>> >> it worked at least the way I ran it... But another quick look shows that
>> >> some interesting things are being done with the "currency" type, so who
>> >> knows?
>> >>
>> >> So, let's see your relevant schema bits, and the results of your query
>> >> when you attach &debugQuery=on to it.
>> >>
>> >>
>> >> Best
>> >> Erick
>> >>
>> >> On Fri, Jun 29, 2012 at 2:43 PM, Andrew Meredith
>> >> <andymered...@gmail.com> wrote:
>> >> > First off, I have to say that I am working on my first project that
>> >> > has required me to work with Solr, so my question may be very
>> >> > elementary - I just could not find an answer elsewhere.
>> >> >
>> >> > I am trying to add a ranged query filter that returns all items in a
>> >> > given "prices" range. In my situation, each item can have multiple
>> >> > prices, so it is a multivalued field. When I search a range, say,
>> >> > prices:[15.00.00 TO 21.00], I want Solr to return all items that have
>> >> > *any* price in that range, rather than returning results where *all*
>> >> > prices are in the range. For example, if I have an item with the
>> >> > following prices, it will not be returned:
>> >> >   
>> >> >   19.99,USD
>> >> >   22.50,USD
>> >> >   
>> >> >
>> >> > Is there any way to change the behaviour of Solr so that it will match
>> >> > documents in which any value of a multivalued field matches a ranged
>> >> > query filter?

Re: deleteById commitWithin question

2012-07-05 Thread Yonik Seeley
On Thu, Jul 5, 2012 at 5:12 PM, Jamie Johnson  wrote:
> Ok, so some more context, hopefully this is useful.
>
> I didn't think this was a SolrCloud issue but it appears to be.  I
> have a simple 2 shard set up, I add 1 document which goes to shard 1.
> I then issue a delete to shard 2.  The delete gets there and I see the
> commit eventually being run in the log, but it doesn't appear to be
> being distributed to shard 1.  If I manually call a server.commit()
> the item is removed.

OK, I just verified this for trunk / 4x.  It appears to work fine now,
but I'm not exactly sure why it may not with the snapshot you have
(much has changed  since).

I followed the solr cloud wiki except that the path now needs to have
collection1 in it, and I set the number of shards to 1 (with 2
replicas).   I indexed the example docs, then queried both shards to
verify the doc existed on both.  Then I deleted the doc through the
non-leader via:
curl "http://localhost:7574/solr/update?commitWithin=6" -H
'Content-type:application/json' -d '{"delete": { "id":"SOLR1000"}}'

This was to test both the forwarding to the leader and the forwarding
from the leader to the replica.
After 60 seconds, the document was gone on both replicas.

aside: If you use deleteByQuery, be aware that full distributed
indexing support was just finished yesterday by SOLR-3559

-Yonik
http://lucidimagination.com


Re: Solr 4.0 UI issue

2012-07-05 Thread Stefan Matheis
Hi, we are working on https://issues.apache.org/jira/browse/SOLR-3591 - so you 
may want to check the (raw) log output to see what is causing the UI to show this message.
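For what it's worth, the 4.0 admin UI expects the admin request handlers to be registered in solrconfig.xml; a sketch of the registration the error message most likely refers to (note that a handler of this class does appear further down in Tom's posted config):

```xml
<requestHandler name="/admin/" class="solr.admin.AdminHandlers" />
```

If the handler is present but the message still appears, the raw log output mentioned above should show what is actually failing at startup.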



On Thursday, July 5, 2012 at 7:19 PM, anarchos78 wrote:

> Greetings friends,
> I just discovered today that there is a new Solr release (4.0 ALPHA). So, I
> gave it a try. After setting it up (under Tomcat) I had the following error
> message:  
> “This interface requires that you activate the admin request handlers, add
> the following configuration to your solrconfig.xml:  
>  
> ”
>  
> The above error popped up when I added the following in the solrconfig.xml:
>  
>  class="org.apache.solr.handler.dataimport.DataImportHandler">
> 
> data-config.xml
> 
> 
>  
> Does anybody know what is wrong?
>  
> Thank you in advance,
>  
> Tom
> Greece
>  
> *The solr folder structure:*
> +solr
> +conf
> +data
> +lib
> -contrib
> -dist
>  
> *The solr.xml file:*
> 
> 
> 
> 
> 
> 
>  
> *The solrconfig.xml file:*
> 
> 
> LUCENE_40
>  
> 
> 
>  
> 
> 
>  
> 
>  
>  
> 
> 
>  
> 
> 
>  
>  regex="apache-solr-dataimporthandler-extras-\d.*\.jar" />
>  
> 
> 
>   
>  
> ${solr.data.dir:}
>  
>   
> class="${solr.directoryFactory:solr.StandardDirectoryFactory}"/>
>  
> 
>  
> 
> 
>  
> 
>  
>   
> 15000  
> false  
> 
>  
> 
> ${solr.data.dir:}
> 
>  
>  
> 
>  
> 
>  
> 1024
>  
>  size="512"
> initialSize="512"
> autowarmCount="0"/>
>  
>  size="512"
> initialSize="512"
> autowarmCount="0"/>
>  
>  size="512"
> initialSize="512"
> autowarmCount="0"/>
>  
> true
>  
> 20
>  
> 200
>  
> 
> 
>  
> 
> 
> 
> 
> 
> static firstSearcher warming in solrconfig.xml
> 
> 
> 
>  
> false
>  
> 2
>  
> 
>  
> 
>  
>  multipartUploadLimitInKB="2048000" />
>  
> 
>  
> 
>  
>  class="org.apache.solr.handler.dataimport.DataImportHandler">
> 
> data-config.xml
> 
> 
>  
> 
>  
> 
> explicit
> 10
> text
> 
>  
> 
>  
> 
> 
> explicit
> json
> true
> text
> 
> 
>  
> 
> 
> true
> 
> 
>  
> 
> 
> explicit
>  
> velocity
> browse
> layout
> Solritas
>  
> edismax
> 
> text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
> 
> 100%
> *:*
> 10
> *,score
>  
> 
> text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
> 
> text,features,name,sku,id,manu,cat
> 3
>  
> on
> cat
> manu_exact
> ipod
> GB
> 1
> cat,inStock
> after
> price
> 0
> 600
> 50
> popularity
> 0
> 10
> 3
> manufacturedate_dt
>  name="f.manufacturedate_dt.facet.range.start">NOW/YEAR-10YEARS
> NOW
> +1YEAR
> before
> after
>  
> on
> text features name
> 0
> name
>  
> on
> false  
> 5
> 2
> 5  
> true
> true  
> 5
> 3  
> 
>  
> 
> spellcheck
> 
> 
>  
> 
>  
> 
>  
>  startup="lazy"
> class="solr.extraction.ExtractingRequestHandler" >
> 
>  
> text
> true
> ignored_
>  
> true
> links
> ignored_
> 
> 
>  
>  startup="lazy"
> class="solr.FieldAnalysisRequestHandler" />
>  
>  class="solr.DocumentAnalysisRequestHandler"  
> startup="lazy" />
>  
>  class="solr.admin.AdminHandlers" />
>  
> 
> 
> solrpingquery
> 
> 
> all
> 
> 
>  
> 
> 
> explicit  
> true
> 
> 
>  
>  startup="lazy" />  
>  
> 
>  
> textSpell
>  
> 
> default
> name
> solr.DirectSolrSpellChecker
>  
> internal
>  
> 0.5
>  
> 2
>  
> 1
>  
> 5
>  
> 4
>  
> 0.01
>  
> 
>  
> 
> wordbreak
> solr.WordBreakSolrSpellChecker  
> name
> true
> true
> 10
> 
>  
> 
>  
> 
> 
> text
>  
> default
> wordbreak
> on
> true  
> 10
> 5
> 5  
> true
> true  
> 10
> 5  
> 
> 
> spellcheck
> 
> 
>  
> 
>  
> 
> 
> text
> true
> 
> 
> tvComponent
> 
> 
>  
>  enable="${solr.clustering.enabled:false}"
> class="solr.clustering.ClusteringComponent" >
>  
> 
>  
> default
>  
>  name="carrot.algorithm">org.carrot2.clustering.lingo.LingoClusteringAlgorithm
>  
> 20
>  
> clustering/carrot2
>  
> ENGLISH
> 
> 
> stc
>  name="carrot.algorithm">org.carrot2.clustering.stc.STCClusteringAlgorithm
> 
> 
>  
>  startup="lazy"
> enable="${solr.clustering.enabled:false}"
> class="solr.SearchHandler">
> 
> true
> default
> true
>  
> name
> id
>  
> features
>  
> true
>  
>  
>  
> false
>  
> edismax
> 
> text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
> 
> *:*
> 10
> *,score
>   
> 
> clustering
> 
> 
>  
> 
>  
> 
> 
> true
>   
> 
> terms
> 
> 
>  
> 
>  
> string
> elevate.xml
> 
>  
> 
> 
> explicit
> text
> 
> 
> elevator
> 
> 
>  
> 
> 
>  
>  
>  default="true"
> class="solr.highlight.GapFragmenter">
> 
> 100
> 
> 
>  
>  class="solr.highlight.RegexFragmenter">
> 
>  
> 70
>  
> 0.5
>  
> [-\w ,/\n\"']{20,200}
> 
> 
>  
>  default="true"
> class="solr.highlight.HtmlFormatter">
> 
> 
> 
> 
> 
>  
>  class="solr.highlight.HtmlEncoder" />
>  
>  class="solr.highlight.SimpleFragListBuilder"/>
>  
>  class="solr.highlight.SingleFragListBuilder"/>
>  
>  default="true"
> class="solr.highlight.WeightedFragListBuilder"/>
>  
>  default="true"
> class="solr.highlight.ScoreOrderFragmentsBuilder">
>  
> 
>  
>  class="solr.highlight.ScoreOrderFragmentsBuilder">
> 
> 
> 
> 
> 
>  
>  default="true"
> class="solr.highlight.SimpleBoundaryScanner">

Re: deleteById commitWithin question

2012-07-05 Thread Jamie Johnson
Ok, so some more context, hopefully this is useful.

I didn't think this was a SolrCloud issue but it appears to be.  I
have a simple 2 shard set up, I add 1 document which goes to shard 1.
I then issue a delete to shard 2.  The delete gets there and I see the
commit eventually being run in the log, but it doesn't appear to be
being distributed to shard 1.  If I manually call a server.commit()
the item is removed.

On Thu, Jul 5, 2012 at 4:56 PM, Jamie Johnson  wrote:
> Thanks for the fast response.  Yeah, it could be in either place -
> either SolrJ or on the server side.
>
> On Thu, Jul 5, 2012 at 4:47 PM, Yonik Seeley  
> wrote:
>> On Thu, Jul 5, 2012 at 4:29 PM, Jamie Johnson  wrote:
>>> I am running off of a snapshot taken 5/3/2012 of solr 4.0 and am
>>> noticing some issues around deleteById when a commitWithin parameter
>>> is included using SolrJ
>>
>> Oh wait... I just realized you were talking about SolrJ specifically -
>> so the issue may be there.
>>
>> -Yonik
>> http://lucidimagination.com


RE: Update JSON format post 3.1?

2012-07-05 Thread Klostermeyer, Michael
Sorry...I found the answer in the comments of the previously mentioned JIRA 
ticket.  Apparently the proposed solution differed from the final one (the 
"doc" structure key is not needed).

Mike
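For anyone else hitting the OBJECT_START error, a hedged sketch of the post-SOLR-2496 format for multiple documents (field names taken from Mike's example; the second document is invented for illustration):

```json
[
  {"ID": "987654321", "Name": "Steve Smith", "ChildIDs": ["3841"]},
  {"ID": "987654322", "Name": "Jane Smith",  "ChildIDs": ["3842"]}
]
```

posted to /update/json - each document is an object in a top-level array, with no "add" or "doc" wrapper around it.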


-Original Message-
From: Klostermeyer, Michael [mailto:mklosterme...@riskexchange.com] 
Sent: Thursday, July 05, 2012 12:55 PM
To: solr-user@lucene.apache.org
Subject: Update JSON format post 3.1?

Is there any official documentation on the JSON format Solr 3.5 expects when 
adding/updating documents via /update/json?  Most of the official documentation 
is 3.1, and I am of the understanding this changed in v3.2 
(https://issues.apache.org/jira/browse/SOLR-2496).  I believe I have the 
correct format, but I am getting an odd error:

"The request sent by the client was syntactically incorrect (Expected: 
OBJECT_START but got ARRAY_START"

My JSON format is as follows (simplified):
{"add":
{"doc":
[
{"ID":"987654321","Name":"Steve Smith","ChildIDs":["3841"]} ] } }

The idea is that I want to be able to send multiple documents within the same 
request, although in this example I am demonstrating only a single document. 
"ChildIDs" is defined as a multivalued field.

Thanks.

Mike Klostermeyer


Re: deleteById commitWithin question

2012-07-05 Thread Jamie Johnson
Thanks for the fast response.  Yeah, it could be in either place -
either SolrJ or on the server side.

On Thu, Jul 5, 2012 at 4:47 PM, Yonik Seeley  wrote:
> On Thu, Jul 5, 2012 at 4:29 PM, Jamie Johnson  wrote:
>> I am running off of a snapshot taken 5/3/2012 of solr 4.0 and am
>> noticing some issues around deleteById when a commitWithin parameter
>> is included using SolrJ
>
> Oh wait... I just realized you were talking about SolrJ specifically -
> so the issue may be there.
>
> -Yonik
> http://lucidimagination.com


Re: deleteById commitWithin question

2012-07-05 Thread Yonik Seeley
On Thu, Jul 5, 2012 at 4:29 PM, Jamie Johnson  wrote:
> I am running off of a snapshot taken 5/3/2012 of solr 4.0 and am
> noticing some issues around deleteById when a commitWithin parameter
> is included using SolrJ

Oh wait... I just realized you were talking about SolrJ specifically -
so the issue may be there.

-Yonik
http://lucidimagination.com


Re: deleteById commitWithin question

2012-07-05 Thread Yonik Seeley
On Thu, Jul 5, 2012 at 4:29 PM, Jamie Johnson  wrote:
> I am running off of a snapshot taken 5/3/2012 of solr 4.0 and am
> noticing some issues around deleteById when a commitWithin parameter
> is included using SolrJ, specifically commit isn't executed.  If I
> later just call commit on the solr instance I see the item is deleted
> though.  Is anyone aware if this should work in that snapshot?


I thought I remembered something like this... but looking at the
commit log for DUH2, I don't see it.

/opt/code/lusolr4$ svn log
./solr/core/src/java/org/apache/solr/update/DirectUpdateHandler2.java
| less


r1357332 | yonik | 2012-07-04 12:23:09 -0400 (Wed, 04 Jul 2012) | 1 line

log DBQ reordering events

r1356858 | markrmiller | 2012-07-03 14:18:48 -0400 (Tue, 03 Jul 2012) | 1 line

SOLR-3587: After reloading a SolrCore, the original Analyzer is still
used rather than a new one

r1356845 | yonik | 2012-07-03 13:47:56 -0400 (Tue, 03 Jul 2012) | 1 line

SOLR-3559: DBQ reorder support

r1355088 | sarowe | 2012-06-28 13:51:38 -0400 (Thu, 28 Jun 2012) | 1 line

LUCENE-4172: clean up redundant throws clauses (merge from trunk)

r1348984 | hossman | 2012-06-11 15:46:14 -0400 (Mon, 11 Jun 2012) | 1 line

LUCENE-3949: fix license headers to not be javadoc style comments

r1343813 | rmuir | 2012-05-29 12:16:38 -0400 (Tue, 29 May 2012) | 1 line

create stable branch for 4.x releases

r1328890 | yonik | 2012-04-22 11:01:55 -0400 (Sun, 22 Apr 2012) | 1 line

SOLR-3392: fix search leak when openSearcher=false

r1328883 | yonik | 2012-04-22 09:58:00 -0400 (Sun, 22 Apr 2012) | 1 line

SOLR-3391: Make explicit commits cancel pending autocommits.


I'll try out trunk quick and see if it currently works.

-Yonik
http://lucidimagination.com


deleteById commitWithin question

2012-07-05 Thread Jamie Johnson
I am running off of a snapshot taken 5/3/2012 of solr 4.0 and am
noticing some issues around deleteById when a commitWithin parameter
is included using SolrJ, specifically commit isn't executed.  If I
later just call commit on the solr instance I see the item is deleted
though.  Is anyone aware if this should work in that snapshot?


Re: Filtering a query by range returning unexpected results

2012-07-05 Thread Andrew Meredith
Thanks! That worked. I re-built the index with my "prices" field being of
the tfloat type, and I am now able to perform range queries. I appreciate
your help, Erick.

On Tue, Jul 3, 2012 at 9:35 AM, Erick Erickson wrote:

> OK, this appears to be something with the "currency" type. It works
> fine for regular float fields. I can't get the multiValued currency
> types to work with range queries. Don't quite know what I was doing
> when I thought they _did_ work.
>
> One work-around I think, if you are using a single currency USD might be
> to copy your price to a simple float field and do your range queries on
> that.
>
> I'm not at all sure that the currency type was ever intended to support
> multiValued="true". I don't know enough about the internals to know if
> it's even a good idea to try, but the current behavior could be improved
> upon.
>
> But it seems to me that one of two things should happen:
> 1> the startup should barf if a currency type is multiValued (fail early)
> or
> 2> currency should work when multiValued.
>
> Unfortunately, JIRA is down so I can't look to see if this is already a
> known
> issue or enter a JIRA if it isn't. I'll try to look later if it all
> comes back up.
>
> Best
> Erick
>
> On Mon, Jul 2, 2012 at 1:53 PM, Andrew Meredith 
> wrote:
> > Yep, that 15.00.00 was a typo.
> >
> > Here are the relevant portions of my schema.xml:
> >
> > 
> > 
> >  precisionStep="8"
> > defaultCurrency="USD" currencyConfig="currency.xml" />
> > 
> > 
> >
> > 
> > 
> >  > multiValued="true" />
> > 
> > 
> >
> > And here is the output of a sample query with &debugQuery=on appended:
> >
> > 
> > Furtick
> > Furtick
> > 
> > +DisjunctionMaxQuery((subtitle:furtick | frontlist_flapcopy:furtick^0.5 |
> > frontlist_ean:furtick^6.0 | author:furtick^3.0 | series:furtick^1.5 |
> > title:furtick^2.0)) ()
> > 
> > 
> > +(subtitle:furtick | frontlist_flapcopy:furtick^0.5 |
> > frontlist_ean:furtick^6.0 | author:furtick^3.0 | series:furtick^1.5 |
> > title:furtick^2.0) ()
> > 
> > 
> > ExtendedDismaxQParser
> > 
> > 
> > 
> > prices:[5.00 TO 21.00]
> > forsaleinusa:true
> > 
> > 
> > 
> > ConstantScore(frange(currency(prices)):[500 TO 2100])
> > 
> > forsaleinusa:true
> > 
> > 
> > 3.0
> > 
> > 2.0
> > 
> > 2.0
> > 
> > 
> > 0.0
> > 
> > 
> > 0.0
> > 
> > 
> > 0.0
> > 
> > 
> > 0.0
> > 
> > 
> > 0.0
> > 
> > 
> > 0.0
> > 
> > 
> > 
> > 1.0
> > 
> > 1.0
> > 
> > 
> > 0.0
> > 
> > 
> > 0.0
> > 
> > 
> > 0.0
> > 
> > 
> > 0.0
> > 
> > 
> > 0.0
> > 
> > 
> > 0.0
> > 
> > 
> > 
> > 
> >
> >
> > If I run this same query with the filter, prices:[5.00 TO 99.00], then I
> > get a result that includes the following field:
> >
> > 
> > 12.99,USD
> > 14.99,USD
> > 15.00,USD
> > 25.00,USD
> > 
> >
> >
> > I can't figure out why this is not being returned with the first query.
> > I'll try re-building the index with the "prices" field type set to float
> > and see if that changes the behaviour.
> >
> >> On Sat, Jun 30, 2012 at 6:49 PM, Erick Erickson wrote:
> >
> >> This works fine for me with 3.6, float fields and even on a currency
> type.
> >>
> >> I'm assuming a typo for 15.00.00 BTW.
> >>
> >> I admit I'm not all that familiar with the "currency" type, which I
> infer
> >> you're
> >> using given the "USD" bits. But I ran a quick test with currency types
> and
> >> it worked at least the way I ran it... But another quick look shows that
> >> some interesting things are being done with the "currency" type, so who
> >> knows?
> >>
> >> So, let's see your relevant schema bits, and the results of your query
> >> when you attach &debugQuery=on to it.
> >>
> >>
> >> Best
> >> Erick
> >>
> >> On Fri, Jun 29, 2012 at 2:43 PM, Andrew Meredith
> >> <andymered...@gmail.com> wrote:
> >> > First off, I have to say that I am working on my first project that
> >> > has required me to work with Solr, so my question may be very
> >> > elementary - I just could not find an answer elsewhere.
> >> >
> >> > I am trying to add a ranged query filter that returns all items in a
> >> > given "prices" range. In my situation, each item can have multiple
> >> > prices, so it is a multivalued field. When I search a range, say,
> >> > prices:[15.00.00 TO 21.00], I want Solr to return all items that have
> >> > *any* price in that range, rather than returning results where *all*
> >> > prices are in the range. For example, if I have an item with the
> >> > following prices, it will not be returned:
> >> >   
> >> >   19.99,USD
> >> >   22.50,USD
> >> >   
> >> >
> >> > Is there any way to change the behaviour of Solr so that it will match
> >> > documents in which any value of a multivalued field matches a ranged
> >> > query filter?
> >> >
> >> > Thanks!
> >> >
> >> > --
> >> > 
> >> > S.D.G.
> >>
> >
> >
> >
> > --
> > Andrew Meredith
> >
> > Personal Blog: Soli Deo Gloria 
> > Programming Blog 

Re: Synonyms and Regions Taxonomy

2012-07-05 Thread Tri Cao
I don't think there's a ready-made synonym file for this use case. I am not even
sure synonyms are the right way to handle it.

I think the better way to improve recall is to mark up your documents with
a "hidden" field containing the geographic relations. For example, before indexing,
you can add a field to all documents containing "South America", something
like: "South America is a subcontinent consisting of the countries Brazil,
Chile, Argentina, …"

This data can come from various sources, such as wikipedia, wordnet, etc.


On Jul 5, 2012, at 4:12 AM, Stephen Lacy wrote:

> When a user types in South America they want to be able to see documents
> containing Brazil, Chile etc.
> No I have already thrown together a list of countries and continents
> however I'm a little more ambitious,
> I would like to get a lot more regions such as american states as well or
> Former members of the USSR...
> Are there ready made synonym files or taxonomies in a different format.
> Are synonyms the best way of achieving this? Perhaps there is a better way?
> Any pitfalls or advice on this subject from someone who has done this
> before would be appreciated.
> Thanks
> 
> Stephen
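If the synonym route is tried anyway, a hedged sketch of what index-time entries in a synonyms.txt might look like (region and country lists are abbreviated and purely illustrative):

```text
South America => South America, Brazil, Chile, Argentina
Former USSR => Former USSR, Russia, Ukraine, Belarus
```

Multi-word synonyms like these are notoriously tricky at query time, which is one reason the hidden-field approach Tri describes may be the safer bet.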



Re: what is the sequence of execution of solr query , is it right to left?

2012-07-05 Thread Erick Erickson
This question can't be answered correctly as asked. The query parser does not
implement boolean logic directly, so any answer is pretty much wrong. Here's
an excellent writeup of what the query parser actually does:

http://www.lucidimagination.com/blog/2011/12/28/why-not-and-or-and-not/

As pointed out in the blog if you were going to try to get
boolean-like behavior,
you must be careful to parenthesize properly
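As a hedged illustration of that advice (assuming the intent behind Alok's example was "A, and either B or C"):

```text
q=A AND (B OR C)
```

rather than relying on any assumed precedence in the unparenthesized form A AND B OR C, whose parsing does not follow classic boolean-logic rules.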

Best
Erick

On Thu, Jul 5, 2012 at 6:51 AM, Alok Bhandari
 wrote:
> Hello,
>
> if I search solr with the criteria A AND B OR C , then what is the order of
> execution of boolean operators?
>
> I guess it is
>
> 1)Get result of B OR C
> 2) Get Result of A AND (result of step 1)
>
> Is it correct?. I am using solr 3.6.
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/what-is-the-sequence-of-execution-of-solr-query-is-it-right-to-left-tp3993182.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: solr-4.0 and high cpu usage

2012-07-05 Thread Erick Erickson
This is rather surprising. I know some defaults have changed between
1.4 and 4.0,
so the first thing I'd do is try some of your sample queries with
&debugQuery=on and
compare the debug info to see if you're executing more complex queries without
meaning to.
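A hedged example of the comparison Erick suggests (URL, query, and qf fields are illustrative only, not from Anatoli's setup):

```text
http://localhost:8983/solr/select?q=foo&defType=dismax&qf=title+body&debugQuery=on
```

Comparing the parsed-query section of the debug output between the 1.4 and 4.0 instances should show whether the two versions are actually executing the same query.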

Also, dismax is deprecated in favor of edismax, so I'm not quite sure
you're comparing
apples/apples here.

But it's worth pursuing. And I'm assuming you're measuring QTime
rather than just
measuring response time, although you did say jmeter, so I'd guess
you're getting
response times, not QTime.

Best
Erick

On Thu, Jul 5, 2012 at 7:18 AM, Anatoli Matuskova
 wrote:
> Hello,
> I'm testing solr-4.0-alpha compared to 1.4. My index is optimized to one
> segment. I've seen a decrease in memory usage but a very high increase in
> CPU. This high cpu usage ends up giving me slower response times with 4.0
> than 1.4
> The server I'm using: Linux version 2.6.30-2-amd64 16 2.26GHz Intel Pentium
> Xeon Processors, 8GB RAM
> I have a jmeter sending queries using between 1 and 4 threads.
> The queries don't use faceting, nor filters - simple searches over 5 text
> fields using dismax.
> The index has 1M docs and is 1.2GB in size.
> I've checked memory with jvmconsole and it's not the GC's fault.
> The index is built using the TieredMergePolicy for 4.0 and
> LogByteSizeMergePolicy for 1.4 but both were optimized to 1 segment (so in
> that case the merge policy shouldn't make any difference, am I correct?).
> I'm not indexing any docs while doing the tests
> Average response time came from 0.1sec to 0.3 sec
>
> Here is a graph of the cpu increase:
> Any advice or something I should take into account? With the same resources
> solr-4.0 is being 3 times slower than 1.4 and I don't know what I'm doing
> wrong
>
>
>
>
> http://lucene.472066.n3.nabble.com/file/n3993187/ganglia-solr.png
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/solr-4-0-and-high-cpu-usage-tp3993187.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Frequency of Unique Id displayed more than 1

2012-07-05 Thread Erick Erickson
No, it doesn't affect facet counts, grouping, or any of that stuff. After
all, facets and grouping are only calculated for documents that
satisfy a query, and deleted documents are, by definition, excluded
from "satisfying a query".

There are some subtle issues in scoring that can be affected, but you'll
rarely care about that. And the "extra" information is purged over time. When
segment merges happen, the data associated with deleted documents
is removed from the segments being merged (which, in effect, is what you
force with an optimize).

BTW, optimizing is rarely required, usually people only optimize when an index
is pretty static, that is more toward the write-once end of the spectrum.

Best
Erick

On Thu, Jul 5, 2012 at 1:44 PM, Sohail Aboobaker  wrote:
> Thanks Eric,
>
> This is indeed what we are seeing. I hope we can just ignore the
> frequencies. Does it in any way affect facet counts for such records?
>
> Sohail


Update JSON format post 3.1?

2012-07-05 Thread Klostermeyer, Michael
Is there any official documentation on the JSON format Solr 3.5 expects when 
adding/updating documents via /update/json?  Most of the official documentation 
is 3.1, and I am of the understanding this changed in v3.2 
(https://issues.apache.org/jira/browse/SOLR-2496).  I believe I have the 
correct format, but I am getting an odd error:

"The request sent by the client was syntactically incorrect (Expected: 
OBJECT_START but got ARRAY_START"

My JSON format is as follows (simplified):
{"add":
{"doc":
[
{"ID":"987654321","Name":"Steve Smith","ChildIDs":["3841"]}
]
}
}

The idea is that I want to be able to send multiple documents within the same 
request, although in this example I am demonstrating only a single document. 
"ChildIDs" is defined as a multivalued field.
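For what it's worth, here is a sketch of the format I believe Solr 3.2+ expects (per SOLR-2496): the "doc" key must hold a single JSON object, and multiple documents go in a bare top-level array instead. The second document below is invented for illustration.

```python
import json

# The "Expected: OBJECT_START but got ARRAY_START" error happens
# because "doc" must hold a single JSON object, not an array. Since
# SOLR-2496 (3.2), multiple documents are instead sent as a bare
# top-level JSON array POSTed to /update/json.
docs = [
    {"ID": "987654321", "Name": "Steve Smith", "ChildIDs": ["3841"]},
    {"ID": "987654322", "Name": "Jane Doe", "ChildIDs": ["3842", "3843"]},
]
payload = json.dumps(docs)  # request body for a POST to /update/json
```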

Thanks.

Mike Klostermeyer


Re: alphanumeric interval

2012-07-05 Thread Cat Bieber
I did not use facets in my implementation, so I don't have any 
facet-specific code snippet that would be helpful to you. However, if 
your handler extends SearchHandler and calls super.handleRequestBody() 
it should be running the facet component code. You have access to the 
SolrQueryResponse built by it, and may be able to get the data out of 
that object. You'll need to look at the javadoc for NamedList, and I 
found it helpful to dump the list in debug statements so I could examine 
its structure and contents. I suspect you need something like 
rsp.getValues().get("facet_counts") to get the facet data, but haven't 
tested it.
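As a rough illustration of the shape of that data, here is how the same structure looks in a JSON response; the sample response below is made up, but the nesting mirrors what rsp.getValues().get("facet_counts") should hold, with the default flat term/count layout.

```python
# Made-up Solr JSON response fragment; the nesting mirrors the
# facet_counts NamedList on the Java side.
response = {
    "facet_counts": {
        "facet_fields": {
            # default flat layout: term, count, term, count, ...
            "person": ["smith", 12, "jones", 7],
        }
    }
}
flat = response["facet_counts"]["facet_fields"]["person"]
# pair up alternating terms and counts
pairs = dict(zip(flat[0::2], flat[1::2]))
print(pairs)  # {'smith': 12, 'jones': 7}
```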

-Cat Bieber

On 07/05/2012 04:32 AM, AlexR wrote:

Hi,

thanks a lot for your answer, and sorry for my late response.

It's my first time writing a Solr plugin. I already have a plugin with an
empty handleRequestBody() method and I'm able to call it.

I need the list of the faceted field person (facet.field=person) in my method,
but I don't know how to get it.

Do you have a code snippet of your implementation?

thx
Alex


--
View this message in context: 
http://lucene.472066.n3.nabble.com/alphanumeric-interval-tp3990965p3993148.html
Sent from the Solr - User mailing list archive at Nabble.com.
   


Re: Frequency of Unique Id displayed more than 1

2012-07-05 Thread Sohail Aboobaker
Thanks Eric,

This is indeed what we are seeing. I hope we can just ignore the
frequencies. Does it in any way effect facet counts for such records?

Sohail


Re: Problems with elevation component configuration

2012-07-05 Thread Chris Warner
Hi, Igor,

I also set forceElevation to true for my elevate results.

 Cheers,
Chris



- Original Message -
From: Igor Salma 
To: solr-user@lucene.apache.org
Cc: 
Sent: Thursday, July 5, 2012 4:47 AM
Subject: Problems with elevation component configuration

Hi to all,

we are using solr in combination with nutch and there are multiple cores
defined under solr. For some reason we can't configure the elevation request
handler. We followed the instructions on
http://wiki.apache.org/solr/QueryElevationComponent,
only changing the queryFieldType node to work with "text" type fields:

<searchComponent name="elevator" class="solr.QueryElevationComponent">
  <str name="queryFieldType">text</str>
  <str name="config-file">elevate.xml</str>
</searchComponent>

<requestHandler name="/elevate" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
  </lst>
  <arr name="last-components">
    <str>elevator</str>
  </arr>
</requestHandler>

and added the node in elevate.xml :


<doc id="http://10.237.119.179:28080/ncal/mdo/presentation/conditions/conditionpage.jsp?condition=Condition_Epilepsy.xml" />


(the "id" field is of type string and, as you can see, we are storing urls
as identifiers)

And when we restart Solr and try
http://localhost:8080/solr/elevate?q=indexingabstract:brain&debugQuery=true&enableElevation=true
nothing changes - the mentioned document is not at the top of the result
list.

Can someone please help?

Thanks in advance,
Igor



Re: solr-4.0 and high cpu usage [SOLVED]

2012-07-05 Thread Anatoli Matuskova
Found why!
On Solr 1.4 the dismax param mm defaults to 1 if not specified, which is
equivalent to AND. On Solr 4.0, if mm is not specified, the default operator
is used, which defaults to OR. That made each query I was running return many
more results, increasing the response time and the CPU usage.
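A minimal sketch of pinning mm explicitly so 4.0 behaves the way 1.4 did; the query and qf values here are placeholders, not from the thread.

```python
from urllib.parse import urlencode

# Sketch: setting mm explicitly so dismax on 4.0 requires every
# term, like 1.4 effectively did when mm was left unspecified.
params = {
    "q": "some search terms",   # placeholder query
    "defType": "dismax",
    "qf": "title body",         # placeholder field list
    "mm": "100%",               # require all optional clauses, i.e. AND-like
}
query_string = urlencode(params)
```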


--
View this message in context: 
http://lucene.472066.n3.nabble.com/solr-4-0-and-high-cpu-usage-tp3993187p3993275.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Frequency of Unique Id displayed more than 1

2012-07-05 Thread Erick Erickson
solr updates are really a delete followed by a re-index. The old terms
are left in the index, but the associated document is marked as deleted.
Schema browser, for instance, will happily report frequencies > 1 for
unique keys when a document has been updated.

You can ignore this if you query on the schemaId in question and get
only back one document, it's expected behavior. Also, if this is indeed
what you're seeing, doing  an optimize (forceMerge in the new parlance)
should take all the frequencies back to 1.
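A hedged sketch of triggering that merge by hand: an optimize is just another update message. The host, port, and core path below are assumptions, and the request is constructed but not sent.

```python
import urllib.request

# Posting <optimize/> to the update handler rewrites the segments,
# dropping the data kept for deleted documents, so unique-key
# frequencies fall back to 1. Adjust the URL for your setup.
req = urllib.request.Request(
    "http://localhost:8983/solr/update",
    data=b"<optimize/>",
    headers={"Content-Type": "text/xml; charset=utf-8"},
)
# urllib.request.urlopen(req)  # uncomment to send it to a live server
```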

Best
Erick

On Thu, Jul 5, 2012 at 9:09 AM, Savvas Andreas Moysidis
 wrote:
> hmm, based on the schema I don't see how you would be able to commit
> the same "schemaid" twice?
> maybe you want to investigate how you post a document to solr (do you
> do a commit after the post etc) or the merge strategy that is being
> applied.
>
> Just to exclude any possibilities, is it possible that at some point
> the "schemaid" field wasn't defined as "string" and then changed to
> "string" without re-indexing?
>
> On 5 July 2012 12:09, Sohail Aboobaker  wrote:
>> 
>>  > required="true"/>
>>  > required="true"/>
>>  > required="true"/>
>>  > required="true"/>
>> > required="true"/>
>> > required="true"/>
>> > required="true"/>
>> > required="true"/>
>> > required="true" multiValued="true"/>> type="string" indexed="true" stored="true" required="false"
>> multiValued="true"/>> indexed="true" stored="true" required="false" multiValued="true"/>> name="level4Categories" type="string" indexed="true" stored="true"
>> required="false" multiValued="true"/>
>> 
>> schemaid
>>
>> Above is the main schema. Let me know if you need more information.


Re: data Model/UML diagrams/Class diagrams.

2012-07-05 Thread Anuj Kumar
Hi Prasad,

If you have the source code, why not reverse engineer the UML diagrams-
http://stackoverflow.com/questions/51786/recommended-eclipse-plugins-to-generate-uml-from-java-code

Regards,
Anuj

On Thu, Jul 5, 2012 at 6:42 PM, prasad deshpande <
prasad.deshpand...@gmail.com> wrote:

> Is there anyone who can help me?
>
> On Thu, Jul 5, 2012 at 10:10 AM, prasad deshpande <
> prasad.deshpand...@gmail.com> wrote:
>
> > I would like to understand how the solr/lucene/Tika frameworks are designed.
> Where
> > would I get the class diagram/UML diagrams to better understand the
> > solr/lucene/Tika.
> > I have the source code for all of them.
> >
> >
> > Thanks,
> > Prasad
> >
>


RE: Any ideas on Solr 4.0 Release.

2012-07-05 Thread Steven A Rowe
Hi Sohail,

Some of your questions are answered here: 
. 

See Chris Hostetter's blog post for more info, particularly on questions around 
stability: 
.

Steve 

-Original Message-
From: Sohail Aboobaker [mailto:sabooba...@gmail.com] 
Sent: Thursday, July 05, 2012 5:22 AM
To: solr-user@lucene.apache.org
Subject: Any ideas on Solr 4.0 Release.

Hi,

Congratulations on the Alpha release. I am wondering: is there a ballpark on the final
release for 4.0? Is it expected in the August or September time frame, or is it further
away? We badly need some features included in this release. These are around 
grouped facet counts. We have limited use for Solr in our current release. In 
next release, we will add more features (full text searching, location based 
searches etc.). I am wondering if the facet and group counts side of things is 
stable in Alpha or not? I have tested with the nightly builds before and it 
works fine for our scenarios.

Thanks.

Regards,
Sohail


Re: Solr facet multiple constraint

2012-07-05 Thread Erick Erickson
Well, to start with this query is totally messed up.

facet=true,sort=publishingdate desc,facet.mincount=1,q=service:1 AND
publicationstatus:LIVE,facet.field={!ex=dt}user,wt=javabin,fq={!tag=dt}user:10,version=2

You've put in commas where you should have ampersands to separate
parameter=value pairs.

e.g.
facet=true&sort=publishingdate desc&facet.mincount=1&q=service:1 AND
publicationstatus:LIVE&facet.field={!ex=dt}user&wt=javabin&fq={!tag=dt}user:10&version=2
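A sketch of building that parameter string programmatically, which sidesteps the separator problem entirely; the values are taken from the thread.

```python
from urllib.parse import urlencode

# Multi-select faceting request with proper '&' separators:
# {!tag=dt} labels the filter, {!ex=dt} excludes it when counting
# the 'user' facet. urlencode handles the percent-escaping.
params = [
    ("q", "service:1 AND publicationstatus:LIVE"),
    ("sort", "publishingdate desc"),
    ("fq", "{!tag=dt}user:10"),
    ("facet", "true"),
    ("facet.field", "{!ex=dt}user"),
    ("facet.mincount", "1"),
]
qs = urlencode(params)  # pairs joined with '&', not ','
```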


Best
Erick

On Thu, Jul 5, 2012 at 1:17 AM, davidbougearel
 wrote:
> Please someone can help me,
>
> we are a team waiting for a fix.
> We try several ways to implement it without success.
>
> Thanks for reading anyway, David.
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Solr-facet-multiple-constraint-tp3992974p3993119.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Boosting the whole documents

2012-07-05 Thread Erick Erickson
Can we see the query and the output with &debugQuery=on? Look
especially for ConstantScoreQuery in the debug output; for certain
searches, e.g. *:*, all the boosting stuff is ignored.

Best
Erick

On Wed, Jul 4, 2012 at 9:39 AM, Danilak Michal  wrote:
> Hi,
>
> I have the following problem.
> I would like to give a boost to the whole documents as I index them. I am
> sending to solr xml in the form:
>
> 
>
> But it doesn't seem to alter the search scores in any way. I would expect
> that to multiply the final search score by two, am I correct?
> Probably I would need to alter schema.xml, but I found only information on
> how to do that for specific fields (just put omitNorms=false into the field
> tag). But what should I do, if I want to boost the whole document?
>
> Note: by boosting a whole document I mean, that if document A has search
> score 10.0 and document B has search score 15.0 and I give document A the
> boost 2.0, when I index it, I would expect its search score to be 20.0.
>
> Thanks in advance!
>
> Michal Danilak


Re: How negative queries are handled with edismax parser?

2012-07-05 Thread Erick Erickson
Have you looked at the results with &debugQuery=on? I admit it takes a while
to get the hang of reading the debug output, but it will probably give you a
good idea of what's going on.

Best
Erick

On Tue, Jul 3, 2012 at 7:46 AM, Alok Bhandari
 wrote:
> Hello,
>
> I am using and edismax parser, when I query for
>
> *- *  it returns 0 results but if I use lucene parser then I get all docs
> from the index. So want to know how exactly edismax handles this query and
> is it related to negative query? Please let me know.I am using solr 3.6
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/How-negative-queries-are-handled-with-edismax-parser-tp3992724.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: data Model/UML diagrams/Class diagrams.

2012-07-05 Thread prasad deshpande
Is there anyone who can help me?

On Thu, Jul 5, 2012 at 10:10 AM, prasad deshpande <
prasad.deshpand...@gmail.com> wrote:

> I would like to understand how the solr/lucene/Tika frameworks are designed. Where
> would I get the class diagram/UML diagrams to better understand the
> solr/lucene/Tika.
> I have the source code for all of them.
>
>
> Thanks,
> Prasad
>


Problems with elevation component configuration

2012-07-05 Thread Igor Salma
Hi to all,

we are using solr in combination with nutch and there are multiple cores
defined under solr. For some reason we can't configure the elevation request
handler. We followed the instructions on
http://wiki.apache.org/solr/QueryElevationComponent,
only changing the queryFieldType node to work with "text" type fields:

<searchComponent name="elevator" class="solr.QueryElevationComponent">
  <str name="queryFieldType">text</str>
  <str name="config-file">elevate.xml</str>
</searchComponent>

<requestHandler name="/elevate" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
  </lst>
  <arr name="last-components">
    <str>elevator</str>
  </arr>
</requestHandler>

and added the node in elevate.xml :


<doc id="http://10.237.119.179:28080/ncal/mdo/presentation/conditions/conditionpage.jsp?condition=Condition_Epilepsy.xml" />
 

(the "id" field is of type string and, as you can see, we are storing urls
as identifiers)

And when we restart Solr and try
http://localhost:8080/solr/elevate?q=indexingabstract:brain&debugQuery=true&enableElevation=true
nothing changes - the mentioned document is not at the top of the result
list.

Can someone please help?

Thanks in advance,
Igor


solr-4.0 and high cpu usage

2012-07-05 Thread Anatoli Matuskova
Hello,
I'm testing solr-4.0-alpha compared to 1.4. My index is optimized to one
segment. I've seen a decrease in memory usage but a very high increase in
CPU. This high cpu usage ends up giving me slower response times with 4.0
than 1.4
The server I'm using: Linux version 2.6.30-2-amd64 16 2.26GHz Intel Pentium
Xeon Processors, 8GB RAM
I have a jmeter sending queries using between 1 and 4 threads.
The queries don't use faceting, nor filters. They are simple searches across
5 text fields using dismax.
The index has 1M docs and is 1.2Gb size
I've checked memory with jvmconsole and it's not GC fault.
The index is built using the TieredMergePolicy for 4.0 and
LogByteSizeMergePolicy for 1.4 but both were optimized to 1 segment (so in
that case the merge policy shouldn't make any difference, am I correct?).
I'm not indexing any docs while doing the tests
Average response time went from 0.1 sec to 0.3 sec

Here is a graph of the cpu increase:
Any advice or something I should take into account? With the same resources
solr-4.0 is being 3 times slower than 1.4 and I don't know what I'm doing
wrong




http://lucene.472066.n3.nabble.com/file/n3993187/ganglia-solr.png 

--
View this message in context: 
http://lucene.472066.n3.nabble.com/solr-4-0-and-high-cpu-usage-tp3993187.html
Sent from the Solr - User mailing list archive at Nabble.com.


Synonyms and Regions Taxonomy

2012-07-05 Thread Stephen Lacy
When a user types in South America they want to be able to see documents
containing Brazil, Chile etc.
Now I have already thrown together a list of countries and continents;
however, I'm a little more ambitious. I would like to cover a lot more
regions, such as American states or former members of the USSR.
Are there ready-made synonym files or taxonomies in a different format?
Are synonyms the best way of achieving this? Perhaps there is a better way?
Any pitfalls or advice on this subject from someone who has done this
before would be appreciated.
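In case it helps, a hedged sketch of what such one-way mappings could look like in Solr's synonyms.txt format; the region memberships below are illustrative, not a complete taxonomy. Repeating the left-hand term on the right keeps matches on the literal region name too.

```
# one-way expansions: a query for the region also matches its members
south america => south america, brazil, chile, argentina, peru
american states => american states, alabama, alaska, arizona
former ussr => former ussr, russia, ukraine, belarus, kazakhstan
```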
Thanks

Stephen


Re: Use of Solr as primary store for search engine

2012-07-05 Thread Sohail Aboobaker
In many e-commerce sites, most of the data that we display (except images),
especially in grids and lists, is minimal. We were inclined to use Solr as a
data store for only displaying the information in grids. We stopped only
due to the non-availability of joins in Solr 3.5. Since our data (like any
other relational store) is split across multiple tables, we would need to
de-normalize it to use Solr as a store. We decided against that because it
would mean potentially heavy index updates whenever related data is
updated. With Solr 4.0, we might have decided differently and implemented
the grids using joins within Solr.

We are too new to Solr to have any insights into it.

Regards,
Sohail


Re: Frequency of Unique Id displayed more than 1

2012-07-05 Thread Savvas Andreas Moysidis
can you post the schema you are applying pls?

On 5 July 2012 11:28, Sohail Aboobaker  wrote:
> Another observation is that when we query an individual schemaid, it
> returns only one row using the search interface. Why would frequency be
> more than 1?


Re: Frequency of Unique Id displayed more than 1

2012-07-05 Thread Sohail Aboobaker
Another observation is that when we query an individual schemaid, it
returns only one row using the search interface. Why would frequency be
more than 1?


Re: Frequency of Unique Id displayed more than 1

2012-07-05 Thread Sohail Aboobaker
We have defined the schemaid as String. It has concatenated value of the
product id and language. It takes the form of ID-EN. For example:
'123012-EN', '124020-EN', '12392-FR'.

Sohail


Re: Frequency of Unique Id displayed more than 1

2012-07-05 Thread Savvas Andreas Moysidis
Hello,

Make sure your unique id has a type which always yields one token
after tokenisation is applied (e.g. either "string" or a type which
only defines the KeywordTokenizer in its chain)
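For illustration, a sketch of the two schema.xml options mentioned above; the type names and extra attributes are assumptions, only the classes are the point.

```
<!-- option 1: a plain string type, never tokenised -->
<fieldType name="string" class="solr.StrField" sortMissingLast="true"/>

<!-- option 2: a text type whose chain only has the KeywordTokenizer,
     so the whole value stays a single token -->
<fieldType name="keyword" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>
```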

Regards,
Savvas

On 5 July 2012 11:02, Sohail Aboobaker  wrote:
> Hi,
>
> We have defined a unique key as schemaid. We add documents using
> server.addBean(obj) method. We are using the same method for updates as
> well. When browsing the schema, we see that some of the schemaid values
> have a frequency of more than 1. Since the schemaid column is defined as the
> unique key, we are expecting that addBean will automatically "replace" the
> existing entry in the index.
>
> Are we supposed to use a different method for update as opposed to add?
>
> Regards,
> Sohail


Frequency of Unique Id displayed more than 1

2012-07-05 Thread Sohail Aboobaker
Hi,

We have defined a unique key as schemaid. We add documents using
server.addBean(obj) method. We are using the same method for updates as
well. When browsing the schema, we see that some of the schemaid values
have a frequency of more than 1. Since the schemaid column is defined as the
unique key, we are expecting that addBean will automatically "replace" the
existing entry in the index.

Are we supposed to use a different method for update as opposed to add?

Regards,
Sohail


Any ideas on Solr 4.0 Release.

2012-07-05 Thread Sohail Aboobaker
Hi,

Congratulations on the Alpha release. I am wondering: is there a ballpark on the
final release for 4.0? Is it expected in the August or September time frame, or is it
further away? We badly need some features included in this release. These
are around grouped facet counts. We have limited use for Solr in our
current release. In next release, we will add more features (full text
searching, location based searches etc.). I am wondering if the facet and
group counts side of things is stable in Alpha or not? I have tested with
the nightly builds before and it works fine for our scenarios.

Thanks.

Regards,
Sohail


Re: Trunk error in Tomcat

2012-07-05 Thread Stefan Matheis
Great, thanks Vadim



On Thursday, July 5, 2012 at 9:34 AM, Vadim Kisselmann wrote:

> Hi Stefan,
> OK, I will test the latest version from trunk with Tomcat in the next
> days and open a new issue :)
> regards
> Vadim
> 
> 
> 2012/7/3 Stefan Matheis  (mailto:matheis.ste...@googlemail.com)>:
> > On Tuesday, July 3, 2012 at 8:10 PM, Vadim Kisselmann wrote:
> > > sorry, i overlooked your latest comment with the new issue in SOLR-3238 ;)
> > > Should i open an new issue?
> > 
> > 
> > 
> > 
> > NP Vadim, yes a new Issue would help .. all available Information too :) 




Re: alphanumeric interval

2012-07-05 Thread AlexR
Hi,

thanks a lot for your answer, and sorry for my late response.

It's my first time writing a Solr plugin. I already have a plugin with an
empty handleRequestBody() method and I'm able to call it.

I need the list of the faceted field person (facet.field=person) in my method,
but I don't know how to get it.

Do you have a code snippet of your implementation?

thx
Alex
 

--
View this message in context: 
http://lucene.472066.n3.nabble.com/alphanumeric-interval-tp3990965p3993148.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Search for abc AND *foo* return all docs for abc which do not have foo why?

2012-07-05 Thread Alok Bhandari
It was my mistake: the field I was referring to did not exist, so this
effect appeared. Sorry for the stupid question :-)

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Search-for-abc-AND-foo-return-all-docs-for-abc-which-do-not-have-foo-why-tp3993138p3993147.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Get all matching terms of an OR query

2012-07-05 Thread Michael Jakl
Thank you!

On 4 July 2012 17:37, Jack Krupansky  wrote:
> What exactly is it that is too slow?

I was comparing Queries with "debugQuery" enabled and disabled. The
difference was 60 seconds to 30 seconds for some (unusual) large
Queries (many Terms over a large set of documents chosen by filter
queries). After the caches are warm, the performance is of course far
better.

> It would be nice to have an optional search component or query parser option
> that returned the analyzed term for each query term.

Yes, I was thinking of reusing the analysis.jsp for that task, but
couldn't see an easy way to handle phrase queries and wasn't sure if
it performs better than the debugQuery approach.

> But as things stand, I would suggest that you do your own "fuzzy match"
> between the debugQuery terms and your source terms. That may not be 100%
> accurate, but probably would cover most/many cases.

Thanks, that's reassuring :)

Cheers,
Michael


Re: Trunk error in Tomcat

2012-07-05 Thread Vadim Kisselmann
Hi Stefan,
OK, I will test the latest version from trunk with Tomcat in the next
days and open a new issue :)
regards
Vadim


2012/7/3 Stefan Matheis :
> On Tuesday, July 3, 2012 at 8:10 PM, Vadim Kisselmann wrote:
>> sorry, i overlooked your latest comment with the new issue in SOLR-3238 ;)
>> Should i open an new issue?
>
>
> NP Vadim, yes a new Issue would help .. all available Information too :)


Re: How to change tmp directory

2012-07-05 Thread Erik Fäßler
Ah, all right, that's it! Thank you!

Erik

Am 04.07.2012 um 17:59 schrieb Jack Krupansky:

> Solr is probably simply using Java's temp directory, which you can redefine 
> by setting the java.io.tmpdir system property on the java command line or 
> using a system-specific environment variable.
> 
> -- Jack Krupansky
> 
> -Original Message- From: Erik Fäßler
> Sent: Wednesday, July 04, 2012 3:56 AM
> To: solr-user@lucene.apache.org
> Subject: How to change tmp directory
> 
> Hello all,
> 
> I came across an odd issue today when I wanted to add ca. 7M documents to my
> Solr index: I got a SolrServerException telling me "No space left on device". 
> I had a look at the directory Solr (and its index) is installed in and there 
> is plenty space (~300GB).
> I then noticed a file named "upload_457ee97b_1385125274b__8000_0005.tmp" 
> had taken up all space of the machine's /tmp directory. The partition holding 
> the /tmp directory only has around 1GB of space and this file already took 
> nearly 800MB. I had a look at it and I realized that the file contained the 
> data I was adding to Solr in an XML format.
> 
> Is there a possibility to change the temporary directory for this action?
> 
> I use an Iterator<SolrInputDocument> with the HttpSolrServer's add(Iterator<SolrInputDocument>)
> method for performance. So I can't just do commits from time to time.
> 
> Best regards,
> 
> Erik