Re: Finding distinct unique IDs in documents returned by fq -- Urgent Help Req

2010-07-19 Thread Ninad Raut
Hi,

I would like to get the total count of the facet.field response values.

For example, if my response is:

  <lst name="name">
    <int name="Canon USA">17</int>
    <int name="Olympus">12</int>
    <int name="Sony">12</int>
    <int name="Panasonic">9</int>
    <int name="Nikon">4</int>
  </lst>

I would like the count of unique names found, i.e. 5 ("Canon
USA" + "Olympus" + "Sony" + "Panasonic" + "Nikon")


On Fri, Jul 16, 2010 at 7:28 PM, kenf_nc  wrote:

>
> It may just be a mis-wording, but if you do distinct on 'unique' IDs, the
> count should be the same as response.numFound. But if you didn't mean
> 'unique', just count of some field in the results, Rebecca is correct,
> facets should do the job. Something like:
>
> ?q=content:query+text&facet=on&facet.field=rootId
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Finding-distinct-unique-IDs-in-documents-returned-by-fq-Urgent-Help-Req-tp971883p972601.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


why spellcheck and elevate search components can't work together?

2010-07-19 Thread Chamnap Chhorn
In my solrconfig.xml I set it up this way, but it doesn't work at all. Can
anyone help? Each one works without the other.

  <searchComponent name="elevateListings"
                   class="org.apache.solr.handler.component.QueryElevationComponent">
    <str name="queryFieldType">string_ci</str>
    <str name="config-file">elevateListings.xml</str>
    <str name="forceElevation">false</str>
  </searchComponent>

  <requestHandler name="mb_listings" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="echoParams">explicit</str>
      <int name="rows">20</int>
      <str name="defType">dismax</str>
      <str name="qf">name^2 full_text^1</str>
      <str name="fl">uuid</str>
      <str name="version">2.2</str>
      <str name="hl">on</str>
      <float name="tie">0.1</float>
    </lst>
    <lst name="appends">
      <str name="fq">type:Listing</str>
    </lst>
    <lst name="invariants">
      <str name="facet">false</str>
    </lst>
    <arr name="last-components">
      <str>spellcheck</str>
    </arr>
    <arr name="last-components">
      <str>elevateListings</str>
    </arr>
  </requestHandler>

If I remove the spellcheck component, the elevate component works (the result
also loads from elevateListings.xml).
If I remove the elevate component,
http://localhost:8081/solr/select/?q=redd&qt=mb_listings&spellcheck=true&spellcheck.collate=true
does work.

Any ideas?

Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


How to patch SOLR-1966?

2010-07-19 Thread Chamnap Chhorn
Hi,

I'm a Ruby developer with no background in Java at all. I need
*exclusive=true* to work in the elevation search component. However, it
needs a patch:
https://issues.apache.org/jira/browse/SOLR-1966. Could anyone walk me
through applying it step by step?


-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Spatial filtering

2010-07-19 Thread Olivier Ricordeau

Hi folks,

I can't manage to get the new spatial filtering feature (added in 
r962727 by Grant Ingersoll, see 
https://issues.apache.org/jira/browse/SOLR-1568) working. I'm trying to 
get all the documents located within a circle defined by its center and 
radius.
I've modified my query URL as specified in 
http://wiki.apache.org/solr/SpatialSearch#Spatial_Filter_QParser to add 
the "pt", "d" and "meas" parameters. Here is what my query parameters 
look like (from Solr's response with debug mode activated):


[params] => Array
(
[explainOther] => true
[mm] => 2<-75%
[d] => 50
[sort] => date asc
[qf] =>
[wt] => php
[rows] => 5000
[version] => 2.2
[fl] => object_type object_id score
[debugQuery] => true
[start] => 0
[q] => *:*
[meas] => hsin
[pt] => 48.85341,2.3488
[bf] =>
[qt] => standard
[fq] => +object_type:Concert 
+date:[2010-07-19T00:00:00Z TO 2011-07-19T23:59:59Z]

)



With this query I get 3859 results, and some (lots) of the found 
documents are not located within the circle! :(
If I run the same query without spatial filtering (i.e. if I remove the 
"pt", "d" and "meas" parameters from the URL), I also get 3859 results. So it 
looks like my spatial filtering constraint is not taken into account in 
the first query (the one where "pt", "d" and "meas" are set). Is 
the wiki's doc up to date?


In the comments of SOLR-1568, I've seen someone talking about adding 
"{!sfilt fl=latlon_field_name}". So I tried the following request:


[params] => Array
(
[explainOther] => true
[mm] => 2<-75%
[d] => 50
[sort] => date asc
[qf] =>
[wt] => php
[rows] => 5000
[version] => 2.2
[fl] => object_type object_id score
[debugQuery] => true
[start] => 0
[q] => *:*
[meas] => hsin
[pt] => 48.85341,2.3488
[bf] =>
[qt] => standard
[fq] => +object_type:Concert 
+date:[2010-07-19T00:00:00Z TO 2011-07-19T23:59:59Z] +{!sfilt 
fl=coords_lat_lon,units=km,meas=hsin}

)

This leads to 2713 results (which is smaller than 3859, good). But some 
(lots) of the results are once more out of the circle :(
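Until the filter behaves, one way to verify the suspicion is to recompute the great-circle (haversine) distance client-side and count how many returned documents really fall inside the circle. A sketch, assuming each document exposes its latitude/longitude (the field names and document values below are made up):

```python
from math import asin, cos, radians, sin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 \
        + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

center = (48.85341, 2.3488)   # the "pt" parameter (Paris)
radius_km = 50.0              # the "d" parameter

# docs would come from the Solr response; these coordinates are illustrative
docs = [{"id": 1, "lat": 48.8566, "lon": 2.3522},   # central Paris -> inside
        {"id": 2, "lat": 45.7640, "lon": 4.8357}]   # Lyon -> well outside

inside = [d for d in docs
          if haversine_km(center[0], center[1], d["lat"], d["lon"]) <= radius_km]
print([d["id"] for d in inside])  # -> [1]
```

Comparing that count with numFound makes it obvious whether the fq was applied at all.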


Can someone help me get spatial filtering working? I really don't 
understand the search results I'm getting.


Cheers,
Olivier

--
- *Olivier RICORDEAU* -
 oliv...@ricordeau.org
http://olivier.ricordeau.org



Re: why spellcheck and elevate search components can't work together?

2010-07-19 Thread Koji Sekiguchi

(10/07/19 19:14), Chamnap Chhorn wrote:

> In my solrconfig.xml I set it up this way, but it doesn't work at all. Can
> anyone help? Each one works without the other.
>
> [solrconfig.xml snippet snipped]
>
> If I remove the spellcheck component, the elevate component works (the
> result also loads from elevateListings.xml).
> If I remove the elevate component,
> http://localhost:8081/solr/select/?q=redd&qt=mb_listings&spellcheck=true&spellcheck.collate=true
> does work.
>
> Any ideas?
>
> Chhorn Chamnap
> http://chamnapchhorn.blogspot.com/

Chhorn,

Try changing the two "last-components" sections to a single one:

  <arr name="last-components">
    <str>spellcheck</str>
    <str>elevateListings</str>
  </arr>

Koji

--
http://www.rondhuit.com/en/



Re: why spellcheck and elevate search components can't work together?

2010-07-19 Thread dan sutton
It needs to be:

   <arr name="last-components">
     <str>spellcheck</str>
     <str>elevateListings</str>
   </arr>

or

   <arr name="last-components">
     <str>elevateListings</str>
     <str>spellcheck</str>
   </arr>

Dan


On Mon, Jul 19, 2010 at 11:14 AM, Chamnap Chhorn wrote:

> In my solrconfig.xml I set it up this way, but it doesn't work at all. Can
> anyone help? Each one works without the other.
>
> [solrconfig.xml snippet snipped]
>
> If I remove the spellcheck component, the elevate component works (the
> result also loads from elevateListings.xml).
> If I remove the elevate component,
> http://localhost:8081/solr/select/?q=redd&qt=mb_listings&spellcheck=true&spellcheck.collate=true
> does work.
>
> Any ideas?
>
> Chhorn Chamnap
> http://chamnapchhorn.blogspot.com/
>


Problem with Solr-Mailinglist

2010-07-19 Thread MitchK

Hello,

I have tried to post
http://lucene.472066.n3.nabble.com/Solr-in-an-extra-project-what-about-replication-scaling-etc-td977961.html#a977961
this message to the Solr mailing list four times now, and every time I
get the following response from the mailing list's server:



>   solr-user@lucene.apache.org
> SMTP error from remote mail server after end of data:
> host mx1.eu.apache.org [192.87.106.230]: 552 spam score (7.8) exceeded
> threshold
> 

Why is my posting flagged as spam?! Has anyone else had such problems?

Thank you!
- Mitch
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Problem-with-Solr-Mailinglist-tp978247p978247.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: jetty logging

2010-07-19 Thread Lukas Kahwe Smith

On 17.07.2010, at 15:39, Lukas Kahwe Smith wrote:

> Hi,
> 
> I am following:
> http://wiki.apache.org/solr/LoggingInDefaultJettySetup
> 
> All works fine except defining the logging properties files from jetty.xml
> Does this approach work for anyone else?


Problem solved:
I had to remove "-Dcom.sun.management.jmxremote" when starting Solr for it to 
read the settings from jetty.xml.

regards,
Lukas Kahwe Smith
m...@pooteeweet.org





Re: filter query on timestamp slowing query???

2010-07-19 Thread Ahmet Arslan
> my goal is to run a query and limit it with the timestamp
> of the last document i found.

I didn't understand this part.

> will TrieDateField give me this precision? 

You should use tdate instead of pdate for faster date range queries and date 
faceting. Please see the comments in the schema.xml file.

> i also see slow queries when using a filter on a field that
> is a simple  string(StrField), that has only 3 types of values, don't
> understand what  might cause it

Is your index optimized?

Is this query also a range query? Or is it something like myStrField:foo?



  


RE: Finding distinct unique IDs in documents returned by fq -- Urgent Help Req

2010-07-19 Thread Jonathan Rochkind
> I would like get the total count of the facet.field response values

I'm pretty sure there's no way to get Solr to do that -- other than not setting 
a facet.limit, getting every value back in the response, and counting them 
yourself (not feasible for very large counts).   I've looked at trying to patch 
Solr to do it, because I could really use it too; it's definitely possible, but 
made trickier because there are now several different methods Solr can use 
to do faceting, with separate code paths.  It seems like an odd omission to me 
too. 

Jonathan


Re: Finding distinct unique IDs in documents returned by fq -- Urgent Help Req

2010-07-19 Thread kenf_nc

Oh, okay. Got it now. Unfortunately I don't believe Solr supplies a total
count of matching facet values. One way to do this, although performance may
suffer, is to set your limit to -1 and just get back everything, that will
give you the count. You may want to set mincount to 1 so you aren't counting
facet values that aren't in your query, but that really depends on your
need.

...&facet.limit=-1&facet.mincount=1

adding that to any facet query will return all matching facet values.
Depending on how many unique values you have, this could be a lot. But it
will give you what you are looking for. Unless your data changes frequently,
maybe you can call it once and cache the results for some period of time.
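Once facet.limit=-1 returns everything, the count itself is a one-liner client-side. A sketch assuming wt=json, where Solr returns each facet field as a flat [value, count, value, count, ...] list (the sample values below are made up):

```python
# facet_counts.facet_fields.<field> comes back as a flat list:
# [value1, count1, value2, count2, ...]
facet_list = ["Canon USA", 17, "Olympus", 12, "Sony", 12, "Panasonic", 9, "Nikon", 4]

# pair up (value, count) and keep only values that actually occurred
pairs = list(zip(facet_list[0::2], facet_list[1::2]))
distinct = sum(1 for value, count in pairs if count > 0)
print(distinct)  # -> 5
```

The count > 0 guard plays the same role as facet.mincount=1 above.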
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Finding-distinct-unique-IDs-in-documents-returned-by-fq-Urgent-Help-Req-tp971883p978548.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: filter query on timestamp slowing query???

2010-07-19 Thread oferiko

1. I query my index once every 30 minutes. I save the timestamp of the newest
returned document. The next time, I query for documents with a timestamp
between the one I saved from the previous query and NOW.

2. Sad to say, it is not optimized; I'm at 60% of the disk space and am waiting
for another disk to be added before I can optimize.

3. It is a simple myStrField:foo

Thanks for helping
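The delta pattern in point 1 boils down to building a date-range filter query from the saved timestamp. A small sketch, assuming the field is simply named "timestamp" (with a tdate field, as suggested earlier in the thread, this range should be cheap):

```python
from datetime import datetime

def window_fq(last_seen: datetime) -> str:
    """Build a Solr fq limiting results to docs at or after last_seen."""
    # Solr date syntax is ISO-8601 UTC with a trailing 'Z'
    return "timestamp:[%s TO NOW]" % last_seen.strftime("%Y-%m-%dT%H:%M:%SZ")

fq = window_fq(datetime(2010, 7, 19, 10, 30, 0))
print(fq)  # -> timestamp:[2010-07-19T10:30:00Z TO NOW]
```

Note the inclusive lower bound means a document timestamped exactly at last_seen can come back twice; deduplicate on the unique key if that matters.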
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/filter-query-on-timestamp-slowing-query-tp977280p978595.html
Sent from the Solr - User mailing list archive at Nabble.com.


2 solr dataImport requests on a single core at the same time

2010-07-19 Thread kishan

Hi,
I am using Solr 1.4.

I am using a single-core Solr, and I always do a full-import.

My problem is that I have 2 clients. For client1 I have about 3 lakh (300,000)
records to index. While the Solr DataImport is running and indexing client1's
data, I send another DataImport request to index about 100 records for client2,
but the second request is not picked up, even after client1's data has been
indexed completely.

If I send the second request once the first one is completely done, the
indexing happens through DataImport; but while one request is processing, it
does not take the 2nd request. I can see the request params for the second
request in the log, though.

Please help me.

Thanks in advance.


 
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/2-solr-dataImport-requests-on-a-single-core-at-the-same-time-tp978649p978649.html
Sent from the Solr - User mailing list archive at Nabble.com.


slovene language support

2010-07-19 Thread Markus Goldbach
Hi,

I want to set up Solr with support for several languages.
The language list includes Slovene; unfortunately, I found nothing about it in 
the wiki.
Does anyone have experience with Solr 1.4 and Slovene?

thanks for help
Markus

Re: slovene language support

2010-07-19 Thread Robert Muir
Hello,

There is some information here (a prototype stemmer) about support in
Snowball, but Martin Porter had some unanswered questions/reservations, so
nothing ever got added to Snowball:

http://snowball.tartarus.org/archives/snowball-discuss/0725.html

Of course you could take that stemmer and generate java code with the
snowball code generator and use it, but it seems like it would be best for
those issues to get resolved and get it fixed/included in snowball itself...

On Mon, Jul 19, 2010 at 10:42 AM, Markus Goldbach  wrote:

> Hi,
>
> I want to setup an solr with support for several languages.
> The language list includes slovene, unfortunately I found nothing about it
> in the wiki.
> Has some one experiences with solr 1.4 and slovene?
>
> thanks for help
> Markus




-- 
Robert Muir
rcm...@gmail.com


stats on a field with no values

2010-07-19 Thread Jonathan Rochkind
When I use the stats component on a field that has no values in the 
result set (i.e., stats.missing == rowCount), I'd expect 'min' and 
'max' to be blank.


Instead, they seem to be the smallest and largest float values or 
something: min = 1.7976931348623157E308, max = 4.9E-324.


Is this a bug?

Jonathan
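For what it's worth, 1.7976931348623157E308 and 4.9E-324 are exactly Java's Double.MAX_VALUE and Double.MIN_VALUE, which suggests the stats component initializes min/max to those sentinels and never updates them when every document misses the field. The constants themselves can be checked from Python (they are IEEE-754 doubles, not language-specific):

```python
import sys

# 1.7976931348623157E308 is the largest finite IEEE-754 double
# (Java's Double.MAX_VALUE)
assert sys.float_info.max == 1.7976931348623157e308

# 4.9E-324 is the smallest positive subnormal double
# (Java's Double.MIN_VALUE), i.e. 2^-1074
assert 4.9e-324 == 5e-324 == 2 ** -1074
print("min/max are the double-precision sentinels")
```

So a client can at least detect the "no values" case by comparing against those sentinels (or by checking stats.missing == rowCount, as above).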


Wiki, login and password recovery

2010-07-19 Thread Markus Jelsma
Hi,

 

This probably should go to INFRA (to which I'm not subscribed) or something 
like that. Anyway, for some reason my user/pass won't let me log in anymore, and 
I'm quite sure my browser still `remembers` the correct combination. I'm unsure 
whether this is a bug: to get that answer, I need to recover my current 
password so I can check... But, how convenient, the password recovery mechanism 
`cannot connect with the mailserver on localhost ERRNO: 60` and times out.

 

Any assistance on this one?

 

Cheers,

 


How to get the list of all available fields in a (sharded) index

2010-07-19 Thread olivier sallou
Hi,
I cannot find any info on how to get the list of current fields in an index
(possibly sharded). With dynamic fields, I cannot simply parse the schema to
know which fields are available.
Is there any way to get it via a request (or easily programmatically)? I know
the information is available in one of the Lucene-generated files, but I'd like
to get it via a query for my whole index.

Thanks

Olivier


Re: Wiki, login and password recovery

2010-07-19 Thread Chris Hostetter

You don't need to subscribe to any infra lists to file an INFRA bug, just 
use Jira...

https://issues.apache.org/jira/browse/INFRA

Note that there was infra work this weekend that involved moving servers 
for the wiki system (as was noted in advance on 
http://monitoring.apache.org and http://twitter.com/infrabot) so maybe you 
just got unlucky with the timing?

  https://blogs.apache.org/infra/entry/new_hardware_for_apache_org 
  
: This probably should be in INFRA (to which i'm not subscribed) or 
: something like that. Anyway, for some reason, my user/pass won't let me 
: login anymore and i'm quite sure my browser still `remembers` the 
: correct combination. I'm unsure whether this is a bug: to get that 
: answer, i need to recover my current password so i can check... But, how 
: convenient, the password recovery mechanism `cannot connect with the 
: mailserver on localhost ERRNO: 60` and times out.



-Hoss



Re: Problem with Solr-Mailinglist

2010-07-19 Thread Chris Hostetter

: this  message for the fourth time to the Solr Mailinglist and everytime I
: get the following response from the Mailing-list's server:

I have no idea why it might be flagged as spam -- but many of the reasons 
why spam filters flag things have nothing to do with content and 
everything to do with the headers -- which we can't see just by looking 
at the copy on Nabble.

I would suggest you file a bug with the infra team and provide as many 
details as possible: the exact times you tried to send the 
message, the complete message you tried to send with all 
headers (especially the message-id), and the complete bounce reply you got 
with all headers and attachments.

https://issues.apache.org/jira/browse/INFRA 


-Hoss



Re: Problem with Solr-Mailinglist

2010-07-19 Thread Marvin Humphrey
On Mon, Jul 19, 2010 at 11:28:10AM -0700, Chris Hostetter wrote:
> 
> : this  message for the fourth time to the Solr Mailinglist and everytime I
> : get the following response from the Mailing-list's server:
> 
> I have no idea why it might be flaged as spam -- but many of the reasons 
> why spam filters flag things have nothing to do with content, and 
> everything to do with the headers -- which we can't see just byu looking 
> at the copy on nabble.

FWIW, the first response that infra gives when this comes up is "don't send
HTML mail", which almost always resolves the problem.  (HTML mail boosts spam
score compared to text mail).

Marvin Humphrey



Ranking based on term position

2010-07-19 Thread Papiya Misra

I need to make sure that documents with the search term occurring
towards the beginning of the document are ranked higher.

For example,

Search term : ox
Doc 1: box fox ox
Doc 2: ox box fox

Result: Doc2 will be ranked higher than Doc1.

The solution I can think of is sorting by term position (after enabling
term vectors). Is that the best way to go about it ?

Thanks
Papiya
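One client-side way to approximate this is to re-rank a small candidate set by the first position of the term, e.g. pulled from term vectors. A toy sketch of the idea, not how Solr itself scores (and it tokenizes with a plain split, not the analyzer Solr would use):

```python
def first_position(text: str, term: str) -> int:
    """Index of the first occurrence of term, or a large sentinel if absent."""
    tokens = text.lower().split()
    return tokens.index(term) if term in tokens else len(tokens)

docs = {"doc1": "box fox ox", "doc2": "ox box fox"}

# sort candidates so the document where the term appears earliest comes first
ranked = sorted(docs, key=lambda d: first_position(docs[d], "ox"))
print(ranked)  # -> ['doc2', 'doc1']
```

Doing this inside Solr would mean a custom function/similarity; sorting purely by position also discards the normal relevance score, so blending the two is usually wanted.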




RE: Re: Wiki, login and password recovery

2010-07-19 Thread Markus Jelsma
This happened just a few hours ago and the problem persists at this very 
moment. I filed an issue: https://issues.apache.org/jira/browse/INFRA-2884

 

Cheers!


 
-Original message-
From: Chris Hostetter 
Sent: Mon 19-07-2010 20:23
To: solr-user@lucene.apache.org; 
Subject: Re: Wiki, login and password recovery


You don't need to subscribe to any infra lists to file an INFRA bug, just 
use Jira...

https://issues.apache.org/jira/browse/INFRA

Note that there was infra work this weekend that involved moving servers 
for the wiki system (as was noted in advance on 
http://monitoring.apache.org and http://twitter.com/infrabot) so maybe you 
just got unluky with the timing?

 https://blogs.apache.org/infra/entry/new_hardware_for_apache_org 
 
: This probably should be in INFRA (to which i'm not subscribed) or 
: something like that. Anyway, for some reason, my user/pass won't let me 
: login anymore and i'm quite sure my browser still `remembers` the 
: correct combination. I'm unsure whether this is a bug: to get that 
: answer, i need to recover my current password so i can check... But, how 
: convenient, the password recovery mechanism `cannot connect with the 
: mailserver on localhost ERRNO: 60` and times out.



-Hoss


 


Autocomplete with NGrams

2010-07-19 Thread Frank A
I'm trying to follow the link below for setting up an auto complete/suggest
via NGrams:

http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/

I'm trying to do it within a single Solr instance, but since this index
isn't an index of the main documents (just a single term list), I don't have
the same primary keys.  The document talks about multiple cores; my questions
are: a) is this needed? and b) how do I properly set up a second core if it's
necessary.

Thanks.
Frank


RE: indexing best practices

2010-07-19 Thread Burton-West, Tom
Hi Ken,

This is all very dependent on your documents, your indexing setup and your 
hardware. Just as an extreme data point, I'll describe our experience.  

We run 5 clients on each of 6 machines to send documents to Solr using the 
standard http xml process.  Our documents contain about 10 fields, but one 
field contains OCR for the full text of a book.  The documents are about 700KB 
in size.

Each client sends solr documents to one of 10 solr shards on a round-robin 
basis.  We are running 5 shards on each of two dedicated indexing machines each 
with 144GB of memory and 2 x Quad Core Intel Xeon E5540 2.53GHz processors 
(Nehalem).  What we generally see is that once the index gets large enough for 
significant merging, our producers can send documents to solr faster than it 
can index them.

We suspect that our bottleneck is simply disk I/O for index merging on the Solr 
build machines.  We are currently experimenting with changing the 
maxRAMBufferSize settings and various merge policies/merge factors to see if we 
can speed up the Solr end of the indexing process.   Since we optimize our 
index down to two segments, we are also planning to experiment with using the 
"nomerge" merge policy. I hope to have some results to report on our blog 
sometime in the next  month or so.

Tom Burton-West
www.hathitrust.org/blogs
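The client-to-shard fan-out described above amounts to a simple round-robin over shard endpoints. A toy sketch (the hostnames and shard names below are made up, and a real client would POST each document batch to its endpoint):

```python
from collections import Counter
from itertools import cycle

# hypothetical indexing endpoints: 5 shards on each of 2 build machines
shards = ["http://build1:8983/solr/shard%d" % i for i in range(1, 6)] + \
         ["http://build2:8983/solr/shard%d" % i for i in range(6, 11)]

def assign(doc_ids, endpoints):
    """Distribute documents over the shard endpoints round-robin."""
    rr = cycle(endpoints)
    return [(doc_id, next(rr)) for doc_id in doc_ids]

assignments = assign(range(25), shards)
counts = Counter(endpoint for _, endpoint in assignments)
# 25 docs over 10 shards: five shards receive 3 docs, five receive 2
print(sorted(counts.values()))
```

Round-robin keeps shard sizes balanced but means a document's shard is arbitrary, so queries must fan out to all shards.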

-Original Message-
From: kenf_nc [mailto:ken.fos...@realestate.com] 
Sent: Sunday, July 18, 2010 8:18 AM
To: solr-user@lucene.apache.org
Subject: Re: indexing best practices


No one has done performance analysis? Or does anyone have a link to somewhere
it's been done?

Basically: what's the fastest way to get documents into Solr? So many options
are available:
1) file import (xml, csv)  vs  DIH  vs  POSTing
2) number of concurrent clients: 1 vs 10 vs 100 ... is there a point of
diminishing returns?

I have 16 million small (8 to 10 fields, no large text fields) docs that get
updated monthly and 2.5 million largish (20 to 30 fields, a couple html text
fields) that get updated monthly. It currently takes about 20 hours to do a
full import. I would like to cut that down as much as possible.
Thanks,
Ken
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/indexing-best-practices-tp973274p976313.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Problem with Solr-Mailinglist

2010-07-19 Thread MitchK

Thank you both.

I will do what Hoss suggested, tomorrow.
The mail was sent via the Nabble board and another time via my
Thunderbird client, both with the same result. So there was no more
HTML code in it than in any of my other postings.

Kind regards
- Mitch
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Problem-with-Solr-Mailinglist-tp978247p979602.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Autocomplete with NGrams

2010-07-19 Thread MitchK

Frank,

have a look at Solr's example directory and look for 'multicore'. There
you can see an example configuration for a multicore environment.

Kind regards,
- Mitch

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Autocomplete-with-NGrams-tp979312p979610.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Autocomplete with NGrams

2010-07-19 Thread Frank A
Just to confirm - does multicore sound like the right solution here?
Is it not possible to "serve" both use cases from one core?

Thanks.

On Mon, Jul 19, 2010 at 5:12 PM, MitchK  wrote:
>
> Frank,
>
> have a look at Solr's example-directory's and look for 'multicore'. There
> you can see an example-configuration for a multicore-environment.
>
> Kind regards,
> - Mitch
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Autocomplete-with-NGrams-tp979312p979610.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: How to patch SOLR-1966?

2010-07-19 Thread Erick Erickson
I don't think you need to go there. That patch has been committed
to the 3.x code line as well as the 1.4. You can get the 3x build here:
http://hudson.zones.apache.org/hudson/job/Solr-3.x/
Just try that...

Or you can get the latest build if you're working with 1.4 here:
http://hudson.zones.apache.org/hudson/job/Solr-trunk/lastSuccessfulBuild/artifact/trunk/solr/dist/

HTH
Erick

On Mon, Jul 19, 2010 at 6:17 AM, Chamnap Chhorn wrote:

> Hi,
>
> I'm a ruby developer, no background in Java at all. I need
> *exclusive=true*to work on elevation search component. However, it
> does need a patch,
> https://issues.apache.org/jira/browse/SOLR-1966. Anyone could present me a
> step by step in order to do that?
>
>
> --
> Chhorn Chamnap
> http://chamnapchhorn.blogspot.com/
>


Re: filter query on timestamp slowing query???

2010-07-19 Thread Erick Erickson
Here's my guess, and it's only a guess. I'm inferring that you're
updating your index between queries, so you may be reloading
your cache every time. Are you updating or adding documents
between queries and if so, how?

If this is vaguely on target, have you tried firing up warmup queries
after you update your index that involve your timestamp?

Best
Erick



On Mon, Jul 19, 2010 at 10:18 AM, oferiko  wrote:

>
> 1.I query my index once every 30 minutes. I save the timestamp of the
> newest
> returned document. next time i query doe documents with timestamp between
> the timestamp i saved from the previous query and NOW.
>
> 2.Sad to day it is not optimized, i'm at 60% of the disk space, and waiting
> for another disk to be added before i can optimize.
>
> 3.it is a simple myStrField:foo
>
> thanks for helping
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/filter-query-on-timestamp-slowing-query-tp977280p978595.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: How to patch SOLR-1966?

2010-07-19 Thread Chamnap Chhorn
I'm using Solr 1.4, but exclusive=true doesn't work for me at all. I
wonder why that is?

Any ideas?

On Tue, Jul 20, 2010 at 6:40 AM, Erick Erickson wrote:

> I don't think you need to go there. That patch has been committed
> to the 3.x code line as well as the 1.4. You can get the 3x build here:
> http://hudson.zones.apache.org/hudson/job/Solr-3.x/
> Just try that...
>
> Or you can get the latest build if you're working with 1.4 here:
>
> http://hudson.zones.apache.org/hudson/job/Solr-trunk/lastSuccessfulBuild/artifact/trunk/solr/dist/
>
> HTH
> Erick
>
> On Mon, Jul 19, 2010 at 6:17 AM, Chamnap Chhorn  >wrote:
>
> > Hi,
> >
> > I'm a ruby developer, no background in Java at all. I need
> > *exclusive=true*to work on elevation search component. However, it
> > does need a patch,
> > https://issues.apache.org/jira/browse/SOLR-1966. Anyone could present me
> a
> > step by step in order to do that?
> >
> >
> > --
> > Chhorn Chamnap
> > http://chamnapchhorn.blogspot.com/
> >
>



-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: How to patch SOLR-1966?

2010-07-19 Thread Erick Erickson
Not 1.4 released, 1.4 last successful build. I don't think the 1.4 release
has the patch applied.

See the second link I originally provided.

HTH
Erick

On Mon, Jul 19, 2010 at 9:15 PM, Chamnap Chhorn wrote:

> I'm using Solr 1.4, but the exclusive=true doesn't work for me at all. I
> wonder how that is ?
>
> Any ideas?
>
> On Tue, Jul 20, 2010 at 6:40 AM, Erick Erickson  >wrote:
>
> > I don't think you need to go there. That patch has been committed
> > to the 3.x code line as well as the 1.4. You can get the 3x build here:
> > http://hudson.zones.apache.org/hudson/job/Solr-3.x/
> > Just try that...
> >
> > Or you can get the latest build if you're working with 1.4 here:
> >
> >
> http://hudson.zones.apache.org/hudson/job/Solr-trunk/lastSuccessfulBuild/artifact/trunk/solr/dist/
> >
> > HTH
> > Erick
> >
> > On Mon, Jul 19, 2010 at 6:17 AM, Chamnap Chhorn  > >wrote:
> >
> > > Hi,
> > >
> > > I'm a ruby developer, no background in Java at all. I need
> > > *exclusive=true*to work on elevation search component. However, it
> > > does need a patch,
> > > https://issues.apache.org/jira/browse/SOLR-1966. Anyone could present
> me
> > a
> > > step by step in order to do that?
> > >
> > >
> > > --
> > > Chhorn Chamnap
> > > http://chamnapchhorn.blogspot.com/
> > >
> >
>
>
>
> --
> Chhorn Chamnap
> http://chamnapchhorn.blogspot.com/
>


Re: How to patch SOLR-1966?

2010-07-19 Thread Chamnap Chhorn
Ah, I get what you mean. One more thing: I wonder whether the patch SOLR-1147
is included in the last successful build or not?


Thanks,
Chamnap

On Tue, Jul 20, 2010 at 8:26 AM, Erick Erickson wrote:

> Not 1.4 released, 1.4 last successful build. I don't think the 1.4 release
> has the patch applied.
>
> See the second link I originally provided.
>
> HTH
> Erick
>
> On Mon, Jul 19, 2010 at 9:15 PM, Chamnap Chhorn  >wrote:
>
> > I'm using Solr 1.4, but the exclusive=true doesn't work for me at all. I
> > wonder how that is ?
> >
> > Any ideas?
> >
> > On Tue, Jul 20, 2010 at 6:40 AM, Erick Erickson  > >wrote:
> >
> > > I don't think you need to go there. That patch has been committed
> > > to the 3.x code line as well as the 1.4. You can get the 3x build here:
> > > http://hudson.zones.apache.org/hudson/job/Solr-3.x/
> > > Just try that...
> > >
> > > Or you can get the latest build if you're working with 1.4 here:
> > >
> > >
> >
> http://hudson.zones.apache.org/hudson/job/Solr-trunk/lastSuccessfulBuild/artifact/trunk/solr/dist/
> > >
> > > HTH
> > > Erick
> > >
> > > On Mon, Jul 19, 2010 at 6:17 AM, Chamnap Chhorn <
> chamnapchh...@gmail.com
> > > >wrote:
> > >
> > > > Hi,
> > > >
> > > > I'm a ruby developer, no background in Java at all. I need
> > > > *exclusive=true*to work on elevation search component. However, it
> > > > does need a patch,
> > > > https://issues.apache.org/jira/browse/SOLR-1966. Anyone could
> present
> > me
> > > a
> > > > step by step in order to do that?
> > > >
> > > >
> > > > --
> > > > Chhorn Chamnap
> > > > http://chamnapchhorn.blogspot.com/
> > > >
> > >
> >
> >
> >
> > --
> > Chhorn Chamnap
> > http://chamnapchhorn.blogspot.com/
> >
>


Re: How to patch SOLR-1966?

2010-07-19 Thread Chamnap Chhorn
The other thing I want to ask: is the latest build of Solr stable? I'm afraid
it might bring some other problems to my system.

Thanks,
Chamnap

On Tue, Jul 20, 2010 at 8:41 AM, Chamnap Chhorn wrote:

> Ah, I get what you mean. One more thing, I wonder the patch SOLR-1147 has
> included in the last successful build or not?
>
>
> Thanks,
> Chamnap
>
>
> On Tue, Jul 20, 2010 at 8:26 AM, Erick Erickson 
> wrote:
>
>> Not 1.4 released, 1.4 last successful build. I don't think the 1.4 release
>> has the patch applied.
>>
>> See the second link I originally provided.
>>
>> HTH
>> Erick
>>
>> On Mon, Jul 19, 2010 at 9:15 PM, Chamnap Chhorn > >wrote:
>>
>> > I'm using Solr 1.4, but the exclusive=true doesn't work for me at all. I
>> > wonder how that is ?
>> >
>> > Any ideas?
>> >
>> > On Tue, Jul 20, 2010 at 6:40 AM, Erick Erickson <
>> erickerick...@gmail.com
>> > >wrote:
>> >
>> > > I don't think you need to go there. That patch has been committed
>> > > to the 3.x code line as well as the 1.4. You can get the 3x build
>> here:
>> > > http://hudson.zones.apache.org/hudson/job/Solr-3.x/
>> > > Just try that...
>> > >
>> > > Or you can get the latest build if you're working with 1.4 here:
>> > >
>> > >
>> >
>> http://hudson.zones.apache.org/hudson/job/Solr-trunk/lastSuccessfulBuild/artifact/trunk/solr/dist/
>> > >
>> > > HTH
>> > > Erick
>> > >
>> > > On Mon, Jul 19, 2010 at 6:17 AM, Chamnap Chhorn <
>> chamnapchh...@gmail.com
>> > > >wrote:
>> > >
>> > > > Hi,
>> > > >
>> > > > I'm a ruby developer, no background in Java at all. I need
>> > > > *exclusive=true*to work on elevation search component. However, it
>> > > > does need a patch,
>> > > > https://issues.apache.org/jira/browse/SOLR-1966. Anyone could
>> present
>> > me
>> > > a
>> > > > step by step in order to do that?
>> > > >
>> > > >
>> > > > --
>> > > > Chhorn Chamnap
>> > > > http://chamnapchhorn.blogspot.com/
>> > > >
>> > >
>> >
>> >
>> >
>> > --
>> > Chhorn Chamnap
>> > http://chamnapchhorn.blogspot.com/
>> >
>>
>


-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: How to patch SOLR-1966?

2010-07-19 Thread Chamnap Chhorn
Is there an easy way to apply this patch to the Solr 1.4 release on my
system? I will be using it on a production server.

Thanks,
Chamnap

On Tue, Jul 20, 2010 at 8:49 AM, Chamnap Chhorn wrote:

> The other thing I want to ask is the latest build of solr is stable or not?
> I'm afraid it might bring some other problems to my system.
>
> Thanks,
> Chamnap
>
>
> On Tue, Jul 20, 2010 at 8:41 AM, Chamnap Chhorn 
> wrote:
>
>> Ah, I get what you mean. One more thing, I wonder the patch SOLR-1147 has
>> included in the last successful build or not?
>>
>>
>> Thanks,
>> Chamnap
>>
>>
>> On Tue, Jul 20, 2010 at 8:26 AM, Erick Erickson 
>> wrote:
>>
>>> Not 1.4 released, 1.4 last successful build. I don't think the 1.4
>>> release
>>> has the patch applied.
>>>
>>> See the second link I originally provided.
>>>
>>> HTH
>>> Erick
>>>
>>> On Mon, Jul 19, 2010 at 9:15 PM, Chamnap Chhorn >> >wrote:
>>>
>>> > I'm using Solr 1.4, but the exclusive=true doesn't work for me at all.
>>> I
>>> > wonder how that is ?
>>> >
>>> > Any ideas?
>>> >
>>> > On Tue, Jul 20, 2010 at 6:40 AM, Erick Erickson <
>>> erickerick...@gmail.com
>>> > >wrote:
>>> >
>>> > > I don't think you need to go there. That patch has been committed
>>> > > to the 3.x code line as well as the 1.4. You can get the 3x build
>>> here:
>>> > > http://hudson.zones.apache.org/hudson/job/Solr-3.x/
>>> > > Just try that...
>>> > >
>>> > > Or you can get the latest build if you're working with 1.4 here:
>>> > >
>>> > >
>>> >
>>> http://hudson.zones.apache.org/hudson/job/Solr-trunk/lastSuccessfulBuild/artifact/trunk/solr/dist/
>>> > >
>>> > > HTH
>>> > > Erick
>>> > >
>>> > > On Mon, Jul 19, 2010 at 6:17 AM, Chamnap Chhorn <
>>> chamnapchh...@gmail.com
>>> > > >wrote:
>>> > >
>>> > > > Hi,
>>> > > >
>>> > > > I'm a ruby developer, no background in Java at all. I need
>>> > > > *exclusive=true*to work on elevation search component. However, it
>>> > > > does need a patch,
>>> > > > https://issues.apache.org/jira/browse/SOLR-1966. Anyone could
>>> present
>>> > me
>>> > > a
>>> > > > step by step in order to do that?
>>> > > >
>>> > > >
>>> > > > --
>>> > > > Chhorn Chamnap
>>> > > > http://chamnapchhorn.blogspot.com/
>>> > > >
>>> > >
>>> >
>>> >
>>> >
>>> > --
>>> > Chhorn Chamnap
>>> > http://chamnapchhorn.blogspot.com/
>>> >
>>>
>>
>
>
> --
> Chhorn Chamnap
> http://chamnapchhorn.blogspot.com/
>



-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: Spatial filtering

2010-07-19 Thread Lance Norskog
Add the debugQuery=true parameter and it will show you the Lucene
query tree, and how each document is evaluated. This can help with the
more complex queries.

On Mon, Jul 19, 2010 at 3:35 AM, Olivier Ricordeau
 wrote:
> Hi folks,
>
> I can't manage to have the new spatial filtering feature (added in r962727
> by Grant Ingersoll, see https://issues.apache.org/jira/browse/SOLR-1568)
> working. I'm trying to get all the documents located within a circle defined
> by its center and radius.
> I've modified my query url as specified in
> http://wiki.apache.org/solr/SpatialSearch#Spatial_Filter_QParser to add the
> "pt", "d" and "meas" parameters. Here is what my query parameters looks like
> (from Solr's response with debug mode activated):
>
> [params] => Array
>                (
>                    [explainOther] => true
>                    [mm] => 2<-75%
>                    [d] => 50
>                    [sort] => date asc
>                    [qf] =>
>                    [wt] => php
>                    [rows] => 5000
>                    [version] => 2.2
>                    [fl] => object_type object_id score
>                    [debugQuery] => true
>                    [start] => 0
>                    [q] => *:*
>                    [meas] => hsin
>                    [pt] => 48.85341,2.3488
>                    [bf] =>
>                    [qt] => standard
>                    [fq] => +object_type:Concert +date:[2010-07-19T00:00:00Z
> TO 2011-07-19T23:59:59Z]
>                )
>
>
>
> With this query, I get 3859 results. And some (lots) of the found documents
> are not located whithin the circle! :(
> If I run the same query without spatial filtering (if I remove the "pt", "d"
> and "meas" parameters from the url), I get 3859 results too. So it looks
> like my spatial filtering constraint is not taken into account in the first
> search query (the one where "pt", "d" and "meas" are set). Is the wiki's doc
> up to date?
>
> In the comments of SOLR-1568, I've seen someone talking about adding
> "{!sfilt fl=latlon_field_name}". So I tried the following request:
>
> [params] => Array
>                (
>                    [explainOther] => true
>                    [mm] => 2<-75%
>                    [d] => 50
>                    [sort] => date asc
>                    [qf] =>
>                    [wt] => php
>                    [rows] => 5000
>                    [version] => 2.2
>                    [fl] => object_type object_id score
>                    [debugQuery] => true
>                    [start] => 0
>                    [q] => *:*
>                    [meas] => hsin
>                    [pt] => 48.85341,2.3488
>                    [bf] =>
>                    [qt] => standard
>                    [fq] => +object_type:Concert +date:[2010-07-19T00:00:00Z
> TO 2011-07-19T23:59:59Z] +{!sfilt fl=coords_lat_lon,units=km,meas=hsin}
>                )
>
> This leads to 2713 results (which is smaller than 3859, good). But some
> (lots) of the results are once more out of the circle :(
>
> Can someone help me get spatial filtering working? I really don't understand
> the search results I'm getting.
>
> Cheers,
> Olivier
>
> --
> - *Olivier RICORDEAU* -
>  oliv...@ricordeau.org
> http://olivier.ricordeau.org
>
>



-- 
Lance Norskog
goks...@gmail.com
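Independent of Solr, the haversine (hsin) measure the query asks for can be recomputed client-side to check which returned documents really fall inside the circle. This is a standalone sketch (the Earth radius constant and the sample document coordinate are assumptions, not values from the thread):

```java
public class HaversineCheck {
    static final double EARTH_RADIUS_KM = 6371.0; // mean Earth radius (assumption)

    // Great-circle (haversine) distance in km between two lat/lon points.
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                   * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    public static void main(String[] args) {
        // Center point and radius from the query in the thread.
        double ptLat = 48.85341, ptLon = 2.3488, d = 50.0;
        // A hypothetical document coordinate to check against the filter.
        double docLat = 48.80, docLon = 2.12;
        double dist = haversineKm(ptLat, ptLon, docLat, docLon);
        System.out.println(dist <= d ? "inside filter radius" : "outside filter radius");
    }
}
```

Running this over the returned coords_lat_lon values shows which hits are genuinely within d of pt, which makes it easy to tell whether the filter is being applied at all.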


Re: Ranking based on term position

2010-07-19 Thread Li Li
I have considered this problem and tried to solve it using two methods.
With these methods, we can also boost a doc by the relative positions of
the query terms.

1: add term positions when indexing, then modify TermScorer.score:

  public float score() {
    assert doc != -1;
    int f = freqs[pointer];
    float raw =   // compute tf(f)*weight
      f < SCORE_CACHE_SIZE        // check cache
      ? scoreCache[f]             // cache hit
      : getSimilarity().tf(f)*weightValue;  // cache miss
    // modified by LiLi
    try {
      int[] positions = this.getPositions(f);
      float positionBoost = 1.0f;
      for (int pos : positions) {
        positionBoost *= this.getPositionBoost(pos);
      }
      raw *= positionBoost;
    } catch (IOException e) {
    }
    // modified
    return norms == null ? raw : raw * SIM_NORM_DECODER[norms[doc] & 0xFF]; // normalize for field
  }


  private int[] getPositions(int f) throws IOException {
    termPositions.skipTo(doc);
    int[] positions = new int[f];
    int docId = termPositions.doc();
    assert docId == doc;
    int tf = termPositions.freq();
    assert tf == f;
    // collect the f positions of the term in the current doc
    for (int i = 0; i < f; i++) {
      positions[i] = termPositions.nextPosition();
    }
    return positions;
  }
> I need to make sure that documents with the search term occurring
> towards the beginning of the document are ranked higher.
>
> For example,
>
> Search term : ox
> Doc 1: box fox ox
> Doc 2: ox box fox
>
> Result: Doc2 will be ranked higher than Doc1.
>
> The solution I can think of is sorting by term position (after enabling
> term vectors). Is that the best way to go about it ?
>
> Thanks
> Papiya
>
>
> Pink OTC Markets Inc. provides the leading inter-dealer quotation and
> trading system in the over-the-counter (OTC) securities market.   We create
> innovative technology and data solutions to efficiently connect market
> participants, improve price discovery, increase issuer disclosure, and
> better inform investors.   Our marketplace, comprised of the issuer-listed
> OTCQX and broker-quoted   Pink Sheets, is the third largest U.S. equity
> trading venue for company shares.
>
> This document contains confidential information of Pink OTC Markets and is
> only intended for the recipient.   Do not copy, reproduce (electronically or
> otherwise), or disclose without the prior written consent of Pink OTC
> Markets.      If you receive this message in error, please destroy all
> copies in your possession (electronically or otherwise) and contact the
> sender above.
>
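For reference, the getPositionBoost hook used in the patch above isn't shown in the mail; a minimal decay function of the kind it would need might look like this (the exact decay shape here is an assumption, not from the original message):

```java
public class PositionBoost {
    // Earlier occurrences contribute a larger multiplier; as pos grows the
    // boost tends toward 1.0, so late occurrences neither help nor hurt much.
    static float getPositionBoost(int pos) {
        return 1.0f + 1.0f / (1.0f + pos);
    }

    public static void main(String[] args) {
        // "ox" at position 0 (doc 2) vs position 2 (doc 1) from the example:
        System.out.println(getPositionBoost(0) > getPositionBoost(2)); // earlier position scores higher
    }
}
```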


Re: filter query on timestamp slowing query???

2010-07-19 Thread Chris Hostetter
: updating your index between queries, so you may be reloading
: your cache every time. Are you updating or adding documents
: between queries and if so, how?
: 
: If this is vaguely on target, have you tried firing up warmup queries
: after you update your index that involve your timestamp?

Based on the usecase, I'm not sure that that will really help -- it sounds 
like the range query is always based on the exact timestamp of the most 
recent doc from the last time this particular query was run -- which means 
by definition that that timestamp changes every time the query is 
run, so caching it is useless.

Skimming the thread, it seems like the OP isn't using TrieDateField 
(when asked if he was, he posted a followup about precision if he did use 
it -- implying he is not currently).  Switching to TrieDateField is 
probably the only improvement likely to make a significant 
difference in speeding up these one-time queries.

I've also seen no explanation of how big the index is, or what the OP's 
definition of "slow" is (how fast are the queries with and w/o these 
filters?).  That type of information is fairly critical to being able to 
offer performance suggestions.

I'm also suspicious of the entire line of questioning -- it smells like 
there might be an XY Problem here.  Knowing the ultimate goal that 
led to this timestamp-based filter query approach might help us suggest 
an alternate/better/faster solution.



-Hoss



Re: setting up clustering

2010-07-19 Thread Chris Hostetter
: I'm trying to enable clustering in solr 1.4. I'm following these instructions:
: 
: http://wiki.apache.org/solr/ClusteringComponent
: 
: However, `ant get-libraries` fails for me. Before it tries to download
: the 4 jar files, it tries to compile lucene? Is this necessary?

FWIW: in the future please post the actual commands you run and the output 
you receive so we can understand exactly what you are talking about.

Looking at the build.xml file I see that the clustering "get-libraries" 
ant target depends on the "init" target (for reasons I don't understand), 
and init then wants to compile *solr* (again, for reasons I don't understand).

Is that perhaps what you are seeing?  that get-libraries is compiling 
solr? (not all of lucene)

I've opened a bug to improve the situation in future releases...
https://issues.apache.org/jira/browse/SOLR-2007

..however there are other things about your email that don't make sense to 
me...

: My next attempt was to just copy contrib/clustering/lib/*.jar and
: contrib/clustering/lib/downloads/*.jar to WEB-INF/lib and enable

1) you shouldn't ever need to copy jars into WEB-INF/lib.  solr makes it 
   very easy to load jars with plugins...
 http://wiki.apache.org/solr/SolrPlugins#How_to_Load_Plugins
 http://wiki.apache.org/solr/SolrConfigXml#lib
   ...the example solrconfig.xml file already has a  directive that 
   points at that downloads directory.

2) if the jars didn't get downloaded, they aren't going to be there for 
   you to copy, so there's not much point in doing this -- but your 
   statement implies you have something to copy in the downloads/  
   directory ... do you?  what exactly is in that downloads directory?

: SEVERE: org.apache.solr.common.SolrException: Error loading class
: 'org.apache.solr.handler.clustering.ClusteringComponent'

Once again: full error messages are necessary.  All of the other details 
of that error that come after that line are crucial to knowing *why* the 
component didn't load.

If you have those jars in the downloads directory, then you should have 
everything you need; it's possible there is another reason for this error.


(details, details, details)

-Hoss
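For reference, the `<lib/>` directives mentioned above look like this in solrconfig.xml (the directory paths are illustrative and depend on where the contrib jars were unpacked):

```xml
<config>
  <!-- load the clustering contrib jars and their downloaded dependencies
       instead of copying them into WEB-INF/lib -->
  <lib dir="../../contrib/clustering/lib/" regex=".*\.jar" />
  <lib dir="../../contrib/clustering/lib/downloads/" regex=".*\.jar" />
  <!-- ... rest of config ... -->
</config>
```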



dismax request handler without q

2010-07-19 Thread Chamnap Chhorn
I wonder how I could make a query to return only *all books* that have the
keyphrase "web development" using the dismax handler? A book has multiple
keyphrases (keyphrase is a multivalued column). Do I have to pass the q
parameter?


Is it the correct one?
http://locahost:8081/solr/select?&q=hotel&fq=keyphrase:%20hotel

-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: dismax request handler without q

2010-07-19 Thread olivier sallou
Hi,
this is not very clear; if you need to query only keyphrase, why don't you
query it directly? e.g. q=keyphrase:hotel
Furthermore, why dismax if only the keyphrase field is of interest? dismax is
used to query multiple fields automatically.

Also, dismax does not appear in your query (via the query type). Is it set in
your config as the default request handler?

2010/7/20 Chamnap Chhorn 

> I wonder how could i make a query to return only *all books* that has
> keyphrase "web development" using dismax handler? A book has multiple
> keyphrases (keyphrase is multivalued column). Do I have to pass q
> parameter?
>
>
> Is it the correct one?
> http://locahost:8081/solr/select?&q=hotel&fq=keyphrase:%20hotel
>
> --
> Chhorn Chamnap
> http://chamnapchhorn.blogspot.com/
>
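As a side note on the thread: with dismax, q is treated as plain user text, so a field-restricted clause like keyphrase:"web development" usually goes into fq instead. A small sketch of assembling such a URL (the host, port, and handler name are assumptions):

```java
import java.net.URLEncoder;

public class DismaxUrl {
    // Build a dismax query URL: q stays plain user input, while the
    // exact-field restriction goes into fq as a standard Lucene clause.
    static String buildQueryUrl(String userQuery, String keyphrase) throws Exception {
        String q = URLEncoder.encode(userQuery, "UTF-8");
        String fq = URLEncoder.encode("keyphrase:\"" + keyphrase + "\"", "UTF-8");
        // host, port, and qt value are illustrative
        return "http://localhost:8081/solr/select?qt=dismax&q=" + q + "&fq=" + fq;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(buildQueryUrl("web development", "web development"));
    }
}
```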


Re: Solr Index corruption

2010-07-19 Thread benedictdsilva

After a much closer look at the logs, I found these errors:

It looks like the errors started on the 3rd day after my server came up.
After the initial Java heap space error at Jul 15, 2010 9:46:34 AM,
Solr did run for a while, until Jul 15, 2010 8:13:33 PM, when it
started failing due to the missing file.

What could cause this?


catalina.2010-07-15.log:SEVERE: java.lang.OutOfMemoryError: Java heap space
Jul 15, 2010 9:46:34 AM org.apache.solr.common.SolrException log
SEVERE: java.lang.OutOfMemoryError: Java heap space
at
org.apache.lucene.search.FieldCacheImpl$StringIndexCache.createValue(FieldCacheImpl.java:658)
at
org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:205)
at
org.apache.lucene.search.FieldCacheImpl.getStringIndex(FieldCacheImpl.java:646)
at
org.apache.lucene.search.FieldComparator$StringOrdValComparator.setNextReader(FieldComparator.java:667)
at
org.apache.lucene.search.TopFieldCollector$OneComparatorNonScoringCollector.setNextReader(TopFieldCollector.java:91)
at
org.apache.solr.search.DocSetDelegateCollector.setNextReader(DocSetHitCollector.java:140)
at
org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:251)
at org.apache.lucene.search.Searcher.search(Searcher.java:173)
at
org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1101)
at
org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:880)
at
org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:341)
at
org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:174)
at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1299)
at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
at
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849)
at
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
at
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454)
at java.lang.Thread.run(Thread.java:619)


catalina.2010-07-15.log:SEVERE: Exception invoking periodic operation: 
catalina.2010-07-15.log:SEVERE: auto commit error...
catalina.2010-07-15.log:SEVERE: auto commit error...
catalina.2010-07-15.log:SEVERE: auto commit error...
catalina.2010-07-15.log:SEVERE: auto commit error...
catalina.2010-07-15.log:SEVERE: auto commit error...
catalina.2010-07-15.log:SEVERE: auto commit error...
catalina.2010-07-15.log:SEVERE: auto commit error...


Jul 15, 2010 7:10:31 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.OutOfMemoryError: Java heap space

Jul 15, 2010 7:10:35 PM org.apache.catalina.connector.CoyoteAdapter service
SEVERE: An exception or error occurred in the container during the request
processing
java.lang.OutOfMemoryError: Java heap space
Jul 15, 2010 7:10:39 PM org.apache.coyote.http11.Http11Processor process
SEVERE: Error finishing response
java.lang.OutOfMemoryError: Java heap space
Jul 15, 2010 7:10:42 PM org.apache.solr.update.processor.LogUpdateProcessor
finish
INFO: {} 0 2039
Jul 15, 2010 7:10:43 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.OutOfMemoryError: Java heap space

Jul 15, 2010 7:10:47 PM org.apache.solr.update.processor.LogUpdateProcessor
finish
INFO: {} 0 14741
Jul 15, 2010 7:10:53 PM org.apache.catalina.connector.CoyoteAdapter service
SEVERE: An exception or error occurred in the container during the request
processing
java.lang.OutOfMemoryError: Java heap space
Jul 15, 2010 7:10:55 PM org.apache.coyote.http11.Http11Processor process
SEVERE: Error finishing response
java.lang.OutOfMemoryError: Java heap space
Jul 15, 2010 7:10:49 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.OutOfMemoryError: Java heap space

Jul 15, 2010 7:11:10 PM org.apache.solr.update.processor.LogUpdateP

Re: dismax request handler without q

2010-07-19 Thread Chamnap Chhorn
There is some default configuration in my solrconfig.xml that I didn't show
you. I'm a little confused after reading
http://wiki.apache.org/solr/DisMaxRequestHandler#q. I think q is meant for
plain user input query text.

On Tue, Jul 20, 2010 at 12:08 PM, olivier sallou
wrote:

> Hi,
> this is not very clear, if you need to query only keyphrase, why don't you
> query directly it? e.g. q=keyphrase:hotel ?
> Furthermore, why dismax if only keyphrase field is of interest? dismax is
> used to query multiple fields automatically.
>
> At least dismax do not appear in your query (using query type). It is set
> in
> your config for your default request handler?
>
> 2010/7/20 Chamnap Chhorn 
>
> > I wonder how could i make a query to return only *all books* that has
> > keyphrase "web development" using dismax handler? A book has multiple
> > keyphrases (keyphrase is multivalued column). Do I have to pass q
> > parameter?
> >
> >
> > Is it the correct one?
> > http://locahost:8081/solr/select?&q=hotel&fq=keyphrase:%20hotel
> >
> > --
> > Chhorn Chamnap
> > http://chamnapchhorn.blogspot.com/
> >
>



-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: dismax request handler without q

2010-07-19 Thread Chamnap Chhorn
I can't put q=keyphrase:hotel in my request using the dismax handler. It
returns no results.

On Tue, Jul 20, 2010 at 1:19 PM, Chamnap Chhorn wrote:

> There are some default configuration on my solrconfig.xml that I didn't
> show you. I'm a little confused when reading
> http://wiki.apache.org/solr/DisMaxRequestHandler#q. I think q is for plain
> user input query.
>
>
> On Tue, Jul 20, 2010 at 12:08 PM, olivier sallou  > wrote:
>
>> Hi,
>> this is not very clear, if you need to query only keyphrase, why don't you
>> query directly it? e.g. q=keyphrase:hotel ?
>> Furthermore, why dismax if only keyphrase field is of interest? dismax is
>> used to query multiple fields automatically.
>>
>> At least dismax do not appear in your query (using query type). It is set
>> in
>> your config for your default request handler?
>>
>> 2010/7/20 Chamnap Chhorn 
>>
>> > I wonder how could i make a query to return only *all books* that has
>> > keyphrase "web development" using dismax handler? A book has multiple
>> > keyphrases (keyphrase is multivalued column). Do I have to pass q
>> > parameter?
>> >
>> >
>> > Is it the correct one?
>> > http://locahost:8081/solr/select?&q=hotel&fq=keyphrase:%20hotel
>> >
>> > --
>> > Chhorn Chamnap
>> > http://chamnapchhorn.blogspot.com/
>> >
>>
>
>
>
> --
> Chhorn Chamnap
> http://chamnapchhorn.blogspot.com/
>



-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: Finding distinct unique IDs in documents returned by fq -- Urgent Help Req

2010-07-19 Thread Ninad Raut
Hi,

Also, the collapsing feature doesn't give the count of records returned
(grouped by a field value). It gives the count of the hits for the
query. This is really not useful when it comes to pagination.

Is there a way, at least with collapsing, to get the count of actual
records returned rather than the hit count?

On Mon, Jul 19, 2010 at 7:32 PM, kenf_nc  wrote:

>
> Oh, okay. Got it now. Unfortunately I don't believe Solr supplies a total
> count of matching facet values. One way to do this, although performance
> may
> suffer, is to set your limit to -1 and just get back everything, that will
> give you the count. You may want to set mincount to 1 so you aren't
> counting
> facet values that aren't in your query, but that really depends on your
> need.
>
> ...&facet.limit=-1&facet.mincount=1
>
> adding that to any facet query will return all matching facet values.
> Depending on how many unique values you have, this could be a lot. But it
> will give you what you are looking for. Unless your data changes
> frequently,
> maybe you can call it once and cache the results for some period of time.
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Finding-distinct-unique-IDs-in-documents-returned-by-fq-Urgent-Help-Req-tp971883p978548.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
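Since Solr doesn't return the total number of distinct facet values, the client has to count the entries itself after querying with facet.limit=-1&facet.mincount=1. A self-contained sketch that counts the entries in a facet block shaped like the one at the start of this thread (standard XML response writer assumed):

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import java.io.ByteArrayInputStream;

public class FacetCount {
    // Counts the <int name="..."> entries inside a facet field's <lst> block.
    static int countFacetValues(String lstXml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new ByteArrayInputStream(lstXml.getBytes("UTF-8")));
        return doc.getElementsByTagName("int").getLength();
    }

    public static void main(String[] args) throws Exception {
        // Facet block shaped like the one in the first message of the thread.
        String lst = "<lst name=\"name\">"
            + "<int name=\"Canon USA\">17</int>"
            + "<int name=\"Olympus\">12</int>"
            + "<int name=\"Sony\">12</int>"
            + "<int name=\"Panasonic\">9</int>"
            + "<int name=\"Nikon\">4</int>"
            + "</lst>";
        System.out.println(countFacetValues(lst)); // 5 distinct names
    }
}
```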


Re: filter query on timestamp slowing query???

2010-07-19 Thread oferiko


Chris Hostetter-3 wrote:
> 
> : updating your index between queries, so you may be reloading
> : your cache every time. Are you updating or adding documents
> : between queries and if so, how?
> : 
> : If this is vaguely on target, have you tried firing up warmup queries
> : after you update your index that involve your timestamp?
> 
> based on the usecase, i'm not sure that that will really help -- it sounds 
> like the range query is alwasy based on the exact timestamp of the most 
> recent doc from the lat time this particular query was run -- which means 
> by definition that that timestamp changes every time the query is 
> run, so caching it is useless.
> 
> skimming the thread, it's seems like the OP isn't using TrieDateField 
> (when asked if he was, he posted a followup about precision if he did use 
> it -- implying he is not currently).  Switching to TrieDateField is 
> probably the only thing improvement possibly to make a significant 
> differnece in speeding up these one time queries.
> 
> i've also seen no explanation of how big the index is, or what the OP's 
> definition of "slow" is (how fast are the queries with and w/o these 
> filters?).  that type of information is fairly critical to being able to 
> offer performance suggestions.
> 
> I'm also suspicious of hte entire line of questioning -- it smells like 
> there might be an XY Problem here.  knowing what the ultimate goal that 
> lead to this timestamp based filter query appraoch might help us suggest 
> an alternate/better/faster solution.
> 
> 
> 
> -Hoss
> 
> 
> 
You are correct. First of all, I haven't moved yet to TrieDateField; I
am still waiting to find out a bit more information about it, and there's
not a lot of info other than what's in the xml file.
Second, I also think caching is not my problem, as the queries usually
cover different time ranges.
The index is pretty big: right now we have around 700M documents, and its
size on disk is about 600GB.
More than half of the documents are pretty short, 10-20 words; the others
are around 300 words.

I'll explain my use case, so you'll know a bit more. I have an index that's
being updated regularly (every second I get 10 to 50 new documents, most
of them small).

Every 30 minutes, I ask the index which documents that match a certain
criterion were added to it since the last time I queried it.
From time to time, once a week or so, I ask the index for ALL the documents
that match that criterion. (I also do this not just for one query, but for
several.)
This is why I need the timestamp filter.

The queries that don't have any time range take a few seconds to finish,
while the ones with a time range take a few minutes.
Hope that helps in understanding my situation; I am open to any suggestion
on how to change the way things work, if it will improve performance.

Thank you guys
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/filter-query-on-timestamp-slowing-query-tp977280p980526.html
Sent from the Solr - User mailing list archive at Nabble.com.
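For the incremental-fetch pattern described above, one mitigation worth trying besides TrieDateField is rounding the range endpoints down to a coarse unit (and using Solr date math like NOW/MINUTE for the upper bound) so that repeated filters become cache-identical. A sketch of producing a minute-rounded Solr date string (UTC, ISO-8601, as Solr expects):

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class RoundedRange {
    // Format a timestamp rounded down to the minute, in Solr's date syntax.
    static String solrDateRoundedToMinute(long epochMillis) {
        long rounded = (epochMillis / 60000L) * 60000L;  // drop seconds/millis
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(new Date(rounded));
    }

    public static void main(String[] args) {
        // fq=timestamp:[<rounded last-run time> TO NOW/MINUTE]
        String since = solrDateRoundedToMinute(1279497600000L);
        System.out.println("fq=timestamp:[" + since + " TO NOW/MINUTE]");
    }
}
```

With both endpoints rounded, two runs of the same query within the same minute produce byte-identical fq strings, which lets the filter cache do its job.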