date:20100519

Personalized Search

2010-05-19 Thread Rih

Has anybody done personalized search with Solr? I'm thinking of including
fields such as "bought" or "like" per member/visitor via dynamic fields to a
product search schema. Another option is to have a multi-value field that
can contain user IDs. What are the possible performance issues with this
setup?

Looking forward to your ideas.

Rih

Subclassing DIH

2010-05-19 Thread Blargy


I am trying to subclass DIH, and I am having a hard time getting
access to the current Solr Context. How is this possible?

Is there any way to get access to the current DataSource, DataImporter, etc.?

On a related note: when working with an onImportEnd or onImportStart
listener, how can I get a reference to the current Request/Response that
initiated the import?

From the DIH subclass I can access the request/response but not the Context.
From the event listener I can access the Context but not the
request/response.
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Subclassing-DIH-tp830954p830954.html
Sent from the Solr - User mailing list archive at Nabble.com.

caching on unique queries

2010-05-19 Thread Kevin Osborn

Pretty much every one of my queries is going to be unique. However, the query 
is fairly complex and also contains both unique and non-unique data. In the 
query, some fields will be unique (e.g. description), but other fields will be 
fairly common (e.g. category). If we could use those common fields as filters, 
it would be easy to use the filter cache. I could just separate the filters and 
let the filter cache do its thing. Unfortunately, due to the nature of our 
application, pretty much every field is just a boost.

So, right now, I am getting absolutely no use out of the cache. The only cache 
that might be useful is the Document Cache. Even then I am not sure.

Is there any way to cache part of the query? Or basically cache subqueries? I 
have my own request handler, so I am willing to write the necessary code. I am 
fearful that the best performance may be to just turn off caching.

Re: Moving from Lucene to Solr?

2010-05-19 Thread Peter Karich

Sorry. Wasn't intended as a hijacking :-(


: Subject: Moving from Lucene to Solr?
: References: 
: In-Reply-To: 

http://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists

When starting a new discussion on a mailing list, please do not reply to 
an existing message, instead start a fresh email.  Even if you change the 
subject line of your email, other mail headers still track which thread 
you replied to and your question is "hidden" in that thread and gets less 
attention.   It makes following discussions in the mailing list archives 
particularly difficult.
See Also:  http://en.wikipedia.org/wiki/User:DonDiego/Thread_hijacking



-Hoss

Query Timings increase after system is idle

2010-05-19 Thread ST ST

Folks,

We have a problem in our environment: after the system has been idle for
9 hours, the query time goes up from a few hundred ms to 4+ seconds.

System Details:
 - Solr 1.4
 - 10 Million Index.
 - Use MMAP for mapping the index files in memory

Test Details:
-  8 hour performance run with ingestion (@ 8 docs/sec) , query rate - 3
Queries per sec.
-  Commit is per hour.

Issue:
- After 9 hours of idle time (ie no queries, no ingestion ) every query
takes 4+ seconds, subsequent queries are fast.

I have a few specific questions:
A. Does Lucene/Solr have internal caches which may be flushed out of memory
when the system is idle?
B. What operations are done on a per-term basis (for example, building doc
lists) for first-time queries?
C. Any pointers to what else may be the issue here?

Really appreciate any help you can provide.

ST

Re: Stemming Filters in wiki

2010-05-19 Thread Chris Hostetter

: 
: These entries were moved here: http://wiki.apache.org/solr/LanguageAnalysis

but there doesn't seem to be a link to that page from 
AnalyzersTokenizersTokenFilters (or from anywhere on the wiki according to 
the wiki link search feature) ... so i'll add some verbiage about it.

: 
: On Wed, May 19, 2010 at 2:49 PM, Asif Rahman  wrote:
: > I see that the entries for PorterStemFilterFactory,
: > EnglishPorterFilterFactory, and SnowballPorterFilterFactory have been
: > removed from the Analyzers, Tokenizers, and Token Filters wiki page.  Is
: > there a reason for this?
: >
: > Thanks,
: >
: > asif
: >
: >
: > --
: > Asif Rahman
: > Lead Engineer - NewsCred
: > a...@newscred.com
: > http://platform.newscred.com
: >
: 
: 
: 
: -- 
: Robert Muir
: rcm...@gmail.com
: 



-Hoss

Re: Moving from Lucene to Solr?

2010-05-19 Thread Chris Hostetter


: Subject: Moving from Lucene to Solr?
: References: 
: In-Reply-To: 

http://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists

When starting a new discussion on a mailing list, please do not reply to 
an existing message, instead start a fresh email.  Even if you change the 
subject line of your email, other mail headers still track which thread 
you replied to and your question is "hidden" in that thread and gets less 
attention.   It makes following discussions in the mailing list archives 
particularly difficult.
See Also:  http://en.wikipedia.org/wiki/User:DonDiego/Thread_hijacking
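
To illustrate the point about headers, here is a minimal Python sketch using
the standard library's email package. The message IDs and the new subject are
made up for illustration, and reply_with_new_subject is a hypothetical helper
that only mimics what a mail client does on "reply".

```python
from email.message import EmailMessage


def reply_with_new_subject(original, new_subject):
    """Mimic hitting "reply" and then editing only the subject line."""
    reply = EmailMessage()
    reply["Subject"] = new_subject
    reply["Message-ID"] = "<new-question@example.org>"  # made-up ID
    # On reply, clients copy the parent's Message-ID into these headers,
    # and archives use them (not the subject) to build the thread tree.
    reply["In-Reply-To"] = original["Message-ID"]
    reply["References"] = original["Message-ID"]
    return reply


original = EmailMessage()
original["Subject"] = "Moving from Lucene to Solr?"
original["Message-ID"] = "<existing-thread@example.org>"  # made-up ID

hijack = reply_with_new_subject(original, "A new question")

fresh = EmailMessage()
fresh["Subject"] = "A new question"

print(hijack["Subject"], "->", hijack["In-Reply-To"])
print(fresh["Subject"], "->", fresh["In-Reply-To"])
```

The reply still points at the old thread even though its subject changed,
while the fresh email carries no such header and starts its own thread.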



-Hoss

Re: Stemming Filters in wiki

2010-05-19 Thread Robert Muir

Hi Asif,

These entries were moved here: http://wiki.apache.org/solr/LanguageAnalysis

On Wed, May 19, 2010 at 2:49 PM, Asif Rahman  wrote:
> I see that the entries for PorterStemFilterFactory,
> EnglishPorterFilterFactory, and SnowballPorterFilterFactory have been
> removed from the Analyzers, Tokenizers, and Token Filters wiki page.  Is
> there a reason for this?
>
> Thanks,
>
> asif
>
>
> --
> Asif Rahman
> Lead Engineer - NewsCred
> a...@newscred.com
> http://platform.newscred.com
>



-- 
Robert Muir
rcm...@gmail.com

Re: Embedded Server, Caching, Stats page updates

2010-05-19 Thread Chris Hostetter


: "Switched" works for the specific setup i'm using - the server would refer
: to itself in the CommonsHttpSolrServer request sent, i.e. it would run both
: the server and client sides. Removing this and simply using
: EmbeddedSolrServer just made the setup a little more sane in that aspect.
: Does that make more sense now?

not really ... what *exactly* did you change about your setup and 
your client code?  please be specific -- how did you run solr
before, when you were using CommonsHttpSolrServer? what are *all* of the 
steps you took when you switched to EmbeddedSolrServer (specifically: what 
did the changes to your java client code look like, and what did you 
change about how you "run" solr)

Because if you still have the solr.war running in your servlet container, 
and all you did is edit your java code to use EmbeddedSolrServer (pointing 
at the same directory on disk) instead of CommonsHttpSolrServer, then you 
are now running *two* instances of Solr in your VM, both reading from the 
same indexes.


-Hoss

Stemming Filters in wiki

2010-05-19 Thread Asif Rahman

I see that the entries for PorterStemFilterFactory,
EnglishPorterFilterFactory, and SnowballPorterFilterFactory have been
removed from the Analyzers, Tokenizers, and Token Filters wiki page.  Is
there a reason for this?

Thanks,

asif


-- 
Asif Rahman
Lead Engineer - NewsCred
a...@newscred.com
http://platform.newscred.com

RE: disable caches in real time

2010-05-19 Thread Nagelberg, Kallin

I suppose you are still losing some performance on the replicated box since it 
needs to use some resources to warm the cache. It would be nice if a warmed 
cache could be replicated from the master though perhaps that's not practical. 
Chris is right though: The newly updated index created by a commit is not seen 
by users until it has been warmed, at which point it is atomically swapped.

-Kallin Nagelberg



-Original Message-
From: Chris Hostetter [mailto:hossman_luc...@fucit.org] 
Sent: Wednesday, May 19, 2010 2:38 PM
To: solr-user@lucene.apache.org
Subject: Re: disable caches in real time


: I've always understood that if you do a commit (replication does it), a new
: searcher is opened, and you lose performance (queries per second) while the
: caches are regenerated. I think I didn't explain my situation correctly

not if you configure your caches with autowarming -- then solr will warm 
up the new caches (on the new index) while the old index still serves 
requests -- this is all managed for you by the SolrCore, no need for core 
swapping.


-Hoss

Re: disable caches in real time

2010-05-19 Thread Chris Hostetter


: I've always understood that if you do a commit (replication does it), a new
: searcher is opened, and you lose performance (queries per second) while the
: caches are regenerated. I think I didn't explain my situation correctly

not if you configure your caches with autowarming -- then solr will warm 
up the new caches (on the new index) while the old index still serves 
requests -- this is all managed for you by the SolrCore, no need for core 
swapping.
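
The warm-then-swap idea can be pictured with a toy Python model. This is not
Solr's code: Searcher and Core below are hypothetical stand-ins, and
autowarm_count only loosely mirrors the autowarmCount cache setting. The
sketch assumes that warming means replaying the most recent cache keys
against the new searcher before it is put in service.

```python
class Searcher:
    """Hypothetical stand-in for a searcher bound to one index version."""

    def __init__(self, index_version):
        self.index_version = index_version
        self.cache = {}  # query -> cached result, kept in insertion order

    def search(self, query):
        if query not in self.cache:
            # Pretend this is the expensive part of answering a query.
            self.cache[query] = f"{query} @ index v{self.index_version}"
        return self.cache[query]


class Core:
    """Hypothetical stand-in for the component that owns the live searcher."""

    def __init__(self):
        self.current = Searcher(index_version=1)

    def search(self, query):
        return self.current.search(query)

    def commit(self, autowarm_count):
        warming = Searcher(self.current.index_version + 1)
        # Replay the most recently cached queries against the new searcher
        # while self.current is still the one answering requests.
        keys = list(self.current.cache)
        recent = keys[-autowarm_count:] if autowarm_count else []
        for query in recent:
            warming.search(query)
        # Only now does the warmed searcher replace the old one.
        self.current = warming


core = Core()
for q in ["a", "b", "c"]:
    core.search(q)
core.commit(autowarm_count=2)
print(list(core.current.cache))
```

In this model the old searcher keeps serving until the single assignment at
the end of commit, so callers never see a cold cache for the replayed queries.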


-Hoss

Re: Custom sorting

2010-05-19 Thread Daniel Cassiano

Hi Dan,

It seems that you want a SearchComponent[1], something like the
QueryElevationComponent[2].
Take a look at how it works, and I think you can build your custom solution.

[1]-
http://lucene.apache.org/solr/api/org/apache/solr/handler/component/SearchComponent.html
[2]- http://wiki.apache.org/solr/QueryElevationComponent


Cheers,

-- Daniel Cassiano

http://dcassiano.wordpress.com


On Wed, May 19, 2010 at 6:46 AM, dan sutton  wrote:

> Hi,
>
> I have a requirement to do the following:
>
> For up to the first 10 results (i.e. only on the first page) show
> sponsored category ads, in order of bid, but no more than 2 / category,
> and only if all sponsored cat' ads are more than min% of the highest
> score. e.g. If I had the following:
>
> min% =1
>
>
> doc  score  bid  cat_id  sponsored
>  1    100    x     x        0
>  2     55    x     x        0
>
>  3     50    2     2        1
>  4     20    2     2        1
>  5     05    2     2        1
>
>  6     80    1     1        1
>  7     70    1     1        1
>  8     60    1     1        1
>  8601   1 1
>
> x = dont care
>
> sorted order would be:
>
> 3
> 4
>
> 6
> 7
>
> 1
> 8
> 2
> 5
>
> I'm not sure if this can be implemented with a custom comparator as I
> need access to the final score to enforce min%, I'm thinking I'm
> probably going to have to implement a subclass of QParserPlugin with a
> custom sort. but was wondering if there were alternatives ?
>
> Many thanks in advance.
> Dan
>
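
As a starting point for such a component, the ordering rule in the quoted
requirement can be sketched in plain Python over an already scored result
list. This is only one reading of the rule, not a Solr API: order_first_page
is a hypothetical helper, the table is read as doc/score/bid/cat_id/sponsored,
and "more than min% of the highest score" is assumed to apply per ad (it
could also be meant as all-or-nothing per category).

```python
from collections import defaultdict


def order_first_page(docs, min_pct, per_category=2, page_size=10):
    """Return doc ids: qualifying sponsored ads first, then the rest by score."""
    if not docs:
        return []
    threshold = max(d["score"] for d in docs) * min_pct / 100.0

    # Group sponsored ads that clear the min% threshold by category.
    by_cat = defaultdict(list)
    for d in docs:
        if d["sponsored"] and d["score"] > threshold:
            by_cat[d["cat_id"]].append(d)

    # Categories in order of bid (highest first); within a category the best
    # scores first, keeping at most per_category ads.
    promoted = []
    for ads in sorted(by_cat.values(), key=lambda ads: -max(a["bid"] for a in ads)):
        ads.sort(key=lambda a: -a["score"])
        promoted.extend(ads[:per_category])
    promoted = promoted[:page_size]

    promoted_ids = {d["id"] for d in promoted}
    rest = sorted((d for d in docs if d["id"] not in promoted_ids),
                  key=lambda d: -d["score"])
    return [d["id"] for d in promoted + rest]


# Rows from the example table; None stands for "x = dont care".
rows = [
    (1, 100, None, None, 0), (2, 55, None, None, 0),
    (3, 50, 2, 2, 1), (4, 20, 2, 2, 1), (5, 5, 2, 2, 1),
    (6, 80, 1, 1, 1), (7, 70, 1, 1, 1), (8, 60, 1, 1, 1),
]
docs = [dict(zip(("id", "score", "bid", "cat_id", "sponsored"), r)) for r in rows]
result = order_first_page(docs, min_pct=1)
print(result)
```

With min% = 1 this yields 3, 4, 6, 7, 1, 8, 2, 5, the same order as in the
question. Inside Solr the same logic would need the final scores, which is
why a component that runs after scoring seems a better fit than a comparator.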

The Seven Deadly Sins of Solr spanish translation

2010-05-19 Thread Juan Pedro Danculovic

Hello, I translated this article into Spanish. It is very helpful for avoiding
common mistakes in Solr installations.

http://www.linebee.com/?p=434&lang=es

Thanks,

Juan

Re: index merge

2010-05-19 Thread Ahmet Arslan

> I am running Solr on a 64-bit HP-UX system. The total index size is about
> 5GB, and when I try to load any new document, Solr tries to merge the
> existing segments first, which results in the following error. I could see
> a temp file growing within the index dir to around 2GB in size, and later
> it fails with this exception. It looks like the exception occurs on
> reaching Integer.MAX_VALUE.

ramBufferSizeMB is 32. Isn't 32MB too small?

Re: index merge

2010-05-19 Thread uma m


Hi All,

  I am running Solr on a 64-bit HP-UX system. The total index size is about
5GB, and when I try to load any new document, Solr tries to merge the existing
segments first, which results in the following error. I could see a temp file
growing within the index dir to around 2GB in size, and later it fails with this
exception. It looks like, by reaching Integer.MAX_VALUE, the exception
occurs.
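
For reference, the arithmetic behind that guess can be checked quickly. The
Python sketch below only shows that Java's Integer.MAX_VALUE, read as a byte
count, is just under 2 GiB, which matches the size at which the temp file
stops growing. Whether the limit comes from the JVM, a per-process file size
limit, or the filesystem is not established here; errno 27 is "File too
large" on many Unix systems, which would point at the OS side, but that is an
assumption.

```python
INT_MAX = 2**31 - 1  # value of Java's Integer.MAX_VALUE


def to_gib(n_bytes):
    """Convert a byte count to GiB (2**30 bytes)."""
    return n_bytes / 2**30


def fits(size_bytes, limit_bytes=INT_MAX):
    """True if a file of size_bytes stays within the assumed limit."""
    return size_bytes <= limit_bytes


print(f"{INT_MAX} bytes = {to_gib(INT_MAX):.2f} GiB")
print("5 GB merged segment fits:", fits(5 * 10**9))
```

A fully merged 5GB index cannot fit in a single file under such a limit, so
the merge would fail at about the 2GB mark, as described.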

Exception in thread "Lucene Merge Thread #0"
org.apache.lucene.index.MergePolicy$MergeException: java.io.IOException:
File too large (errno:27)
at
org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:351)
at
org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:315)
Caused by: java.io.IOException: File too large (errno:27)
at java.io.RandomAccessFile.writeBytes(Native Method)
at java.io.RandomAccessFile.write(RandomAccessFile.java:456)
at
org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexOutput.flushBuffer(SimpleFSDirectory.java:192)
at
org.apache.lucene.store.BufferedIndexOutput.flushBuffer(BufferedIndexOutput.java:96)
at
org.apache.lucene.store.BufferedIndexOutput.flush(BufferedIndexOutput.java:85)
at
org.apache.lucene.store.BufferedIndexOutput.close(BufferedIndexOutput.java:109)
at
org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexOutput.close(SimpleFSDirectory.java:199)
at org.apache.lucene.index.FieldsWriter.close(FieldsWriter.java:144)
at
org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:357)
at
org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:153)
at
org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:5029)
at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4614)
at
org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:235)
at
org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:291)

---

The solrconfig.xml contains default values for , 
sections as below.

false
10
32
1
1000
1

false
32
10


Could anyone help me to resolve this exception?

Regards,
Uma
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/index-merge-tp472904p829810.html
Sent from the Solr - User mailing list archive at Nabble.com.

Solr Delta Queries

2010-05-19 Thread Vladimir Sutskever

I have an "indexed_timestamp" field in my index, which lets me know when a 
document was indexed:




For some reason when doing delta indexing via DIH, this field is not being 
updated.

Are timestamp fields updated during DELTA updates?



Kind regards,

Vladimir Sutskever
Investment Bank - Technology
JPMorgan Chase, Inc.




Re: defaultSearchField

2010-05-19 Thread Antonello Mangone

thank you all ;)

2010/5/19 Jan Kammer 

> There is something called dismax-requesthandler. I think this is what you
> are looking for.
>
> greetz, Jan
>
>
> Am 19.05.2010 15:47, schrieb Antonello Mangone:
>
>  Hi to everyone, I'd like to know if it's possible to use the *
>> defaultSearchField* on more fields ???
>>
>> i.e.
>>
>>   field1, field2, field3
>>
>>
>> Thanks you all
>>
>>
>>
>
>
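
For the archive, a minimal sketch of what a dismax request searching several
fields could look like, built with Python's standard library. It assumes a
Solr version where defType=dismax and the qf parameter are available; the
field names come from the example in the question, the query term is made
up, and dismax_params is a hypothetical helper.

```python
from urllib.parse import urlencode


def dismax_params(user_query, fields):
    """Build request parameters that search user_query across several fields."""
    return {
        "q": user_query,
        "defType": "dismax",
        # qf takes a space-separated list of fields to query
        "qf": " ".join(fields),
    }


params = dismax_params("ipod", ["field1", "field2", "field3"])
query_string = urlencode(params)
print(query_string)  # q=ipod&defType=dismax&qf=field1+field2+field3
```

The resulting string would be appended to the select URL of the Solr
instance; the qf value can also be set as a default in the request handler
configuration instead of being sent on every request.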

RE: Challenge: Searching for variant products and get basic products in result set

2010-05-19 Thread hkmortensen


Yes, I think that will make a good solution. In Danish "sku" is a bad word
;-), but thanks for the info.


Nagelberg, Kallin wrote:
> 
> Sorry, in North America 'sku' (stock keeping unit) is the common term in
> business to specifically identify a particular product,
> http://lmgtfy.com/?q=sku. 
> 
> And yes, I think you understand me. I am imagining you can structure your
> products in a hierarchy. For each node in the tree you traverse all
> children, collecting their attributes into the current node.
> 
> -Kallin Nagelberg
> 
> -Original Message-
> From: hkmortensen [mailto:ko...@yahoo.com] 
> Sent: Wednesday, May 19, 2010 11:39 AM
> To: solr-user@lucene.apache.org
> Subject: RE: Challenge: Searching for variant products and get basic
> products in result set
> 
> 
> sorry, what does "sku" mean?
> 
> I understand you like this: indexing base and variants, and include all
> atributes (for one base and its variants) in each document. I think that
> would work. Thanks.
> 
> 
> Nagelberg, Kallin wrote:
>> 
>> I agree that pulling all attributes into the parent sku during indexing
>> could work well. Define a Boolean field like 'isVirtual' to identify the
>> non-leaf skus, and use a multi-valued field for each of the attributes.
>> For now you can do a search like (isVirtual:true AND doorType:screen). If
>> at a later date you want the actual variants just search for
>> isVirtual:false.
>> 
>> Does that work?
>> 
>> -Kallin Nagelberg
>> 
>> -Original Message-
>> From: Leonardo Menezes [mailto:leonardo.menez...@googlemail.com] 
>> Sent: Wednesday, May 19, 2010 11:13 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Challenge: Searching for variant products and get basic
>> products in result set
>> 
>> if that is so, and maybe, you have for example, two variants of cars with
>> automatic, what would define on which one was the hit? or field dont
>> share
>> common information across variants? if they do share, you wouldnt be able
>> to
>> define in which one was the hit(because it was on both of them) and would
>> either have to pick one randomly, or retrieve both. if they dont share
>> that
>> info, you would have that covered, since only one would match any given
>> query.
>> 
>> On Wed, May 19, 2010 at 5:04 PM, hkmortensen  wrote:
>> 
>>>
>>> thanks. Currently not, but requirements change all the time as always
>>> ;-)
>>> If we get a requirement, that a facet shall be "material of doors", we
>>> will
>>> need to know which variant was the hit. I would like to be prepared for
>>> that.
>>>
>>>
>>>
>>>
>>> Leonardo Menezes wrote:
>>> >
>>> > would you then need to know in which variant was your match produced?
>>> > because if not, you can just index the whole thing as one single
>>> > document...
>>> >
>>> > On Wed, May 19, 2010 at 4:23 PM, hkmortensen  wrote:
>>> >
>>> >>
>>> >> I do searching for products. Each base product exist in variants as
>>> well.
>>> >> One
>>> >> variant has a glass door, another a steel door etc. The variants can
>>> have
>>> >> diffent prices. The base product does not really exist, only the
>>> variants
>>> >> exists IRL. The case corresponds to cars: the car model is the base
>>> >> product,
>>> >> with color variants  or with automatic/manual etc.
>>> >>
>>> >> I want to search for variants, but I only want to have base products
>>> in
>>> >> the
>>> >> result. Ie when one or more variants from the same base product are
>>> >> found,
>>> >> only the base product shall be in the search result.
>>> >>
>>> >> Does somebody have an idea how this could be done?
>>> >>
>>> >> Best regards
>>> >>
>>> >> Henning
>>> >> --
>>> >> View this message in context:
>>> >>
>>> http://lucene.472066.n3.nabble.com/Challenge-Searching-for-variant-products-and-get-basic-products-in-result-set-tp829218p829218.html
>>> >> Sent from the Solr - User mailing list archive at Nabble.com.
>>> >>
>>> >
>>> >
>>>
>>> --
>>> View this message in context:
>>> http://lucene.472066.n3.nabble.com/Challenge-Searching-for-variant-products-and-get-basic-products-in-result-set-tp829218p829319.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>
>> 
>> 
> 
> -- 
> View this message in context:
> http://lucene.472066.n3.nabble.com/Challenge-Searching-for-variant-products-and-get-basic-products-in-result-set-tp829218p829435.html
> Sent from the Solr - User mailing list archive at Nabble.com.
> 
> 

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Challenge-Searching-for-variant-products-and-get-basic-products-in-result-set-tp829218p829530.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: DIH. behavior after a import. Log, delete table !?

2010-05-19 Thread Ahmet Arslan

> I created a JAR file. This JAR file deletes my table.
> 
> But Solr absolutely does not want to start this JAR. I put a run.bat file
> into the folder where my JAR is saved. This batch file runs and deletes
> the table, but when Solr starts this batch file, it doesn't work. I don't
> know why. I tested the batch file in different ways and it should
> work... help ^^
> 
> Windows XP for testing ;-)

I don't know why, but it seems that we need to set dir to something other
than '.'. Anyway, I got it working on Windows in two ways:

1-)

  
  java 
  solr/bin 
   -jar junk.jar 
  true 
  


2-) Giving full paths:


  
  C:\test.bat 
  C:\ 
  true 
 


It should work this time on Windows.

RE: Challenge: Searching for variant products and get basic products in result set

2010-05-19 Thread Nagelberg, Kallin

Sorry, in North America 'sku' (stock keeping unit) is the common term in 
business to specifically identify a particular product, 
http://lmgtfy.com/?q=sku. 

And yes, I think you understand me. I am imagining you can structure your 
products in a hierarchy. For each node in the tree you traverse all children, 
collecting their attributes into the current node.
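
A minimal Python sketch of that traversal, done before the documents are
sent to Solr. The catalog structure, the ids, and the build_docs helper are
made up for illustration; only isVirtual and doorType come from this thread.
Each node gets one document whose attribute fields are multi-valued and hold
the union of its own and its descendants' values.

```python
def build_docs(node):
    """Return index documents for node and all of its descendants.

    The first document in the returned list is always node itself.
    """
    merged = {name: {value} for name, value in node.get("attributes", {}).items()}
    docs = []
    for child in node.get("children", []):
        child_docs = build_docs(child)
        docs.extend(child_docs)
        # child_docs[0] already holds everything collected below that child.
        for name, values in child_docs[0]["attributes"].items():
            merged.setdefault(name, set()).update(values)
    doc = {
        "id": node["id"],
        "isVirtual": bool(node.get("children")),  # non-leaf skus are virtual
        "attributes": {name: sorted(values) for name, values in merged.items()},
    }
    return [doc] + docs


catalog = {
    "id": "cabinet",
    "children": [
        {"id": "cabinet-glass", "attributes": {"doorType": "glass"}},
        {"id": "cabinet-steel", "attributes": {"doorType": "steel"}},
    ],
}
docs = build_docs(catalog)
base = docs[0]
print(base["id"], base["isVirtual"], base["attributes"])
```

A search restricted to isVirtual:true then matches the base product through
any of its variants' values, while isVirtual:false returns the variants.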

-Kallin Nagelberg

-Original Message-
From: hkmortensen [mailto:ko...@yahoo.com] 
Sent: Wednesday, May 19, 2010 11:39 AM
To: solr-user@lucene.apache.org
Subject: RE: Challenge: Searching for variant products and get basic products 
in result set


sorry, what does "sku" mean?

I understand you like this: indexing base and variants, and include all
atributes (for one base and its variants) in each document. I think that
would work. Thanks.


Nagelberg, Kallin wrote:
> 
> I agree that pulling all attributes into the parent sku during indexing
> could work well. Define a Boolean field like 'isVirtual' to identify the
> non-leaf skus, and use a multi-valued field for each of the attributes.
> For now you can do a search like (isVirtual:true AND doorType:screen). If
> at a later date you want the actual variants just search for
> isVirtual:false.
> 
> Does that work?
> 
> -Kallin Nagelberg
> 
> -Original Message-
> From: Leonardo Menezes [mailto:leonardo.menez...@googlemail.com] 
> Sent: Wednesday, May 19, 2010 11:13 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Challenge: Searching for variant products and get basic
> products in result set
> 
> if that is so, and maybe, you have for example, two variants of cars with
> automatic, what would define on which one was the hit? or field dont share
> common information across variants? if they do share, you wouldnt be able
> to
> define in which one was the hit(because it was on both of them) and would
> either have to pick one randomly, or retrieve both. if they dont share
> that
> info, you would have that covered, since only one would match any given
> query.
> 
> On Wed, May 19, 2010 at 5:04 PM, hkmortensen  wrote:
> 
>>
>> thanks. Currently not, but requirements change all the time as always ;-)
>> If we get a requirement, that a facet shall be "material of doors", we
>> will
>> need to know which variant was the hit. I would like to be prepared for
>> that.
>>
>>
>>
>>
>> Leonardo Menezes wrote:
>> >
>> > would you then need to know in which variant was your match produced?
>> > because if not, you can just index the whole thing as one single
>> > document...
>> >
>> > On Wed, May 19, 2010 at 4:23 PM, hkmortensen  wrote:
>> >
>> >>
>> >> I do searching for products. Each base product exist in variants as
>> well.
>> >> One
>> >> variant has a glass door, another a steel door etc. The variants can
>> have
>> >> diffent prices. The base product does not really exist, only the
>> variants
>> >> exists IRL. The case corresponds to cars: the car model is the base
>> >> product,
>> >> with color variants  or with automatic/manual etc.
>> >>
>> >> I want to search for variants, but I only want to have base products
>> in
>> >> the
>> >> result. Ie when one or more variants from the same base product are
>> >> found,
>> >> only the base product shall be in the search result.
>> >>
>> >> Does somebody have an idea how this could be done?
>> >>
>> >> Best regards
>> >>
>> >> Henning
>> >> --
>> >> View this message in context:
>> >>
>> http://lucene.472066.n3.nabble.com/Challenge-Searching-for-variant-products-and-get-basic-products-in-result-set-tp829218p829218.html
>> >> Sent from the Solr - User mailing list archive at Nabble.com.
>> >>
>> >
>> >
>>
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/Challenge-Searching-for-variant-products-and-get-basic-products-in-result-set-tp829218p829319.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
> 
> 

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Challenge-Searching-for-variant-products-and-get-basic-products-in-result-set-tp829218p829435.html
Sent from the Solr - User mailing list archive at Nabble.com.

RE: Challenge: Searching for variant products and get basic products in result set

2010-05-19 Thread hkmortensen


sorry, what does "sku" mean?

I understand you like this: indexing the base and its variants, and including
all attributes (for one base and its variants) in each document. I think that
would work. Thanks.


Nagelberg, Kallin wrote:
> 
> I agree that pulling all attributes into the parent sku during indexing
> could work well. Define a Boolean field like 'isVirtual' to identify the
> non-leaf skus, and use a multi-valued field for each of the attributes.
> For now you can do a search like (isVirtual:true AND doorType:screen). If
> at a later date you want the actual variants just search for
> isVirtual:false.
> 
> Does that work?
> 
> -Kallin Nagelberg
> 
> -Original Message-
> From: Leonardo Menezes [mailto:leonardo.menez...@googlemail.com] 
> Sent: Wednesday, May 19, 2010 11:13 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Challenge: Searching for variant products and get basic
> products in result set
> 
> if that is so, and maybe, you have for example, two variants of cars with
> automatic, what would define on which one was the hit? or field dont share
> common information across variants? if they do share, you wouldnt be able
> to
> define in which one was the hit(because it was on both of them) and would
> either have to pick one randomly, or retrieve both. if they dont share
> that
> info, you would have that covered, since only one would match any given
> query.
> 
> On Wed, May 19, 2010 at 5:04 PM, hkmortensen  wrote:
> 
>>
>> thanks. Currently not, but requirements change all the time as always ;-)
>> If we get a requirement, that a facet shall be "material of doors", we
>> will
>> need to know which variant was the hit. I would like to be prepared for
>> that.
>>
>>
>>
>>
>> Leonardo Menezes wrote:
>> >
>> > would you then need to know in which variant was your match produced?
>> > because if not, you can just index the whole thing as one single
>> > document...
>> >
>> > On Wed, May 19, 2010 at 4:23 PM, hkmortensen  wrote:
>> >
>> >>
>> >> I do searching for products. Each base product exist in variants as
>> well.
>> >> One
>> >> variant has a glass door, another a steel door etc. The variants can
>> have
>> >> diffent prices. The base product does not really exist, only the
>> variants
>> >> exists IRL. The case corresponds to cars: the car model is the base
>> >> product,
>> >> with color variants  or with automatic/manual etc.
>> >>
>> >> I want to search for variants, but I only want to have base products
>> in
>> >> the
>> >> result. Ie when one or more variants from the same base product are
>> >> found,
>> >> only the base product shall be in the search result.
>> >>
>> >> Does somebody have an idea how this could be done?
>> >>
>> >> Best regards
>> >>
>> >> Henning
>> >> --
>> >> View this message in context:
>> >>
>> http://lucene.472066.n3.nabble.com/Challenge-Searching-for-variant-products-and-get-basic-products-in-result-set-tp829218p829218.html
>> >> Sent from the Solr - User mailing list archive at Nabble.com.
>> >>
>> >
>> >
>>
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/Challenge-Searching-for-variant-products-and-get-basic-products-in-result-set-tp829218p829319.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
> 
> 

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Challenge-Searching-for-variant-products-and-get-basic-products-in-result-set-tp829218p829435.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Challenge: Searching for variant products and get basic products in result set

2010-05-19 Thread hkmortensen


You are right; in that case an arbitrary one would have to be chosen, or
probably both should be in the result set. Difficult to say what the
marketing department would like ;-)



Leonardo Menezes wrote:
> 
> If that is so, and you have, for example, two variants of a car with
> automatic, what would define which one was the hit? Or do fields not share
> common information across variants? If they do share it, you wouldn't be
> able to define which one was the hit (because it was on both of them) and
> would either have to pick one randomly or retrieve both. If they don't
> share that info, you would have that covered, since only one would match
> any given query.
> 
> On Wed, May 19, 2010 at 5:04 PM, hkmortensen  wrote:
> 
>>
>> thanks. Currently not, but requirements change all the time as always ;-)
>> If we get a requirement, that a facet shall be "material of doors", we
>> will
>> need to know which variant was the hit. I would like to be prepared for
>> that.
>>
>>
>>
>>
>> Leonardo Menezes wrote:
>> >
>> > would you then need to know in which variant was your match produced?
>> > because if not, you can just index the whole thing as one single
>> > document...
>> >
>> > On Wed, May 19, 2010 at 4:23 PM, hkmortensen  wrote:
>> >
>> >>
>> >> I do searching for products. Each base product exist in variants as
>> well.
>> >> One
>> >> variant has a glass door, another a steel door etc. The variants can
>> have
>> >> diffent prices. The base product does not really exist, only the
>> variants
>> >> exists IRL. The case corresponds to cars: the car model is the base
>> >> product,
>> >> with color variants  or with automatic/manual etc.
>> >>
>> >> I want to search for variants, but I only want to have base products
>> in
>> >> the
>> >> result. Ie when one or more variants from the same base product are
>> >> found,
>> >> only the base product shall be in the search result.
>> >>
>> >> Does somebody have an idea how this could be done?
>> >>
>> >> Best regards
>> >>
>> >> Henning
>> >> --
>> >> View this message in context:
>> >>
>> http://lucene.472066.n3.nabble.com/Challenge-Searching-for-variant-products-and-get-basic-products-in-result-set-tp829218p829218.html
>> >> Sent from the Solr - User mailing list archive at Nabble.com.
>> >>
>> >
>> >
>>
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/Challenge-Searching-for-variant-products-and-get-basic-products-in-result-set-tp829218p829319.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
> 
> 

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Challenge-Searching-for-variant-products-and-get-basic-products-in-result-set-tp829218p829413.html
Sent from the Solr - User mailing list archive at Nabble.com.

RE: Challenge: Searching for variant products and get basic products in result set

2010-05-19 Thread Nagelberg, Kallin

I agree that pulling all attributes into the parent sku during indexing could 
work well. Define a Boolean field like 'isVirtual' to identify the non-leaf 
skus, and use a multi-valued field for each of the attributes. For now you can 
do a search like (isVirtual:true AND doorType:screen). If at a later date you 
want the actual variants just search for isVirtual:false.

Does that work?

-Kallin Nagelberg

-Original Message-
From: Leonardo Menezes [mailto:leonardo.menez...@googlemail.com] 
Sent: Wednesday, May 19, 2010 11:13 AM
To: solr-user@lucene.apache.org
Subject: Re: Challenge: Searching for variant products and get basic products 
in result set

If that is so, and you have, for example, two variants of a car with
automatic, what would define which one was the hit? Or do fields not share
common information across variants? If they do share it, you wouldn't be able
to define which one was the hit (because it was on both of them) and would
either have to pick one randomly or retrieve both. If they don't share that
info, you would have that covered, since only one would match any given
query.

On Wed, May 19, 2010 at 5:04 PM, hkmortensen  wrote:

>
> thanks. Currently not, but requirements change all the time as always ;-)
> If we get a requirement, that a facet shall be "material of doors", we will
> need to know which variant was the hit. I would like to be prepared for
> that.
>
>
>
>
> Leonardo Menezes wrote:
> >
> > would you then need to know in which variant was your match produced?
> > because if not, you can just index the whole thing as one single
> > document...
> >
> > On Wed, May 19, 2010 at 4:23 PM, hkmortensen  wrote:
> >
> >>
> >> I do searching for products. Each base product exist in variants as
> well.
> >> One
> >> variant has a glass door, another a steel door etc. The variants can
> have
> >> diffent prices. The base product does not really exist, only the
> variants
> >> exists IRL. The case corresponds to cars: the car model is the base
> >> product,
> >> with color variants  or with automatic/manual etc.
> >>
> >> I want to search for variants, but I only want to have base products in
> >> the
> >> result. Ie when one or more variants from the same base product are
> >> found,
> >> only the base product shall be in the search result.
> >>
> >> Does somebody have an idea how this could be done?
> >>
> >> Best regards
> >>
> >> Henning
> >> --
> >> View this message in context:
> >>
> http://lucene.472066.n3.nabble.com/Challenge-Searching-for-variant-products-and-get-basic-products-in-result-set-tp829218p829218.html
> >> Sent from the Solr - User mailing list archive at Nabble.com.
> >>
> >
> >
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Challenge-Searching-for-variant-products-and-get-basic-products-in-result-set-tp829218p829319.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>

Re: Challenge: Searching for variant products and get basic products in result set

2010-05-19 Thread Leonardo Menezes

If that is so, and you have, for example, two variants of a car with
automatic transmission, what would define which one was the hit? Or do the
fields not share common information across variants? If they do share it, you
wouldn't be able to tell which variant produced the hit (because it matched
both of them), and you would either have to pick one randomly or retrieve
both. If they don't share that information, you have that covered, since only
one variant would match any given query.

On Wed, May 19, 2010 at 5:04 PM, hkmortensen  wrote:

>
> thanks. Currently not, but requirements change all the time as always ;-)
> If we get a requirement, that a facet shall be "material of doors", we will
> need to know which variant was the hit. I would like to be prepared for
> that.
>
>
>
>
> Leonardo Menezes wrote:
> >
> > would you then need to know in which variant was your match produced?
> > because if not, you can just index the whole thing as one single
> > document...
> >
> > On Wed, May 19, 2010 at 4:23 PM, hkmortensen  wrote:
> >
> >>
> >> I do searching for products. Each base product exist in variants as
> well.
> >> One
> >> variant has a glass door, another a steel door etc. The variants can
> have
> >> diffent prices. The base product does not really exist, only the
> variants
> >> exists IRL. The case corresponds to cars: the car model is the base
> >> product,
> >> with color variants  or with automatic/manual etc.
> >>
> >> I want to search for variants, but I only want to have base products in
> >> the
> >> result. Ie when one or more variants from the same base product are
> >> found,
> >> only the base product shall be in the search result.
> >>
> >> Does somebody have an idea how this could be done?
> >>
> >> Best regards
> >>
> >> Henning
> >> --
> >> View this message in context:
> >>
> http://lucene.472066.n3.nabble.com/Challenge-Searching-for-variant-products-and-get-basic-products-in-result-set-tp829218p829218.html
> >> Sent from the Solr - User mailing list archive at Nabble.com.
> >>
> >
> >
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Challenge-Searching-for-variant-products-and-get-basic-products-in-result-set-tp829218p829319.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>

Re: Challenge: Searching for variant products and get basic products in result set

2010-05-19 Thread hkmortensen


Thanks. Currently not, but requirements change all the time, as always ;-)
If we get a requirement that a facet shall be "material of doors", we will
need to know which variant was the hit. I would like to be prepared for
that.




Leonardo Menezes wrote:
> 
> would you then need to know in which variant was your match produced?
> because if not, you can just index the whole thing as one single
> document...
> 
> On Wed, May 19, 2010 at 4:23 PM, hkmortensen  wrote:
> 
>>
>> I do searching for products. Each base product exist in variants as well.
>> One
>> variant has a glass door, another a steel door etc. The variants can have
>> diffent prices. The base product does not really exist, only the variants
>> exists IRL. The case corresponds to cars: the car model is the base
>> product,
>> with color variants  or with automatic/manual etc.
>>
>> I want to search for variants, but I only want to have base products in
>> the
>> result. Ie when one or more variants from the same base product are
>> found,
>> only the base product shall be in the search result.
>>
>> Does somebody have an idea how this could be done?
>>
>> Best regards
>>
>> Henning
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/Challenge-Searching-for-variant-products-and-get-basic-products-in-result-set-tp829218p829218.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
> 
> 

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Challenge-Searching-for-variant-products-and-get-basic-products-in-result-set-tp829218p829319.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Embedded Server, Caching, Stats page updates

2010-05-19 Thread Antoniya Statelova

>
> The way you phrased that paragraph makes me think that one of us doesn't
> understand what exactly you did when you "switched" ...
>

"Switched" works for the specific setup i'm using - the server would refer
to itself in the CommonsHttpSolrServer request it sent, i.e. it would run both
the server and client sides. Removing this and simply using
EmbeddedSolrServer just made the setup a little more sane in that respect.
Does that make more sense now?


> Now for starters: if the remote server you were running solr on is more
> powerful then the local machine you are running your java application on,
> that alone could explain some performance differences (likewise for JVM
> settings).
>
The machine I'm running it on is exactly the same: the code change was
pushed and I measured performance before and after, under the same load
(since it's a testing machine I could regulate that). That's why I was so
surprised that removing the additional HTTP request didn't bring any
improvement.


> Most importantly: when running solr embedded in your application, there is
> no "stats.jsp" page for you to look at -- because solr is no longer
> running in a servlet container.  so if you are seeing stats on your
> solr server that say your caches aren't being hit, the reason is because
> the server isn't being hit at all.
>

This is nice to know, I didn't look into how the actual page was generated.
I expected something like this to be true. Thank you!


> When running an embedded solr server, the filterCache and queryResultCache
> will still be used.  the settings in the solrconfig.xml you specify when
> initializing the SolrCore will be honored.  you can see use JMX to monitor
> those cache hit rates (assuming you have JMX enabled for your application,
> and the appropriate setting is in your solrconfig.xml)
>

I'll look into using JMX, thanks for the suggestion.

Tony

Re: Challenge: Searching for variant products and get basic products in result set

2010-05-19 Thread Leonardo Menezes

Would you then need to know which variant produced your match?
Because if not, you can just index the whole thing as one single document...

On Wed, May 19, 2010 at 4:23 PM, hkmortensen  wrote:

>
> I do searching for products. Each base product exist in variants as well.
> One
> variant has a glass door, another a steel door etc. The variants can have
> diffent prices. The base product does not really exist, only the variants
> exists IRL. The case corresponds to cars: the car model is the base
> product,
> with color variants  or with automatic/manual etc.
>
> I want to search for variants, but I only want to have base products in the
> result. Ie when one or more variants from the same base product are found,
> only the base product shall be in the search result.
>
> Does somebody have an idea how this could be done?
>
> Best regards
>
> Henning
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Challenge-Searching-for-variant-products-and-get-basic-products-in-result-set-tp829218p829218.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>

Re: DIH. behavior after a import. Log, delete table !?

2010-05-19 Thread stockii


Hey, thanks.

I did everything you said.

I created a JAR file, and this JAR file deletes my table.

But Solr absolutely does not want to start this JAR. I put a run.bat file into
the folder where my JAR is saved. When I run this batch file myself, it deletes
the table, but when Solr starts the batch file, it doesn't work. I don't know
why. I tested the batch file in different ways and it should work... help ^^

Windows XP for testing ;-)
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/DIH-behavior-after-a-import-Log-delete-table-tp823232p829230.html
Sent from the Solr - User mailing list archive at Nabble.com.

Challenge: Searching for variant products and get basic products in result set

2010-05-19 Thread hkmortensen


I am searching for products. Each base product exists in variants as well. One
variant has a glass door, another a steel door etc. The variants can have
different prices. The base product does not really exist, only the variants
exist IRL. The case corresponds to cars: the car model is the base product,
with color variants or with automatic/manual etc.

I want to search for variants, but I only want to have base products in the
result. I.e. when one or more variants of the same base product are found,
only the base product shall be in the search result.

Does somebody have an idea how this could be done?

Best regards

Henning
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Challenge-Searching-for-variant-products-and-get-basic-products-in-result-set-tp829218p829218.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: defaultSearchField

2010-05-19 Thread Jan Kammer

There is something called dismax-requesthandler. I think this is what 
you are looking for.


greetz, Jan


On 19.05.2010 15:47, Antonello Mangone wrote:

Hi to everyone, I'd like to know if it's possible to use the *
defaultSearchField* on more fields ???

i.e.

  field1, field2, field3


Thanks you all

Re: defaultSearchField

2010-05-19 Thread Ahmet Arslan

> Hi to everyone, I'd like to know if
> it's possible to use the *
> defaultSearchField* on more fields ???
> 
> i.e.
> 
>  field1, field2, field3
> 
> 

No. But you can query multiple fields using dismax. 

qf=field1+field2+field3&defType=dismax

http://wiki.apache.org/solr/DisMaxRequestHandler
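
A small sketch of building such a request from client code, assuming the
field names field1..field3 from the question and a hypothetical user query
"ipod". The DisMax documentation lists the qf fields separated by whitespace,
which urlencode turns into "+" in the query string. The helper name is made up
for illustration.

```python
from urllib.parse import urlencode


def dismax_query_string(user_query, fields):
    """Return the URL query string for a dismax search over several fields."""
    params = {
        "q": user_query,
        "defType": "dismax",
        "qf": " ".join(fields),  # whitespace-separated list of query fields
    }
    return urlencode(params)


query_string = dismax_query_string("ipod", ["field1", "field2", "field3"])
print(query_string)
```

The resulting string can be appended to the select URL of the Solr instance
in use; per-field boosts such as field1^2.0 go into the same qf list.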

defaultSearchField

2010-05-19 Thread Antonello Mangone

Hi everyone, I'd like to know if it's possible to use the
*defaultSearchField* on more than one field?

i.e.

 field1, field2, field3 


Thank you all

Re: jmx issue with solr

2010-05-19 Thread Na_D


Thanks for the info, using the above properties solved the issue.
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/jmx-issue-with-solr-tp828478p829057.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Storing RandomSortField

2010-05-19 Thread Alexandre Rocco

Leonardo,

I was able to use the feature with a dynamic field, as pointed out in the
documentation.
I was just curious to take a peek at the values that are generated, even
when the field is not dynamic, so I tried to figure out a way to do so.
Maybe some output when the debug query is enabled would be useful, but it
seems it's not implemented yet.
I will try to take a look at the classes and see what I can do about it.

Thanks!

On Wed, May 19, 2010 at 5:34 AM, Leonardo Menezes <
leonardo.menez...@googlemail.com> wrote:

> Hey,
>   for random sorting, random values are generated in runtime using the seed
> you passed as one of the parameters to generate the value, among other
> things. this way, if the value you use as seed is the same in different
> request, the sorting order should be the same. you could also, for debbuing
> purposes, edit the random sort field class and put some traces in there, so
> it could print the id of the document and the value generated for example.
> but the values wont be stored on the idx.
>
> cheers
>
> On Wed, May 19, 2010 at 10:00 AM, Marco Martinez <
> mmarti...@paradigmatecnologico.com> wrote:
>
> > Hi Alexandre,
> >
> > I am not totally sure about this, but the random sort field its only used
> > to
> > do a random sort on your searchs, and you will to pass differents values
> to
> > have differents sorts, so this only applies in the searchs, so no value
> is
> > indexed. You will find more information here:
> >
> >
> http://lucene.apache.org/solr/api/org/apache/solr/schema/RandomSortField.html
> >
> > Marco Martínez Bautista
> > http://www.paradigmatecnologico.com
> > Avenida de Europa, 26. Ática 5. 3ª Planta
> > 28224 Pozuelo de Alarcón
> > Tel.: 91 352 59 42
> >
> >
> > 2010/5/18 Alexandre Rocco 
> >
> > > Hi guys,
> > >
> > > Is there any way to mak a RandomSortField be stored?
> > > I'm trying to do it for debugging purposes,
> > > My intention is to take a look at the values that are stored there to
> > > determine the sorting that is being applied to the results.
> > >
> > > I tried to make it a stored field as:
> > > 
> > >
> > > And also tried to create another text field, copying the result from
> the
> > > random field like this:
> > >  stored="true"/>
> > > 
> > >
> > > Neither of the approaches worked.
> > > Is there any restriction on this kind of field that prevents it from
> > being
> > > displayed in the results?
> > >
> > > Thanks,
> > > Alexandre
> > >
> >
>

Re: jmx issue with solr

2010-05-19 Thread Jean-Sebastien Vachon

Hi,

Try adding these options...

-Dcom.sun.management.jmxremote.ssl=false 
-Dcom.sun.management.jmxremote.authenticate=false


On 2010-05-19, at 3:44 AM, Na_D wrote:

> 
> Hi,
> 
> I am trying to start solr with the following command :
> 
> java -Dsolr.solr.home="./example-DIH/solr/" -Dcom.sun.management.jmxremote
> -Dcom.sun.management.jmxremote.port=3000
> 
> 
> On doing so an error is reported :
> 
> Error: Password file read access must be restricted: C:\Program
> Files\Java\jdk1.
> 6.0_18\jre\lib\management\jmxremote.password
> 
> 
> The jmxremote.password file is there in the lib\management folder and the
> same has been set to read-only.
> still the error persists.I am using Windows XP SP3 Version 2002, just
> mentioning the same if its of any help.
> Please do put in your suggestions.
> -- 
> View this message in context: 
> http://lucene.472066.n3.nabble.com/jmx-issue-with-solr-tp828478p828478.html
> Sent from the Solr - User mailing list archive at Nabble.com.

Re: Deduplication

2010-05-19 Thread Ahmet Arslan

> TermsComponent maybe? 
> 
> or faceting?
> q=*:*&facet=true&facet.field=signatureField&defType=lucene&rows=0&start=0
> 
> if you append &facet.mincount=1 to above url you can
> see your duplications
> 

After re-reading your message: sometimes you want to show duplicates, sometimes
you don't want them. I have never used FieldCollapsing myself, but I have heard
about it many times.

http://wiki.apache.org/solr/FieldCollapsing

Re: Deduplication

2010-05-19 Thread Ahmet Arslan


> Basically for some uses cases I would like to show
> duplicates for other I
> wanted them ignored.
> 
> If I have overwriteDupes=false and I just create the dedup
> hash how can I
> query for only unique hash values... ie something like a
> SQL group by. 

TermsComponent maybe? 

or faceting? 
q=*:*&facet=true&facet.field=signatureField&defType=lucene&rows=0&start=0

If you append &facet.mincount=1 to the above URL, you can see your duplicates.
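
A rough sketch of the faceting approach, assuming the signature is indexed in
a field called signatureField (the name used in the URL above) and that the
response is requested with wt=json, where facet counts come back by default as
a flat [value, count, value, count, ...] list. The helper names are
hypothetical. facet.mincount is the minimum count a value needs in order to be
listed, so a value of 2 would restrict the list to signatures that occur more
than once.

```python
from urllib.parse import urlencode


def duplicate_facet_params(signature_field="signatureField", min_count=2):
    """Query string that facets on the signature field and returns no rows."""
    return urlencode({
        "q": "*:*",
        "rows": 0,
        "facet": "true",
        "facet.field": signature_field,
        "facet.mincount": min_count,
        "wt": "json",
    })


def duplicated_values(flat_facet_list):
    """Pick the values with a count above 1 from a flat facet list."""
    values = flat_facet_list[0::2]
    counts = flat_facet_list[1::2]
    return [value for value, count in zip(values, counts) if count > 1]


params = duplicate_facet_params()
dupes = duplicated_values(["abc", 3, "def", 1, "ghi", 2])
```

This only tells you which signatures are duplicated; hiding the duplicates in
a normal result list is what field collapsing is aimed at.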

Re: Moving from Lucene to Solr?

2010-05-19 Thread findbestopensource

Hi Peter,

You need to use Lucene,

   - To have more control
   - When you cannot depend on any web server
   - To use term vectors, term docs etc.
   - To easily extend it with your own Analyzer

You need to use Solr,

   - To index and search docs easily by writing little code
   - Solr is a standalone app and it takes care of most of the stuff, like
   optimizing, warming up the reader etc.
   - Solr can be extended to multiple nodes
   - To use facets

If you are developing your client in Java and want to use Solr, then I would
advise using SolrJ, as it is easy and you don't need to care about the HTTP
stuff. I use Solr via SolrJ in my project www.findbestopensource.com

Regards
Aditya
www.findbestopensource.com



On Wed, May 19, 2010 at 4:08 PM, Peter Karich  wrote:

> Hi all,
>
> while asking a question on stackoverflow [1] some other questions appear:
> Is SolrJ a recommended way to access Solr or should I prefer the HTTP
> interface?
>
> How can I (j)unit-test Solr? (e.g. create+delete index via Java call)
>
> Is Lucene faster than Solr? ... do you have experiences, preferable with
> the same index?
>
> The background is an application which uses Lucene at the moment but I
> hardly need the facetting feature of Solr and I don't want to implement
> it in Lucene for myself.
>
> Regards,
> Peter.
>
> [1]
>
> http://stackoverflow.com/questions/2856427/situations-to-prefer-apache-lucene-over-solr
>
>

Moving from Lucene to Solr?

2010-05-19 Thread Peter Karich

Hi all,

while asking a question on Stack Overflow [1], some other questions came up:
Is SolrJ the recommended way to access Solr, or should I prefer the HTTP
interface?

How can I (j)unit-test Solr? (e.g. create+delete index via Java call)

Is Lucene faster than Solr? ... Do you have experience, preferably with
the same index?

The background is an application which uses Lucene at the moment, but I
badly need the faceting feature of Solr and I don't want to implement
it in Lucene myself.

Regards,
Peter.

[1]
http://stackoverflow.com/questions/2856427/situations-to-prefer-apache-lucene-over-solr

Re: TikaEntityProcessor on Solr 1.4?

2010-05-19 Thread Noble Paul നോബിള്‍ नोब्ळ्

I guess it should work, because TikaEntityProcessor does not use any
new 1.4 APIs

On Wed, May 19, 2010 at 1:17 AM, Sixten Otto  wrote:
> Sorry to repeat this question, but I realized that it probably
> belonged in its own thread:
>
> The TikaEntityProcessor class that enables DataImportHandler to
> process business documents was added after the release of Solr 1.4,
> along with some other changes (like the binary DataSources) to support
> it. Obviously, there hasn't been an official release of Solr since
> then. Has anyone tried back-porting those changes to Solr 1.4?
>
> (I do see that the question was asked last month, without any
> response: http://www.lucidimagination.com/search/document/5d2d25bc57c370e9)
>
> The patches for these issues don't seem all that complex or pervasive,
> but it's hard for me (as a Solr n00b) to tell whether this is really
> all that's involved:
> https://issues.apache.org/jira/browse/SOLR-1583
> https://issues.apache.org/jira/browse/SOLR-1358
>
> Sixten
>



-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com

Custom sorting

2010-05-19 Thread dan sutton

Hi,

I have a requirement to do the following:

For up to the first 10 results (i.e. only on the first page) show
sponsored category ads, in order of bid, but no more than 2 per category,
and only if all sponsored category ads are more than min% of the highest
score. E.g. if I had the following:

min% =1


doc  score  bid  cat_id  sponsored
  1    100    x       x          0
  2     55    x       x          0

  3     50    2       2          1
  4     20    2       2          1
  5      5    2       2          1

  6     80    1       1          1
  7     70    1       1          1
  8     60    1       1          1

x = don't care

sorted order would be:

3
4

6
7

1
8
2
5

I'm not sure if this can be implemented with a custom comparator, as I
need access to the final score to enforce min%. I'm thinking I'm
probably going to have to implement a subclass of QParserPlugin with a
custom sort, but was wondering if there are alternatives?

Many thanks in advance.
Dan
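
A minimal sketch of the ordering rule above, done outside Solr on an already
scored result list. It is not a QParserPlugin or comparator implementation.
It assumes one reading of the min% rule, namely that each sponsored ad must
individually score above min% of the highest score. The function name and the
dict keys are hypothetical; the sample data is the table from the message,
with None standing in for "don't care".

```python
from collections import defaultdict


def order_results(docs, min_pct, max_per_cat=2, max_sponsored=10):
    """Return doc ids: promoted sponsored ads first, then the rest by score."""
    top_score = max(d["score"] for d in docs)
    threshold = top_score * min_pct / 100.0

    # Sponsored ads above the threshold: highest bid first, then by score.
    candidates = sorted(
        (d for d in docs if d["sponsored"] and d["score"] > threshold),
        key=lambda d: (-d["bid"], -d["score"]),
    )

    promoted, per_cat = [], defaultdict(int)
    for d in candidates:
        if len(promoted) >= max_sponsored:
            break
        if per_cat[d["cat_id"]] < max_per_cat:  # at most 2 ads per category
            promoted.append(d)
            per_cat[d["cat_id"]] += 1

    promoted_ids = {d["doc"] for d in promoted}
    rest = sorted(
        (d for d in docs if d["doc"] not in promoted_ids),
        key=lambda d: -d["score"],
    )
    return [d["doc"] for d in promoted + rest]


ROWS = [  # (doc, score, bid, cat_id, sponsored)
    (1, 100, None, None, 0), (2, 55, None, None, 0),
    (3, 50, 2, 2, 1), (4, 20, 2, 2, 1), (5, 5, 2, 2, 1),
    (6, 80, 1, 1, 1), (7, 70, 1, 1, 1), (8, 60, 1, 1, 1),
]
KEYS = ("doc", "score", "bid", "cat_id", "sponsored")
SAMPLE = [dict(zip(KEYS, row)) for row in ROWS]

print(order_results(SAMPLE, min_pct=1))  # [3, 4, 6, 7, 1, 8, 2, 5]
```

For the sample table this reproduces the sorted order listed in the message.
Because the rule needs the highest score before anything can be promoted, it
is applied here as a post-processing step over the scored results.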

Re: Solr Architecture discussion

2010-05-19 Thread rabahb


Do you have any insights that could help me and other people that might be
interested in that discussion?
Thanks.

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Architecture-discussion-tp825708p828658.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Storing RandomSortField

2010-05-19 Thread Leonardo Menezes

Hey,
For random sorting, random values are generated at runtime, using the seed
you passed as one of the parameters (among other things) to generate each
value. This way, if the value you use as the seed is the same in different
requests, the sort order should be the same. For debugging purposes, you
could also edit the random sort field class and put some traces in there, so
that it prints, for example, the id of the document and the value generated.
But the values won't be stored in the index.

cheers
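
To illustrate why the same seed gives the same order and why nothing needs to
be stored, here is a small sketch. It is explicitly not Solr's actual
algorithm: the hash-based key below is an assumption chosen only to show that
a sort key derived from (seed, document id) can be recomputed at query time.
In Solr's example schema the seed is typically carried in the dynamic field
name, e.g. sort=random_1234 desc.

```python
import hashlib


def random_sort_key(seed, doc_id):
    """Derive a pseudo-random but repeatable sort key from seed and doc id."""
    digest = hashlib.md5(f"{seed}:{doc_id}".encode("utf-8")).hexdigest()
    return int(digest, 16)


def random_order(doc_ids, seed):
    """Order documents by their derived key; nothing is read from the index."""
    return sorted(doc_ids, key=lambda doc_id: random_sort_key(seed, doc_id))


doc_ids = ["doc1", "doc2", "doc3", "doc4", "doc5"]
first = random_order(doc_ids, "1234")
again = random_order(doc_ids, "1234")
for doc_id in first:
    # Comparable to the debugging trace suggested above: id and generated value.
    print(doc_id, random_sort_key("1234", doc_id))
```

Printing the id next to the generated value is the kind of trace one could add
to the field class while debugging.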

On Wed, May 19, 2010 at 10:00 AM, Marco Martinez <
mmarti...@paradigmatecnologico.com> wrote:

> Hi Alexandre,
>
> I am not totally sure about this, but the random sort field its only used
> to
> do a random sort on your searchs, and you will to pass differents values to
> have differents sorts, so this only applies in the searchs, so no value is
> indexed. You will find more information here:
>
> http://lucene.apache.org/solr/api/org/apache/solr/schema/RandomSortField.html
>
> Marco Martínez Bautista
> http://www.paradigmatecnologico.com
> Avenida de Europa, 26. Ática 5. 3ª Planta
> 28224 Pozuelo de Alarcón
> Tel.: 91 352 59 42
>
>
> 2010/5/18 Alexandre Rocco 
>
> > Hi guys,
> >
> > Is there any way to mak a RandomSortField be stored?
> > I'm trying to do it for debugging purposes,
> > My intention is to take a look at the values that are stored there to
> > determine the sorting that is being applied to the results.
> >
> > I tried to make it a stored field as:
> > 
> >
> > And also tried to create another text field, copying the result from the
> > random field like this:
> > 
> > 
> >
> > Neither of the approaches worked.
> > Is there any restriction on this kind of field that prevents it from
> being
> > displayed in the results?
> >
> > Thanks,
> > Alexandre
> >
>

Re: Storing RandomSortField

2010-05-19 Thread Marco Martinez

Hi Alexandre,

I am not totally sure about this, but the random sort field is only used to
do a random sort on your searches, and you have to pass different values to
get different sort orders. It only applies at search time, so no value is
indexed. You will find more information here:
http://lucene.apache.org/solr/api/org/apache/solr/schema/RandomSortField.html

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/5/18 Alexandre Rocco 

> Hi guys,
>
> Is there any way to mak a RandomSortField be stored?
> I'm trying to do it for debugging purposes,
> My intention is to take a look at the values that are stored there to
> determine the sorting that is being applied to the results.
>
> I tried to make it a stored field as:
> 
>
> And also tried to create another text field, copying the result from the
> random field like this:
> 
> 
>
> Neither of the approaches worked.
> Is there any restriction on this kind of field that prevents it from being
> displayed in the results?
>
> Thanks,
> Alexandre
>

jmx issue with solr

2010-05-19 Thread Na_D


Hi,

I am trying to start solr with the following command :

java -Dsolr.solr.home="./example-DIH/solr/" -Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.port=3000


On doing so an error is reported :

Error: Password file read access must be restricted: C:\Program
Files\Java\jdk1.
6.0_18\jre\lib\management\jmxremote.password


The jmxremote.password file is there in the lib\management folder and it
has been set to read-only.
Still, the error persists. I am using Windows XP SP3 (Version 2002), in case
that is of any help.
Please send your suggestions.
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/jmx-issue-with-solr-tp828478p828478.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: disable caches in real time

2010-05-19 Thread Marco Martinez

Hi Chris,

Thank you for your answer.

I have always understood that when you do a commit (replication does one), a
new searcher is opened and you lose performance (queries per second) while the
caches are regenerated. I think I didn't explain my situation correctly
before: with my scheme I want to avoid this loss of performance in an
environment with frequent updates.

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/5/18 Chris Hostetter 

> : I want to know if there is any approach to disable caches in a specific
> core
> : from a multicore server.
>
> only via hte config.
>
> : I have a multicore server where the core0 will be listen to the queries
> and
> : other core (core1) that will be replicated from a master server. Once the
> : replication has been done, i will swap the cores. My point is that i want
> to
> : disable the caches in the core that is in charge of the replication to
> save
> : memory in the machine.
>
> that seems bizarely complicated -- replication can work against a "live"
> core, no need to do the swap yourself, the replicationHandler takes care
> of this for your transparently (ie: you have one core, replicating from a
> master -- the old index will be searched by users, and have caches, and
> when the new version of the index is ready, the replication handler will
> swap the *index* in that core (but the core itself never changes) ... it
> can even autowarm the caches on the new index for you before the swap if
> you configure it that way.
>
> -Hoss
>
>
