Re: MongoDB and Solr

2012-05-29 Thread Gora Mohanty
On 30 May 2012 03:51, rjain15  wrote:
> Hi Gora,
>
> I am working on a Mobile App, which is updating/accessing/searching data and
> I have created a simple prototype using Solr and the Update JSON / Get JSON
> functions of Solr.
>
> I came across some discussion on MongoDB and how it natively stores JSON
> data, and hence as I was looking at scalability of data storage/indexing, I
> was pausing to understand if I am on the right track of just using Solr or
> should I combine Solr with MongoDB as I am reading this blog post...
[...]

A discussion on web architecture is off-topic for
this list, and will also probably draw in people with
strong opinions. Here is a brief personal opinion,
but you are probably better off trying out a couple
of different architectural prototypes, and/or talking
to someone with experience in scalable sites.

First of all, you should consider whether you really
need a NoSQL store. This would depend on the
scale, and requirements of your app. IMHO, RDBMSes
now are proven systems with many years of learning
behind them. Thus, your question should be why
NoSQL, rather than the other way around.

Solr for search should do fine, and you already
know how to get JSON in and out of it. Incidentally,
we also tested out Solr as a NoSQL store (raw data,
and not JSON, though), and were quite happy with
the performance.

Regards,
Gora


RE: useFastVectorHighlighter doesn't work

2012-05-29 Thread ZHANG Liang F
Thanks a lot, it's quite clear now.

-Original Message-
From: Ahmet Arslan [mailto:iori...@yahoo.com] 
Sent: 29 May 2012 16:37
To: solr-user@lucene.apache.org
Subject: RE: useFastVectorHighlighter doesn't work

> So for highlight, stored="true" is
> required in any circumstance, right?

Exactly. http://wiki.apache.org/solr/FieldOptionsByUseCase



Re: MongoDB and Solr

2012-05-29 Thread Walter Underwood
Solr does not natively store/index/search arbitrary JSON documents.

It accepts JSON in a specific format for document input.
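
As a rough illustration of that "specific format": my understanding of the UpdateJSON wiki page is that a single document is wrapped in an "add" command whose "doc" holds flat field values. The sketch below only builds the request body. The function name and the sample field names are made up for illustration, and nothing here talks to a real Solr server.

```python
import json


def build_add_command(doc):
    """Wrap one flat document in an {"add": {"doc": ...}} update command."""
    for name, value in doc.items():
        # A field may hold a single value or a list of values, nothing deeper.
        values = value if isinstance(value, list) else [value]
        for item in values:
            if isinstance(item, (dict, list)):
                raise ValueError(f"field {name!r} is nested; flatten it first")
    return json.dumps({"add": {"doc": doc}})


if __name__ == "__main__":
    # Hypothetical field names, loosely based on the sample data in this thread.
    print(build_add_command({"id": "1001", "name": "ABC", "classes": ["0001"]}))
```

A nested value (an object inside a field) is rejected here on purpose, since that is the case that needs a transformation before indexing.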

wunder

On May 29, 2012, at 3:21 PM, rjain15 wrote:

> Hi Gora, 
> 
> I am working on a Mobile App, which is updating/accessing/searching data and
> I have created a simple prototype using Solr and the Update JSON / Get JSON
> functions of Solr. 
> 
> I came across some discussion on MongoDB and how it natively stores JSON
> data, and hence as I was looking at scalability of data storage/indexing, I
> was pausing to understand if I am on the right track of just using Solr or
> should I combine Solr with MongoDB as I am reading this blog post...
> 
> http://blog.knuthaugen.no/2010/04/cooking-with-mongodb-and-solr.html
> 
> Maybe this is an incorrect question, as you say -- MongoDB might be an
> entirely different beast. 
> 
> Apologies for a novice question. My point was: for mobile / consumer web
> apps, what are the architectural considerations? I don't want it to be
> overkill, so if Solr can natively store/index/search JSON documents, then
> that is the solution I can build on top of.
> 
> 
> Thanks
> Rajesh
> 






Re: MongoDB and Solr

2012-05-29 Thread rjain15
Hi Gora, 

I am working on a Mobile App, which is updating/accessing/searching data and
I have created a simple prototype using Solr and the Update JSON / Get JSON
functions of Solr. 

I came across some discussion on MongoDB and how it natively stores JSON
data, and hence as I was looking at scalability of data storage/indexing, I
was pausing to understand if I am on the right track of just using Solr or
should I combine Solr with MongoDB as I am reading this blog post...

http://blog.knuthaugen.no/2010/04/cooking-with-mongodb-and-solr.html

Maybe this is an incorrect question, as you say -- MongoDB might be an
entirely different beast. 

Apologies for a novice question. My point was: for mobile / consumer web
apps, what are the architectural considerations? I don't want it to be
overkill, so if Solr can natively store/index/search JSON documents, then
that is the solution I can build on top of.


Thanks
Rajesh



--
View this message in context: 
http://lucene.472066.n3.nabble.com/MongoDB-and-Solr-tp3986636p3986729.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: how to reduce the result size to 2-3 lines and expand based on user interest

2012-05-29 Thread srini
hi iorixxx,

Sorry I missed your reply. Let me put my requirement in another way.

I have a description field which holds a lot of text (2-3 paragraphs) and it is
indexed.

When a user searches for a word and Solr finds that word in the description, I
want to show only the 2-3 lines of content that match the search word. Any
ideas how to do this?

Thanks in Advance!!!
Srini


--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-reduce-the-result-size-to-2-3-lines-and-expand-based-on-user-interest-tp3985692p3986727.html


Re: Multi-words synonyms matching

2012-05-29 Thread Lance Norskog
I recently had the same use case. I wound up doing this: at both
index and query time, the synonym filter is set to 'expand=false'. All
multi-word synonyms map to one single-word synonym (per group). This
way, only the main word is indexed or queried.

If the synonym file changes, you have to re-index the matching content.
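
To show the effect of that setup, here is a small sketch that imitates it outside Solr. It is not Solr's synonym filter, only a toy that replaces any phrase from a group with the group's first term, which is what I mean by "the main word". The function name and the sample group are made up for illustration.

```python
def collapse_synonyms(text, groups):
    """Replace every synonym phrase in text with the first term of its group."""
    tokens = text.lower().split()
    # Map each phrase (as a tuple of tokens) to its group's main term.
    table = {}
    for group in groups:
        main = group[0].lower()
        for phrase in group:
            table[tuple(phrase.lower().split())] = main
    longest = max((len(key) for key in table), default=1)

    out, i = [], 0
    while i < len(tokens):
        # Try the longest possible phrase first, then shorter ones.
        for n in range(min(longest, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in table:
                out.append(table[key])
                i += n
                break
        else:
            # No phrase starts here, so keep the token unchanged.
            out.append(tokens[i])
            i += 1
    return out


if __name__ == "__main__":
    groups = [["mairie", "hotel de ville"]]
    print(collapse_synonyms("Hotel de ville Paris", groups))
```

Because the indexed side only ever contains the main word, a changed group means previously indexed text may carry the wrong word, hence the re-index.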

On Tue, May 29, 2012 at 1:27 PM, elisabeth benoit
 wrote:
> Hello Bernd,
>
> Thanks a lot for your answer. I'll work on this.
>
> Best regards,
> Elisabeth
>
> 2012/5/29 Bernd Fehling 
>
>> Hello Elisabeth,
>>
>> my synonyms.txt is like your 2nd example:
>>
>> naturwald, φυσικό\ δάσος, естествена\ гора, prírodný\ les, naravni\ gozd,
>> foresta\ naturale, natuurbos, natural\ forest, bosque\ natural,
>> természetes\ erdő,
>> natūralus\ miškas, prirodna\ šuma, dabiskais\ mežs, floresta\ natural,
>> naturskov,
>> forêt\ naturelle, naturskog, přírodní\ les, luonnonmetsä, pădure\ naturală,
>> las\ naturalny, natürlicher\ wald
>>
>>
>> An example from my system with debugging turned on and searching for
>> "naturwald":
>>
>> 
>>  naturwald
>>  naturwald
>>  textth:naturwald textth:"φυσικό δάσος"
>> textth:"естествена гора"
>> textth:"prírodný les" textth:"naravni gozd" textth:"foresta naturale"
>> textth:natuurbos
>> textth:"natural forest" textth:"bosque natural" textth:"természetes erdő"
>> textth:"natūralus miškas" textth:"prirodna šuma" textth:"dabiskais mežs"
>> textth:"floresta natural" textth:naturskov textth:"forêt naturelle"
>> textth:naturskog
>> textth:"přírodní les" textth:luonnonmetsä textth:"pădure naturală"
>> textth:"las naturalny"
>> textth:"natürlicher wald"
>> ...
>>
>> As you can see my search for "naturwald" extends to single and multiword
>> synonyms e.g. "forêt naturelle"
>>
>>
>> My SynonymFilterFactory has the following settings:
>>
>> org.apache.solr.analysis.SynonymFilterFactory
>> {tokenizerFactory=solr.KeywordTokenizerFactory,
>> synonyms=synonyms_eurovoc_desc_desc_ufall.txt, expand=true, format=solr,
>> ignoreCase=true,
>> luceneMatchVersion=LUCENE_36}
>>
>> But as I already mentioned, there is much more work to be done to get it
>> running than
>> just using SynonymFilterFactory.
>>
>> Regards
>> Bernd
>>
>>
>>
>> Am 23.05.2012 08:49, schrieb elisabeth benoit:
>> > Hello Bernd,
>> >
>> > Thanks for your advice.
>> >
>> > I have one question: how did you manage to map one word to a multiwords
>> > synonym???
>> >
>> > I've tried (in synonyms.txt)
>> >
>> > mairie, hotel de ville
>> >
>> > mairie, hotel\ de\ ville
>> >
>> > mairie => mairie, hotel de ville
>> >
>> > mairie => mairie, hotel\ de\ ville
>> >
>> > but nothing prevents mairie from matching with "hotel"...
>> >
>> > The only way I found is to use
>> > tokenizerFactory="solr.KeywordTokenizerFactory" in my synonyms
>> declaration
>> > in schema.xml, but then since "mairie" is not alone in my index field, it
>> > doesn't match.
>> >
>> >
>> > best regards,
>> > Elisabeth
>> >
>> >
>> >
>> >
>> > the only way I found, I schema.xml, is to use
>> >
>> >
>> >
>> > 2012/5/15 Bernd Fehling 
>> >
>> >> Without reading the whole thread let me say that you should not trust
>> >> the solr admin analysis. It takes the whole multiword search and runs
>> >> it all together at once through each analyzer step (factory).
>> >> But this is not how the real system works. First pitfall, the query
>> parser
>> >> is also splitting at white space (if not a phrase query). Due to this,
>> >> a multiword query is send chunk after chunk through the analyzer and,
>> >> second pitfall, each chunk runs through the whole analyzer by its own.
>> >>
>> >> So if you are dealing with multiword synonyms you have the following
>> >> problems. Either you turn your query into a phrase so that the whole
>> >> phrase is analyzed at once and therefore looked up as multiword synonym
>> >> but phrase queries are not analyzed !!! OR you send your query chunk
>> >> by chunk through the analyzer but then they are not multiwords anymore
>> >> and are not found in your synonyms.txt.
>> >>
>> >> From my experience I can say that it requires some deep work to get it
>> done
>> >> but it is possible. I have connected a thesaurus to solr which is doing
>> >> query time expansion (no need to reindex if the thesaurus changes).
>> >> The thesaurus holds synonyms and "used for terms" in 24 languages. So
>> >> it is also some kind of language translation. And naturally the
>> thesaurus
>> >> translates from single term to multi term synonyms and vice versa.
>> >>
>> >> Regards,
>> >> Bernd
>> >>
>> >>
>> >> Am 14.05.2012 13:54, schrieb elisabeth benoit:
>> >>> Just for the record, I'd like to conclude this thread
>> >>>
>> >>> First, you were right, there was no behaviour difference between fq
>> and q
>> >>> parameters.
>> >>>
>> >>> I realized that:
>> >>>
>> >>> 1) my synonym (hotel de ville) has a stopword in it (de) and since I
>> used
>> >>> tokenizerFactory="solr.KeywordTokenizerFactory" in my synonyms
>> >> declaration,
>> >>> there was no stopword removal 

Re: Multi-words synonyms matching

2012-05-29 Thread elisabeth benoit
Hello Bernd,

Thanks a lot for your answer. I'll work on this.

Best regards,
Elisabeth

2012/5/29 Bernd Fehling 

> Hello Elisabeth,
>
> my synonyms.txt is like your 2nd example:
>
> naturwald, φυσικό\ δάσος, естествена\ гора, prírodný\ les, naravni\ gozd,
> foresta\ naturale, natuurbos, natural\ forest, bosque\ natural,
> természetes\ erdő,
> natūralus\ miškas, prirodna\ šuma, dabiskais\ mežs, floresta\ natural,
> naturskov,
> forêt\ naturelle, naturskog, přírodní\ les, luonnonmetsä, pădure\ naturală,
> las\ naturalny, natürlicher\ wald
>
>
> An example from my system with debugging turned on and searching for
> "naturwald":
>
> 
>  naturwald
>  naturwald
>  textth:naturwald textth:"φυσικό δάσος"
> textth:"естествена гора"
> textth:"prírodný les" textth:"naravni gozd" textth:"foresta naturale"
> textth:natuurbos
> textth:"natural forest" textth:"bosque natural" textth:"természetes erdő"
> textth:"natūralus miškas" textth:"prirodna šuma" textth:"dabiskais mežs"
> textth:"floresta natural" textth:naturskov textth:"forêt naturelle"
> textth:naturskog
> textth:"přírodní les" textth:luonnonmetsä textth:"pădure naturală"
> textth:"las naturalny"
> textth:"natürlicher wald"
> ...
>
> As you can see my search for "naturwald" extends to single and multiword
> synonyms e.g. "forêt naturelle"
>
>
> My SynonymFilterFactory has the following settings:
>
> org.apache.solr.analysis.SynonymFilterFactory
> {tokenizerFactory=solr.KeywordTokenizerFactory,
> synonyms=synonyms_eurovoc_desc_desc_ufall.txt, expand=true, format=solr,
> ignoreCase=true,
> luceneMatchVersion=LUCENE_36}
>
> But as I already mentioned, there is much more work to be done to get it
> running than
> just using SynonymFilterFactory.
>
> Regards
> Bernd
>
>
>
> Am 23.05.2012 08:49, schrieb elisabeth benoit:
> > Hello Bernd,
> >
> > Thanks for your advice.
> >
> > I have one question: how did you manage to map one word to a multiwords
> > synonym???
> >
> > I've tried (in synonyms.txt)
> >
> > mairie, hotel de ville
> >
> > mairie, hotel\ de\ ville
> >
> > mairie => mairie, hotel de ville
> >
> > mairie => mairie, hotel\ de\ ville
> >
> > but nothing prevents mairie from matching with "hotel"...
> >
> > The only way I found is to use
> > tokenizerFactory="solr.KeywordTokenizerFactory" in my synonyms
> declaration
> > in schema.xml, but then since "mairie" is not alone in my index field, it
> > doesn't match.
> >
> >
> > best regards,
> > Elisabeth
> >
> >
> >
> >
> > the only way I found, I schema.xml, is to use
> >
> >
> >
> > 2012/5/15 Bernd Fehling 
> >
> >> Without reading the whole thread let me say that you should not trust
> >> the solr admin analysis. It takes the whole multiword search and runs
> >> it all together at once through each analyzer step (factory).
> >> But this is not how the real system works. First pitfall, the query
> parser
> >> is also splitting at white space (if not a phrase query). Due to this,
> >> a multiword query is send chunk after chunk through the analyzer and,
> >> second pitfall, each chunk runs through the whole analyzer by its own.
> >>
> >> So if you are dealing with multiword synonyms you have the following
> >> problems. Either you turn your query into a phrase so that the whole
> >> phrase is analyzed at once and therefore looked up as multiword synonym
> >> but phrase queries are not analyzed !!! OR you send your query chunk
> >> by chunk through the analyzer but then they are not multiwords anymore
> >> and are not found in your synonyms.txt.
> >>
> >> From my experience I can say that it requires some deep work to get it
> done
> >> but it is possible. I have connected a thesaurus to solr which is doing
> >> query time expansion (no need to reindex if the thesaurus changes).
> >> The thesaurus holds synonyms and "used for terms" in 24 languages. So
> >> it is also some kind of language translation. And naturally the
> thesaurus
> >> translates from single term to multi term synonyms and vice versa.
> >>
> >> Regards,
> >> Bernd
> >>
> >>
> >> Am 14.05.2012 13:54, schrieb elisabeth benoit:
> >>> Just for the record, I'd like to conclude this thread
> >>>
> >>> First, you were right, there was no behaviour difference between fq
> and q
> >>> parameters.
> >>>
> >>> I realized that:
> >>>
> >>> 1) my synonym (hotel de ville) has a stopword in it (de) and since I
> used
> >>> tokenizerFactory="solr.KeywordTokenizerFactory" in my synonyms
> >> declaration,
> >>> there was no stopword removal in the indewed expression, so when
> >> requesting
> >>> "hotel de ville", after stopwords removal in query, Solr was comparing
> >>> "hotel de ville"
> >>> with "hotel ville"
> >>>
> >>> but my queries never even got to that point since
> >>>
> >>> 2) I made a mistake using "mairie" alone in the admin interface when
> >>> testing my schema. The real field was something like "collectivités
> >>> territoriales mairie",
> >>> so the synonym "hotel de ville" was not even applied, because of the
> >>> tokenizerFactory="solr.Keyw

Re: Many Cores with Solr

2012-05-29 Thread Mike Douglass
That was one of my concerns. To date I've been using lucene directly and
pointing it at an index for the current authenticated user. solr cores
seemed to come close to that.

Is the issue with a lot of cores the cost of creating them, or of using many
cores concurrently?


Erik Hatcher-4 wrote
> 
> You do get relevancy related "leakage" though.  With users content all in
> the same index and using the same field names, term and document
> frequencies across the index will be used for scoring.  This may be (and
> has been) a good reason to keep separately searchable content in different
> indexes/cores.
> 


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Many-Cores-with-Solr-tp3161889p3986710.html


Re: suggestions developing a multi-version concurrency control (MVCC) mechanism

2012-05-29 Thread Lance Norskog
Solr uses a flat schema. You can store old versions, but you have to
encode them somehow and save them as data.
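
A minimal sketch of the compound-id idea from earlier in this thread, done in plain Python rather than in Solr. Everything here is hypothetical: the "!" separator, the zero-padded timestamp, and the field names doc_id / timestamp / body are assumptions made so the example is self-contained.

```python
def versioned_id(doc_id, timestamp):
    """Build a compound unique id; zero-padding keeps string order equal to time order."""
    return f"{doc_id}!{timestamp:020d}"


def visible_version(versions, doc_id, as_of):
    """Return the newest version of doc_id that is not fresher than as_of, or None."""
    candidates = [
        v for v in versions
        if v["doc_id"] == doc_id and v["timestamp"] <= as_of
    ]
    return max(candidates, key=lambda v: v["timestamp"], default=None)


if __name__ == "__main__":
    versions = [
        {"doc_id": "a", "timestamp": 10, "body": "v1"},
        {"doc_id": "a", "timestamp": 20, "body": "v2"},
        {"doc_id": "a", "timestamp": 30, "body": "v3"},
    ]
    print(versioned_id("a", 20))
    print(visible_version(versions, "a", 25))
```

In Solr itself the same lookup would presumably be a filter on the timestamp plus a sort, with each version stored as its own flat document.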

On Tue, May 29, 2012 at 7:20 AM, Nicholas Ball
 wrote:
>
> Hmmm interesting, that will definitely work and may be the way to go.
> Ideally, I'd rather store the older versions within a field of the newest
> if possible.
> Can one create a custom field that holds other objects?
>
> Nick
>
> On Mon, 28 May 2012 17:07:06 -0700, Lance Norskog 
> wrote:
>> You can use the document id and timestamp as a compound unique id.
>> Then the search would also sort by id, then by timestamp. Result
>> grouping might let you pick the most recent document from each of the
>> sorted docs.
>>
>> On Mon, May 28, 2012 at 3:15 PM, Nicholas Ball
>>  wrote:
>>>
>>> Hello all,
>>>
>>> For the first step of the distributed snapshot isolation system I'm
>>> developing for Solr, I'm going to need to have a MVCC mechanism as
>>> opposed
>>> to the single-version concurrency control mechanism already developed
>>> (DistributedUpdateProcessor class). I'm trying to find the very best
> way
>>> to
>>> develop this into Solr 4.x (trunk) and so any help would be greatly
>>> appreciated!
>>>
>>> Essentially I need to be able to store multiple version of a document
> so
>>> that when you look up a document with a given timestamp, you're given
> the
>>> correct version (anything the same or older, not fresher). The older
>>> versioned documents need to be stored in the index itself to ensure
> they
>>> are durable and can be manipulated as other Solr data can be.
>>>
>>> One way to do this is to store the old versioned Solr documents within
>>> the
>>> latest Solr Document, but I'm not sure this is even possible?
>>> Alternatively, I could have the latest versioned Document store the
>>> unique
>>> keys which point to other older documents. The problem with this is
> that
>>> it
>>> complicates things having various partial objects which all combine as
>>> one
>>> logically document.
>>>
>>> Are there any suggestions as to the best way to develop this feature?
>>>
>>> Thank you in advance for any help you can spare!
>>>
>>> Nicholas



-- 
Lance Norskog
goks...@gmail.com


Re: Many Cores with Solr

2012-05-29 Thread Michael Della Bitta
In our particular case, we're using this index to do prefix searches
for autocomplete of sparse keyword data, so we don't have much to
worry about on this front, but I do agree that it's a consideration
for those use cases that do reveal information via ranking.

Michael Della Bitta


Appinions, Inc. -- Where Influence Isn’t a Game.
http://www.appinions.com


On Tue, May 29, 2012 at 4:00 PM, Erik Hatcher  wrote:
> You do get relevancy related "leakage" though.  With users content all in the 
> same index and using the same field names, term and document frequencies 
> across the index will be used for scoring.  This may be (and has been) a good 
> reason to keep separately searchable content in different indexes/cores.
>
>        Erik
>
>
> On May 29, 2012, at 15:07 , Mike Douglass wrote:
>
>> Thank you.
>>
>> That sounds good - are we sure to get no leakage with this approach?
>>
>> I'd be indexing personal information which must not be delivered without
>> authentication.
>>
>> The solr instance is front-ended by bedework which can handle the auth and
>> adding a query term.
>>
>>> IMO it would be better (from Solr's perspective) to handle the security
>>> w/ the application code.  Each query could include a "?fq=userID:12345..."
>>> which would limit results to only what that user is allowed to see.
>>
>>
>>
>> --
>> View this message in context: 
>> http://lucene.472066.n3.nabble.com/Many-Cores-with-Solr-tp3161889p3986675.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Many Cores with Solr

2012-05-29 Thread Erik Hatcher
You do get relevancy related "leakage" though.  With users content all in the 
same index and using the same field names, term and document frequencies across 
the index will be used for scoring.  This may be (and has been) a good reason 
to keep separately searchable content in different indexes/cores.
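
To make the "leakage" concrete: with classic TF-IDF scoring, the weight of a term depends on how many documents in the whole index contain it. The sketch below assumes the formula idf = 1 + ln(numDocs / (docFreq + 1)), which is how I remember Lucene's default similarity of that era; treat the formula and the invented numbers as assumptions.

```python
import math


def idf(num_docs, doc_freq):
    """Inverse document frequency: 1 + ln(num_docs / (doc_freq + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


if __name__ == "__main__":
    # One user's private core: 10 documents, the term occurs in 1 of them.
    print("separate core:", round(idf(10, 1), 3))
    # Shared index: other users add 1000 documents, 500 of which contain the term.
    print("shared index: ", round(idf(1010, 501), 3))
```

The same term in the same user's documents gets a different weight depending on what other users have indexed, so ranking can shift with (and hint at) content the user cannot see, even when a filter hides the documents themselves.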

Erik


On May 29, 2012, at 15:07 , Mike Douglass wrote:

> Thank you.
> 
> That sounds good - are we sure to get no leakage with this approach?
> 
> I'd be indexing personal information which must not be delivered without
> authentication.
> 
> The solr instance is front-ended by bedework which can handle the auth and
> adding a query term.
> 
>> IMO it would be better (from Solr's perspective) to handle the security
>> w/ the application code.  Each query could include a "?fq=userID:12345..."
>> which would limit results to only what that user is allowed to see.
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Many-Cores-with-Solr-tp3161889p3986675.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Re: Many Cores with Solr

2012-05-29 Thread Michael Della Bitta
It's a similar approach to using SQL to filter the rows brought back
for a particular user from a table. It's strong as long as you write
your queries correctly, you store your data properly, and you guard
against injection and privilege escalation. There's an added bonus in
this case in that the user's submitted text isn't in the same query as
the part that limits the rows they have access to, but if you're doing
proper escaping of the query text, that shouldn't be relied on anyway.
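
On the application side, that could look roughly like the sketch below. It rests on assumptions: user ids are numeric (as in the userID:12345 example), the field is named userID, and escape_term is a hypothetical helper that backslash-escapes the query-syntax characters I know of. It does not neutralise operators such as AND / OR, so it is not a complete defence on its own.

```python
from urllib.parse import parse_qs, urlencode

# Characters with special meaning in the Lucene/Solr query syntax.
SPECIAL = set('+-&|!(){}[]^"~*?:\\/')


def escape_term(text):
    """Backslash-escape query-syntax characters in user-supplied text."""
    return "".join("\\" + ch if ch in SPECIAL else ch for ch in text)


def build_query(user_text, user_id):
    """Keep the user's text in q and the access restriction in a separate fq."""
    params = [
        ("q", escape_term(user_text)),
        # int() rejects anything that is not a plain number.
        ("fq", f"userID:{int(user_id)}"),
        ("wt", "json"),
    ]
    return urlencode(params)


if __name__ == "__main__":
    query_string = build_query("foo OR userID:999", 12345)
    print(query_string)
    print(parse_qs(query_string))
```

The filter is built only from the authenticated id, never from anything the user typed, which is the "added bonus" mentioned above.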

Michael Della Bitta


Appinions, Inc. -- Where Influence Isn’t a Game.
http://www.appinions.com


On Tue, May 29, 2012 at 3:07 PM, Mike Douglass  wrote:
> Thank you.
>
> That sounds good - are we sure to get no leakage with this approach?
>
> I'd be indexing personal information which must not be delivered without
> authentication.
>
> The solr instance is front-ended by bedework which can handle the auth and
> adding a query term.


Re: MongoDB and Solr

2012-05-29 Thread Gora Mohanty
On 29 May 2012 22:27, rjain15  wrote:
> Hi
>
> I am building web app/mobile app, where users can update information
> frequently and there is a search function to quick search the information
> using different types of searches.
>
> Most of the data is going to be posted in JSON Format and stored in JSON
> format
>
> I have a few questions on the architecture choices, I am relatively new to
> Solr and MongoDB.
>
> 1. Should I use MongoDB to store the JSON documents, or does Solr natively
> store the documents in the data directory

Sorry, but you do not provide nearly enough information
for people to be able to make sensible suggestions. What
is your use case? MongoDB is largely a different beast from
Solr. What do you think merits its use, and where does it
fit in your scheme of things? In many cases, one could have
both MongoDB, and Solr. In other cases, one or the other
might better fit the bill.

> 2. Does Solr require a specific schema for the JSON document.

You can POST a JSON document to Solr, and get
JSON output back. Not sure if this meets your needs,
but please take a look at:
http://wiki.apache.org/solr/UpdateJSON
http://wiki.apache.org/solr/SolJSON

Regards,
Gora


Re: MongoDB and Solr

2012-05-29 Thread rjain15
Hi 

This is a sample schema, but it can be more nested as I build the app. As
more students enroll, or more classes are added, it will grow. 





{
  "colleges": [
    {
      "college": {
        "id": "college Id",
        "classes": [
          {
            "id": "0001",
            "type": "speech",
            "name": "Speech Class",
            "credits": 3,
            "students": [
              { "id": "1001", "name": "ABC" },
              { "id": "1002", "name": "PQQ", ... },
              { "id": "1003", "name": "AAA", ... },
              { "id": "1004", "name": "ASA", ... }
            ],
            "instructors": [
              { "id": "5001", "name": "ASAS" },
              { "id": "5002", "name": "ASAA" }
            ]
          }
        ],
        "locations": [
          { "id": "6001", "address": "Address-1" },
          { "id": "6001", "address": "Address-2" }
        ]
      }
    }
  ]
}



--
View this message in context: 
http://lucene.472066.n3.nabble.com/MongoDB-and-Solr-tp3986637p3986676.html


Re: Many Cores with Solr

2012-05-29 Thread Mike Douglass
Thank you.

That sounds good - are we sure to get no leakage with this approach?

I'd be indexing personal information which must not be delivered without
authentication.

The solr instance is front-ended by bedework which can handle the auth and
adding a query term.

> IMO it would be better (from Solr's perspective) to handle the security
> w/ the application code.  Each query could include a "?fq=userID:12345..."
> which would limit results to only what that user is allowed to see.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Many-Cores-with-Solr-tp3161889p3986675.html


Example setup of using Solr 3.6.0 with Jetty 7 (7.6.3)?

2012-05-29 Thread Aaron Daubman
Greetings,

Has anybody gotten Solr 3.6.0 to work well with Jetty 7.6.3, and if so,
would you mind sharing your config files / directory structure / other
useful details?

Thanks,
 Aaron


Re: MongoDB and Solr

2012-05-29 Thread Jack Krupansky
Could you give us an example of one of your documents? Then we can give you 
better feedback on what makes sense within Solr.


-- Jack Krupansky

-Original Message- 
From: rjain15

Sent: Tuesday, May 29, 2012 2:20 PM
To: solr-user@lucene.apache.org
Subject: Re: MongoDB and Solr

Hi Jack

Thanks for the information. I do have multi-level nesting of JSON data.

So back to my questions, apologies for repeating...

1. Should I use MongoDB to store the JSON documents, or does Solr natively
store the documents in the data directory

2. Does Solr require a specific schema for the JSON document.

Thanks
Rajesh


--
View this message in context: 
http://lucene.472066.n3.nabble.com/MongoDB-and-Solr-tp3986637p3986662.html



Re: MongoDB and Solr

2012-05-29 Thread Michael Della Bitta
1. Yes, and 2. Yes. :)

Solr's adding more NoSQL-like features for 4.0, but in the meantime,
you're better off storing documents with a complex schema in a
document store and using Solr for findability. Basically the schema
for a document in Solr/Lucene is flat (although it can contain
arbitrarily-named fields), so your document will require some sort of
transformation for indexing.
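
One possible shape for that transformation, as a sketch only: walk the nested document and emit flat, multi-valued fields with dotted names. The dotted naming and the function name are my own assumptions, not anything Solr prescribes.

```python
def flatten(value, prefix="", out=None):
    """Flatten nested dicts/lists into {dotted.field.name: [values, ...]}."""
    if out is None:
        out = {}
    if isinstance(value, dict):
        for key, child in value.items():
            flatten(child, f"{prefix}.{key}" if prefix else key, out)
    elif isinstance(value, list):
        # List items share the parent's field name and become multiple values.
        for child in value:
            flatten(child, prefix, out)
    else:
        out.setdefault(prefix, []).append(value)
    return out


if __name__ == "__main__":
    nested = {
        "id": "0001",
        "name": "Speech Class",
        "students": [
            {"id": "1001", "name": "ABC"},
            {"id": "1002", "name": "PQQ"},
        ],
    }
    print(flatten(nested))
```

Note what is lost: after flattening, "students.id" and "students.name" are two independent lists, so a query can no longer tell which name belonged to which id. That is one reason to keep the original document in a separate store.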

Michael Della Bitta


Appinions, Inc. -- Where Influence Isn’t a Game.
http://www.appinions.com


On Tue, May 29, 2012 at 2:20 PM, rjain15  wrote:
> Hi Jack
>
> Thanks for the information. I do have multi-level nesting of JSON data.
>
> So back to my questions, apologies for repeating...
>
> 1. Should I use MongoDB to store the JSON documents, or does Solr natively
> store the documents in the data directory
>
> 2. Does Solr require a specific schema for the JSON document.
>
> Thanks
> Rajesh
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/MongoDB-and-Solr-tp3986637p3986662.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: MongoDB and Solr

2012-05-29 Thread rjain15
Hi Jack

Thanks for the information. I do have multi-level nesting of JSON data. 

So back to my questions, apologies for repeating...

1. Should I use MongoDB to store the JSON documents, or does Solr natively 
store the documents in the data directory 

2. Does Solr require a specific schema for the JSON document. 

Thanks
Rajesh


--
View this message in context: 
http://lucene.472066.n3.nabble.com/MongoDB-and-Solr-tp3986637p3986662.html


Re: MongoDB and Solr

2012-05-29 Thread Jack Krupansky
Although Solr uses XML by default for document updates and query responses, 
JSON is a supported option.


To post documents in JSON, see:
http://wiki.apache.org/solr/UpdateJSON

To retrieve query results in JSON, see:
http://wiki.apache.org/solr/SolJSON

That works well for relatively flat data (each field has a simple value or 
list of values), but less well if you have complex structure within an 
individual field value (e.g., multi-level nesting of JSON for a single field 
value.) For the latter, you would have to store the JSON as a string for 
such a field.
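
A small sketch of that last option, assuming the nested part only needs to be stored and returned, not searched field by field. The function name and sample fields are made up for illustration.

```python
import json


def stringify_nested(doc):
    """Return a copy of doc with nested values replaced by their JSON text."""
    out = {}
    for name, value in doc.items():
        nested = isinstance(value, dict) or (
            isinstance(value, list)
            and any(isinstance(item, (dict, list)) for item in value)
        )
        # Simple values and lists of simple values pass through unchanged.
        out[name] = json.dumps(value, sort_keys=True) if nested else value
    return out


if __name__ == "__main__":
    doc = {
        "id": "0001",
        "tags": ["speech", "credits-3"],
        "students": [{"id": "1001", "name": "ABC"}],
    }
    flat = stringify_nested(doc)
    print(flat)
    # The client decodes the string again after retrieving the document.
    print(json.loads(flat["students"]))
```

The stored string comes back intact, but to the index it is just one block of text, so structured queries against the nested values are not possible this way.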


-- Jack Krupansky

-Original Message- 
From: rjain15

Sent: Tuesday, May 29, 2012 12:57 PM
To: solr-user@lucene.apache.org
Subject: MongoDB and Solr

Hi

I am building a web app/mobile app where users can update information
frequently, and there is a search function to quickly search the information
using different types of searches.

Most of the data is going to be posted in JSON format and stored in JSON
format.

I have a few questions on the architecture choices, I am relatively new to
Solr and MongoDB.

1. Should I use MongoDB to store the JSON documents, or does Solr natively
store the documents in the data directory

2. Does Solr require a specific schema for the JSON document.


Thanks
Rajesh



--
View this message in context: 
http://lucene.472066.n3.nabble.com/MongoDB-and-Solr-tp3986637.html



Re: UpdateRequestProcessor : flattened values

2012-05-29 Thread Jack Krupansky
Sounds good. Then all that will be needed is a way to disable the SolrCell 
flattening so that other update processors can see the unflattened field 
values before they are handed off to a ConcatFieldUpdateProcessor.


-- Jack Krupansky

-Original Message- 
From: Chris Hostetter

Sent: Tuesday, May 29, 2012 12:43 PM
To: solr-user@lucene.apache.org
Subject: Re: UpdateRequestProcessor : flattened values


: And it might make sense to have a "multi-value flattening" attribute for Solr
: itself rather than in SolrCell.

Coming in 4.0...

https://builds.apache.org/view/G-L/view/Lucene/job/Solr-trunk/javadoc/org/apache/solr/update/processor/ConcatFieldUpdateProcessorFactory.html

DOC
Concatenates multiple values for fields matching the specified conditions
using a configurable delimiter which defaults to ", ".

By default, this processor concatenates the values for any field name
which according to the schema is multiValued="false" and uses TextField or
StrField
DOC



-Hoss 



MongoDB and Solr

2012-05-29 Thread rjain15
Hi

I am building a web app/mobile app where users can update information
frequently, and there is a search function to quickly search the information
using different types of searches.

Most of the data is going to be posted in JSON format and stored in JSON
format.

I have a few questions on the architecture choices, I am relatively new to
Solr and MongoDB.  

1. Should I use MongoDB to store the JSON documents, or does Solr natively
store the documents in the data directory

2. Does Solr require a specific schema for the JSON document. 


Thanks
Rajesh



--
View this message in context: 
http://lucene.472066.n3.nabble.com/MongoDB-and-Solr-tp3986637.html


MongoDB and Solr

2012-05-29 Thread rjain15
Hi

I am building a web app/mobile app where users can update information
frequently, and there is a search function to quickly search the information
using different types of searches.

Most of the data is going to be posted in JSON format and stored in JSON
format.

I have a few questions on the architecture choices, I am relatively new to
Solr and MongoDB.  

1. Should I use MongoDB to store the JSON documents, or does Solr natively
store the documents in the data directory

2. Does Solr require a specific schema for the JSON document. 


Thanks
Rajesh



--
View this message in context: 
http://lucene.472066.n3.nabble.com/MongoDB-and-Solr-tp3986636.html


Relevancy ranking for synonym matches

2012-05-29 Thread Gau
I was wondering if there is any solution for this.
Currently I expand my results to match the synonyms at query time.

So if I enter James, I get results for Jim, Gomes, Game etc., as the query
is expanded with the synonyms for James. But since this is just a
one-word match, tf, idf and the other scoring factors don't make sense. I have
reset those factors to 1. Hence the results I get all have an equal score.

What I really want to do is sort these results by Levenshtein distance
without using the ~ operator. The issue with using ~ is that if I have a
synonym which is radically different (say Greg for James) and I use James~0,
Greg would not match closely with James, and the number of results returned
would be less than the actual number of synonym matches.

So my use case is: without reducing the number of results, I want to sort
them by Levenshtein distance, or closest string match, to the original query.
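
As a rough illustration of the ordering asked for above, here is a minimal Python sketch that re-sorts an already retrieved result list by edit distance to the original query. It runs on the client side and is not a Solr feature; the function names and the sample names are only assumptions for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def sort_by_distance(query: str, names: list) -> list:
    """Keep every result, but order by closeness to the query (ties by name)."""
    q = query.lower()
    return sorted(names, key=lambda n: (levenshtein(q, n.lower()), n))


if __name__ == "__main__":
    # Hypothetical synonym matches for the query "James".
    print(sort_by_distance("James", ["Jim", "Gomes", "Game", "Greg"]))
```

Because the sort happens after retrieval, a distant synonym such as Greg stays in the result set and simply ends up last.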

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Relevancy-ranking-for-synonym-matches-tp3986634.html
Sent from the Solr - User mailing list archive at Nabble.com.


Solr backup / replication internals

2012-05-29 Thread Ganesh
Hi,

Could anyone explain the internals of backup / replication? Please
give me more information, such as the do's and don'ts of backup / replication.

1. Is the backup / replication incremental?

2. Can Solr add to / update the index while a backup / replication is running?

3. The backup command and the backup script both do a file copy. Is there any
difference between them?

Regards
Ganesh


Re: UpdateRequestProcessor : flattened values

2012-05-29 Thread Chris Hostetter

: And it might make sense to have a "multi-value flattening" attribute for Solr
: itself rather than in SolrCell.

Coming in 4.0...

https://builds.apache.org/view/G-L/view/Lucene/job/Solr-trunk/javadoc/org/apache/solr/update/processor/ConcatFieldUpdateProcessorFactory.html

DOC
Concatenates multiple values for fields matching the specified conditions 
using a configurable delimiter which defaults to ", ".

By default, this processor concatenates the values for any field name 
which according to the schema is multiValued="false" and uses TextField or 
StrField
DOC
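
To make the javadoc excerpt above concrete, here is a small Python sketch of the described behaviour: multiple values of selected fields are joined with a delimiter that defaults to ", ". It only mimics the effect on a document represented as a dict; it is not the Solr implementation, and the function and field names are assumptions.

```python
def concat_field_values(doc, single_valued_fields, delimiter=", "):
    """Return a copy of doc where list values of the named fields are joined."""
    flattened = {}
    for name, value in doc.items():
        if name in single_valued_fields and isinstance(value, list):
            # Join all values into one string, as the processor is described to do.
            flattened[name] = delimiter.join(str(v) for v in value)
        else:
            # Fields outside the selection are left untouched.
            flattened[name] = value
    return flattened


if __name__ == "__main__":
    doc = {"title": ["Part one", "Part two"], "tags": ["a", "b"]}
    print(concat_field_values(doc, {"title"}))
```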



-Hoss


Re: Many Cores with Solr

2012-05-29 Thread Michael Della Bitta
That's what we do. It has the advantage of letting the general queries
be cached once across all users.

Michael

On Tue, May 29, 2012 at 12:39 PM, Klostermeyer, Michael
 wrote:
> IMO it would be better (from Solr's perspective) to handle the security w/ 
> the application code.  Each query could include a "?fq=userID:12345..." which 
> would limit results to only what that user is allowed to see.
>
> Mike
>

-- 
Appinions, Inc. -- Where Influence Isn’t a Game. http://www.appinions.com


RE: Many Cores with Solr

2012-05-29 Thread Klostermeyer, Michael
IMO it would be better (from Solr's perspective) to handle the security w/ 
the application code.  Each query could include a "?fq=userID:12345..." which 
would limit results to only what that user is allowed to see.
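
A minimal sketch of that idea in Python, assuming the application (not the end user) builds the request and takes the user id from the authenticated session. The `userID` field comes from the example above; the `/select` path and the function name are assumptions about a typical setup.

```python
from urllib.parse import urlencode, parse_qs


def user_scoped_query(user_query: str, user_id: int) -> str:
    """Build a select URL whose results are restricted to one user's documents."""
    params = [
        ("q", user_query),              # what the user typed
        ("fq", f"userID:{user_id}"),    # filter added by the application itself
    ]
    return "/select?" + urlencode(params)


if __name__ == "__main__":
    print(user_scoped_query("quarterly report", 12345))
```

The filter is kept separate from `q`, so the general query stays the same for every user and only the filter differs.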
 
Mike

-Original Message-
From: Mike Douglass [mailto:mikeadougl...@gmail.com] 
Sent: Wednesday, May 23, 2012 4:02 PM
To: solr-user@lucene.apache.org
Subject: Re: Many Cores with Solr

My interest in this is the desire to create one index per user of a system - 
the issue here is privacy - data indexed for one user should not be visible to 
other users.

For this purpose solr will be hidden behind a proxy which steers authenticated 
sessions to the appropriate core.

Does this seem like a valid/feasible approach?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Many-Cores-with-Solr-tp3161889p3985789.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: TF-IDF vector

2012-05-29 Thread Jack Krupansky

"Does the tf-idf vector represents one doc or set of docs?"

IDF is calculated across all docs that contain the term.

TF is calculated for a single document containing the term.

Each term of each doc will have its own tf-idf.
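
A small worked example of the three statements above, written as a Python sketch. It uses the textbook weight tf * log(N / df), which is an assumption for illustration and not Lucene's exact scoring formula; the tiny corpus is made up.

```python
import math
from collections import Counter

# Hypothetical three-document corpus.
docs = {
    "d1": "solr search solr index",
    "d2": "mongodb stores json",
    "d3": "solr stores documents",
}


def term_frequencies(text):
    """TF: counted within a single document."""
    return Counter(text.split())


def document_frequencies(corpus):
    """DF: number of documents in the corpus that contain each term."""
    df = Counter()
    for text in corpus.values():
        df.update(set(text.split()))
    return df


def tf_idf(corpus):
    """One weight per (document, term) pair."""
    n = len(corpus)
    df = document_frequencies(corpus)
    return {
        doc_id: {term: tf * math.log(n / df[term])
                 for term, tf in term_frequencies(text).items()}
        for doc_id, text in corpus.items()
    }


if __name__ == "__main__":
    for doc_id, weights in tf_idf(docs).items():
        print(doc_id, weights)
```

Restricting `corpus` to a subset of documents before calling `document_frequencies` changes the df values, which is the client-side equivalent of the subset question in the original post.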

-- Jack Krupansky

-Original Message- 
From: Allen 
Sent: Tuesday, May 29, 2012 12:11 PM 
To: solr-user@lucene.apache.org 
Subject: TF-IDF vector 


Hi List,

I am curious about the meaning of tf-idf vector after reading this
http://wiki.apache.org/solr/TermVectorComponent.

The tf flag returns the tf vector for just one doc. The df flag
returns the df vector of all the docs in the index.

Does the tf-idf vector represent one doc or a set of docs?

Also, can I specify a subset of docs on which the df vector is calculated,
rather than the entire set of docs?


TF-IDF vector

2012-05-29 Thread Allen
Hi List,

I am curious about the meaning of tf-idf vector after reading this
http://wiki.apache.org/solr/TermVectorComponent.

The tf flag returns the tf vector for just one doc. The df flag
returns the df vector of all the docs in the index.

Does the tf-idf vector represent one doc or a set of docs?

Also, can I specify a subset of docs on which the df vector is calculated,
rather than the entire set of docs?


sort in local params and rows parameter

2012-05-29 Thread jhusman
Hello,

we're having some issues with a Solr query and are unsure if we've
encountered a bug or just don't understand the expected behaviour. Any help
would be appreciated.

The problem is this: we're running a query using the browser that for
debugging purposes looks like this:
q={!sort%3D"eventId%20asc"}a&rows=2

here eventId is a long field in our schema. The sort works fine, but the
query returns 10 results (out of 35), clearly ignoring the rows parameter.
For reference, q=a&rows=2 only returns 2 results (again out of 35).

We can work around this by introducing rows as a local parameter instead:
q={!sort%3D"eventId%20asc"+rows%3D2}a

this only returns 2 results, as expected.
So, it seems that using sort as a local parameter causes solr to ignore the
external rows parameter. This does not seem to happen with other local
parameters, only with "sort" (as far as we can tell).
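
For anyone trying to reproduce this, the two request forms can be generated with a short Python sketch like the one below. It only builds the URL-encoded query strings and does not explain the behaviour; the function name is an assumption.

```python
from urllib.parse import urlencode, parse_qs


def build_query(term, sort, rows, rows_as_local_param):
    """Return the encoded query string for either request form."""
    if rows_as_local_param:
        # rows travels inside the local params block: {!sort="..." rows=N}term
        q = f'{{!sort="{sort}" rows={rows}}}{term}'
        params = {"q": q}
    else:
        # rows is a normal top-level request parameter
        q = f'{{!sort="{sort}"}}{term}'
        params = {"q": q, "rows": rows}
    return urlencode(params)


if __name__ == "__main__":
    print(build_query("a", "eventId asc", 2, rows_as_local_param=False))
    print(build_query("a", "eventId asc", 2, rows_as_local_param=True))
```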

Why is this happening?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/sort-in-local-params-and-rows-parameter-tp3986615.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Is optimize needed on slaves if it replicates from optimized master?

2012-05-29 Thread Walter Underwood
You do not need to use optimize at all.

Solr continually merges segments ("optimizes") as needed.

wunder

On May 29, 2012, at 6:08 AM, sudarshan wrote:

> Hi Walter,
> Thank you. Do you mean that optimize need not be used at all?
> If Solr merges segments (when needed, as you said), are there criteria
> by which Solr decides to do this automatically? If I want search to be faster
> and Solr does not optimize for quite a long time, would that not hurt my
> query processing rate?
> 
> To All,
> I have another question. If I optimize and replicate, the
> first replication would transfer all the segments from the master to the
> slave, irrespective of which segment(s) were modified. After the first
> replication, how is the transfer made: are all segments replicated again,
> or only the modified segments? I believe that after the first replication
> (master and slave in sync), only the modified segments would be transferred,
> just like a non-optimized index transfer. Am I right?
> 
> Regards,
> Sudarshan  
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Is-optimize-needed-on-slaves-if-it-replicates-from-optimized-master-tp3241604p3986597.html
> Sent from the Solr - User mailing list archive at Nabble.com.







Re: suggestions developing a multi-version concurrency control (MVCC) mechanism

2012-05-29 Thread Nicholas Ball

Hmmm interesting, that will definitely work and may be the way to go.
Ideally, I'd rather store the older versions within a field of the newest
if possible.
Can one create a custom field that holds other objects?

Nick

On Mon, 28 May 2012 17:07:06 -0700, Lance Norskog 
wrote:
> You can use the document id and timestamp as a compound unique id.
> Then the search would also sort by id, then by timestamp. Result
> grouping might let you pick the most recent document from each of the
> sorted docs.
> 
> On Mon, May 28, 2012 at 3:15 PM, Nicholas Ball
>  wrote:
>>
>> Hello all,
>>
>> For the first step of the distributed snapshot isolation system I'm
>> developing for Solr, I'm going to need to have a MVCC mechanism as
>> opposed to the single-version concurrency control mechanism already
>> developed (DistributedUpdateProcessor class). I'm trying to find the
>> very best way to develop this into Solr 4.x (trunk) and so any help
>> would be greatly appreciated!
>>
>> Essentially I need to be able to store multiple versions of a document
>> so that when you look up a document with a given timestamp, you're
>> given the correct version (anything the same or older, not fresher).
>> The older versioned documents need to be stored in the index itself to
>> ensure they are durable and can be manipulated as other Solr data can be.
>>
>> One way to do this is to store the old versioned Solr documents within
>> the latest Solr Document, but I'm not sure this is even possible?
>> Alternatively, I could have the latest versioned Document store the
>> unique keys which point to other older documents. The problem with this
>> is that it complicates things having various partial objects which all
>> combine as one logical document.
>>
>> Are there any suggestions as to the best way to develop this feature?
>>
>> Thank you in advance for any help you can spare!
>>
>> Nicholas
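
The compound-key idea from this thread (document id plus timestamp, then pick the newest version that is the same age or older than the requested timestamp) can be sketched in plain Python as below. This is an in-memory toy model and not Solr code; the class and method names are assumptions.

```python
import bisect
from collections import defaultdict


class VersionedStore:
    """Toy multi-version store keyed by (doc_id, timestamp)."""

    def __init__(self):
        self._timestamps = defaultdict(list)  # doc_id -> sorted list of timestamps
        self._versions = {}                   # (doc_id, timestamp) -> document

    def put(self, doc_id, timestamp, document):
        key = (doc_id, timestamp)
        if key not in self._versions:
            # Keep the per-document timestamp list sorted for binary search.
            bisect.insort(self._timestamps[doc_id], timestamp)
        self._versions[key] = document

    def get(self, doc_id, as_of):
        """Return the newest version with timestamp <= as_of, or None."""
        stamps = self._timestamps.get(doc_id, [])
        index = bisect.bisect_right(stamps, as_of)
        if index == 0:
            return None  # no version existed yet at that time
        return self._versions[(doc_id, stamps[index - 1])]


if __name__ == "__main__":
    store = VersionedStore()
    store.put("doc1", 10, {"title": "first"})
    store.put("doc1", 20, {"title": "second"})
    print(store.get("doc1", 15))
```

In an index, the same lookup would correspond to filtering on the document id, restricting the timestamp to the requested upper bound, and taking the most recent hit.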


Re: Solr - 1143

2012-05-29 Thread Jack Krupansky
That issue is marked as a duplicate of SOLR-3134, which has a patch for Solr 
3.5.


https://issues.apache.org/jira/browse/SOLR-3134

-- Jack Krupansky

-Original Message- 
From: Ramprakash Ramamoorthy

Sent: Tuesday, May 29, 2012 3:03 AM
To: solr-user@lucene.apache.org
Subject: Solr - 1143

Dear all,

 A small doubt. I realised I will have to apply the patch
mentioned in Solr Jira 1143 to return partial results when one of my shards
is dead/slow.

 But the patch has no version explicitly specified. I am using
Solr 3.5.0 and can I apply the patch to my installation as such?

--
With Thanks and Regards,
Ramprakash Ramamoorthy,
Engineer Trainee,
Zoho Corporation.
+91 9626975420 



Re: Multicore Issue - Server Restart

2012-05-29 Thread lboutros
Hi Suajtha,

Does each webapp have its own Solr home?

Ludovic.

-
Jouve
France.
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Multicore-Issue-Server-Restart-tp3986516p3986602.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Is optimize needed on slaves if it replicates from optimized master?

2012-05-29 Thread sudarshan
Hi Walter,
 Thank you. Do you mean that optimize need not be used at all?
If Solr merges segments (when needed, as you said), are there criteria
by which Solr decides to do this automatically? If I want search to be faster
and Solr does not optimize for quite a long time, would that not hurt my
query processing rate?

To All,
 I have another question. If I optimize and replicate, the
first replication would transfer all the segments from the master to the
slave, irrespective of which segment(s) were modified. After the first
replication, how is the transfer made: are all segments replicated again,
or only the modified segments? I believe that after the first replication
(master and slave in sync), only the modified segments would be transferred,
just like a non-optimized index transfer. Am I right?
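
One way to picture the question is the toy Python model below. It assumes, without confirmation in this thread, that replication copies only those index files the slave does not already have (compared here by name and size). The file names and sizes are invented.

```python
def files_to_fetch(master_files: dict, slave_files: dict) -> list:
    """Both arguments map file name -> size; return names the slave must download."""
    return sorted(
        name for name, size in master_files.items()
        if slave_files.get(name) != size  # missing on the slave, or different size
    )


if __name__ == "__main__":
    slave = {"_0.cfs": 100}

    # Master after a small incremental commit: one new segment file.
    print(files_to_fetch({"_0.cfs": 100, "_1.cfs": 20}, slave))

    # Master after an optimize: everything was rewritten into a new segment,
    # so under this model the whole index is new to the slave.
    print(files_to_fetch({"_2.cfs": 118}, slave))
```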

Regards,
Sudarshan  

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Is-optimize-needed-on-slaves-if-it-replicates-from-optimized-master-tp3241604p3986597.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Multi-words synonyms matching

2012-05-29 Thread Bernd Fehling
Hello Elisabeth,

my synonyms.txt is like your 2nd example:

naturwald, φυσικό\ δάσος, естествена\ гора, prírodný\ les, naravni\ gozd,
foresta\ naturale, natuurbos, natural\ forest, bosque\ natural,
természetes\ erdő, natūralus\ miškas, prirodna\ šuma, dabiskais\ mežs,
floresta\ natural, naturskov, forêt\ naturelle, naturskog, přírodní\ les,
luonnonmetsä, pădure\ naturală, las\ naturalny, natürlicher\ wald


An example from my system with debugging turned on and searching for 
"naturwald":


  naturwald
  naturwald
  textth:naturwald textth:"φυσικό δάσος" textth:"естествена гора"
  textth:"prírodný les" textth:"naravni gozd" textth:"foresta naturale"
  textth:natuurbos textth:"natural forest" textth:"bosque natural"
  textth:"természetes erdő" textth:"natūralus miškas" textth:"prirodna šuma"
  textth:"dabiskais mežs" textth:"floresta natural" textth:naturskov
  textth:"forêt naturelle" textth:naturskog textth:"přírodní les"
  textth:luonnonmetsä textth:"pădure naturală" textth:"las naturalny"
  textth:"natürlicher wald"
...

As you can see my search for "naturwald" extends to single and multiword 
synonyms e.g. "forêt naturelle"
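
The expansion shown above can be imitated with a few lines of Python. This is only a sketch of the visible effect (a comma-separated synonym line with escaped spaces, where every term expands to the whole group and multiword terms become phrases); it is not how SynonymFilterFactory is implemented, and the function names are assumptions.

```python
def parse_synonym_line(line):
    """Split a synonym line on commas and unescape backslash-escaped spaces."""
    terms = [t.strip().replace("\\ ", " ") for t in line.split(",")]
    return [t for t in terms if t]


def expand_query(field, term, synonym_groups):
    """Expand a single keyword-tokenized term into clauses for its whole group."""
    expanded = [term]
    for group in synonym_groups:
        if term in group:
            expanded = group
            break
    clauses = []
    for t in expanded:
        # Multiword synonyms have to be searched as phrases.
        clauses.append(f'{field}:"{t}"' if " " in t else f"{field}:{t}")
    return " ".join(clauses)


if __name__ == "__main__":
    line = r"naturwald, foresta\ naturale, natuurbos, natural\ forest"
    groups = [parse_synonym_line(line)]
    print(expand_query("textth", "naturwald", groups))
```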


My SynonymFilterFactory has the following settings:

org.apache.solr.analysis.SynonymFilterFactory
{tokenizerFactory=solr.KeywordTokenizerFactory, 
synonyms=synonyms_eurovoc_desc_desc_ufall.txt, expand=true, format=solr, 
ignoreCase=true,
luceneMatchVersion=LUCENE_36}

But as I already mentioned, there is much more work to be done to get it 
running than
just using SynonymFilterFactory.

Regards
Bernd



On 23.05.2012 08:49, elisabeth benoit wrote:
> Hello Bernd,
> 
> Thanks for your advice.
> 
> I have one question: how did you manage to map one word to a multiwords
> synonym???
> 
> I've tried (in synonyms.txt)
> 
> mairie, hotel de ville
> 
> mairie, hotel\ de\ ville
> 
> mairie => mairie, hotel de ville
> 
> mairie => mairie, hotel\ de\ ville
> 
> but nothing prevents mairie from matching with "hotel"...
> 
> The only way I found is to use
> tokenizerFactory="solr.KeywordTokenizerFactory" in my synonyms declaration
> in schema.xml, but then since "mairie" is not alone in my index field, it
> doesn't match.
> 
> 
> best regards,
> Elisabeth
> 
> 
> 
> 
> the only way I found, I schema.xml, is to use
> 
> 
> 
> 2012/5/15 Bernd Fehling 
> 
>> Without reading the whole thread let me say that you should not trust
>> the solr admin analysis. It takes the whole multiword search and runs
>> it all together at once through each analyzer step (factory).
>> But this is not how the real system works. First pitfall, the query parser
>> is also splitting at white space (if not a phrase query). Due to this,
>> a multiword query is sent chunk after chunk through the analyzer and,
>> second pitfall, each chunk runs through the whole analyzer on its own.
>>
>> So if you are dealing with multiword synonyms you have the following
>> problems. Either you turn your query into a phrase so that the whole
>> phrase is analyzed at once and therefore looked up as multiword synonym
>> but phrase queries are not analyzed !!! OR you send your query chunk
>> by chunk through the analyzer but then they are not multiwords anymore
>> and are not found in your synonyms.txt.
>>
>> From my experience I can say that it requires some deep work to get it done
>> but it is possible. I have connected a thesaurus to solr which is doing
>> query time expansion (no need to reindex if the thesaurus changes).
>> The thesaurus holds synonyms and "used for terms" in 24 languages. So
>> it is also some kind of language translation. And naturally the thesaurus
>> translates from single term to multi term synonyms and vice versa.
>>
>> Regards,
>> Bernd
>>
>>
>> On 14.05.2012 13:54, elisabeth benoit wrote:
>>> Just for the record, I'd like to conclude this thread
>>>
>>> First, you were right, there was no behaviour difference between fq and q
>>> parameters.
>>>
>>> I realized that:
>>>
>>> 1) my synonym (hotel de ville) has a stopword in it (de) and since I used
>>> tokenizerFactory="solr.KeywordTokenizerFactory" in my synonyms declaration,
>>> there was no stopword removal in the indexed expression, so when requesting
>>> "hotel de ville", after stopwords removal in query, Solr was comparing
>>> "hotel de ville"
>>> with "hotel ville"
>>>
>>> but my queries never even got to that point since
>>>
>>> 2) I made a mistake using "mairie" alone in the admin interface when
>>> testing my schema. The real field was something like "collectivités
>>> territoriales mairie",
>>> so the synonym "hotel de ville" was not even applied, because of the
>>> tokenizerFactory="solr.KeywordTokenizerFactory" in my synonym definition
>>> not splitting field into words when parsing
>>>
>>> So my problem is not solved, and I'm considering solving it outside of
>>> Solr scope, unless someone else has a clue
>>>
>>> Thanks again,
>>> Elisabeth
>>>
>>>
>>>
>>> 2012/4/25 Erick Erickson 
>>>
 A little farther down the debug info output you'll f

Query elevation / boosting or something else to guarantee document position

2012-05-29 Thread Wenca

Hi all,

I have an index with thousands of products with various fields 
(manufacturer, price, popularity, type, color, ...) and I want to 
guarantee at least one product by a particular manufacturer to be within 
the first 5 results.


The search is done mainly by using filter params and results are ordered 
by function e.g.: "product(price, popularity) asc" or by  "discount desc"


And I need to guarantee that if there is any product matching the given 
filters made by a specific manufacturer, then it will be in 5th 
position at worst, even if its position by the order function is worse.


It seems to me that the Query elevation component is not the right thing 
for me. I don't know the query in advance (or the set of filter 
criteria) and I don't know which specific product will be the best for 
the criteria within the order.


And also I don't think that I can construct a function with such 
requirements to use it directly for ordering the results.


Of course I can make a second query in case there is no desired product 
on the first page of results and put it there, but it requires 
additional request to solr and complicates results processing and 
further pagination.
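
For reference, the second-query fallback described above could look roughly like this Python sketch. It assumes the first page of results and the best hit of a second, manufacturer-filtered query are already available as plain dicts; the function name, the dict keys and the manufacturer name are hypothetical.

```python
def merge_preferred(page, preferred_hit, is_preferred, max_position=5):
    """Ensure a preferred product appears within the first max_position results.

    page: ranked first page of results.
    preferred_hit: best-ranked preferred product from a second query, or None.
    """
    if preferred_hit is None or any(is_preferred(d) for d in page[:max_position]):
        return list(page)  # nothing to do
    # Drop the hit if it already sits lower on the page, then reinsert it.
    merged = [d for d in page if d != preferred_hit]
    merged.insert(min(max_position - 1, len(merged)), preferred_hit)
    return merged[:len(page)]  # keep the page size unchanged


if __name__ == "__main__":
    page = [{"id": i, "manufacturer": "other"} for i in range(1, 11)]
    hit = {"id": 42, "manufacturer": "acme"}
    merged = merge_preferred(page, hit, lambda d: d["manufacturer"] == "acme")
    print([d["id"] for d in merged])
```

As the post notes, this still costs an extra request, and later pages must skip the promoted product to avoid showing it twice.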


Can anybody suggest any solution?

Thanks
Wenca


A few random questions about solr queries.

2012-05-29 Thread santamaria2
*1)* With faceting, how does facet.query perform in comparison to
facet.field? I'm just wondering this as in my use case, I need to facet over
a field -- which would get me the top n facets for that field, but I also
need to show the count for a "selected filter" which might have a relatively
low count so it doesn't appear in the top n returned facets. So the solution
would be to 'ensure' its presence by adding a 'facet.query=cat:val' in
addition to my facet.field=cat.
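
The request described in 1) can be assembled as in the following Python sketch. It only shows how facet.field and several facet.query parameters can be combined in one request; it says nothing about their relative performance. The function name is an assumption.

```python
from urllib.parse import urlencode, parse_qs


def facet_params(query, facet_fields, selected_filters):
    """Facet over fields and also request explicit counts for selected values."""
    params = [("q", query), ("facet", "true")]
    # One facet.field per field: returns the top n values for that field.
    params += [("facet.field", field) for field in facet_fields]
    # One facet.query per selected filter: guarantees a count for that value.
    params += [("facet.query", f'{field}:"{value}"')
               for field, value in selected_filters]
    return urlencode(params)


if __name__ == "__main__":
    print(facet_params("*:*", ["cat", "author"], [("author", "John Smith")]))
```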

I want to do this to quite a few fields.

Related/example-based question:
When I facet over a field, and something gets returned, eg: John Smith (83),
and I also 'ensure' this facet's presence by having it in
facet.query=author:"John Smith", are two different calculations performed?
Or is the facet returned by facet.field also used by facet.query to obtain
the count?



*2) *Is there a performance issue if I have around, say, 20 facet.query
conditions along with 10 facet.fields? 3/10 of those fields have around
100,000 possible values. Remaining have a few hundred each.



*3)* I've rummaged around a bit, looking for info on when to use q vs fq. I
want to clear my doubts for a certain use case.

Where should my date range queries go? In q or fq? The default settings in
my site show results from the past 90 days with buttons to show stuff from
the last month and week as well. But the user is allowed to use a slider to
apply any date range... this is allowed, but it's not /that/ common. 
I definitely use fq for filtering various tags. Choosing a tag is a common
activity.

Should the date range query go in fq? As I mentioned, the default view shows
stuff from the past 90 days. So on each new day, does this invalidate
the entries in the cache? Or are entries stored in the filter cache in some way
that makes it easy to fetch stuff from the past 89 days when a query is
performed the next day?
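
One commonly suggested pattern, which is not an answer from this thread, is to round the date math to the day so that the filter string stays identical for a whole day and a cached filter can be reused. A minimal Python sketch, assuming a hypothetical date field named `pubdate`:

```python
def recent_days_filter(field, days):
    """Build an fq value covering the last `days` days, rounded to day boundaries.

    Using NOW/DAY instead of a raw NOW keeps the string stable during the day;
    an unrounded NOW would produce a different filter on every request.
    """
    return f"{field}:[NOW/DAY-{days}DAYS TO NOW/DAY+1DAY]"


if __name__ == "__main__":
    # Default view (90 days) plus the month and week buttons.
    for days in (90, 30, 7):
        print(recent_days_filter("pubdate", days))
```

With day rounding, the filter would be recomputed once when the day changes rather than on every request.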



--
View this message in context: 
http://lucene.472066.n3.nabble.com/A-few-random-questions-about-solr-queries-tp3986562.html
Sent from the Solr - User mailing list archive at Nabble.com.


[SolrCloud] Replication Factor

2012-05-29 Thread Antoine LE FLOC'H
Hello all,

The page http://wiki.apache.org/solr/NewSolrCloudDesign mentions

"Replication Factor"

It is a feature supported by Katta. Is it actually supported by SolrCloud?

A more general question: Katta had some pretty good features like this one.
Why is Katta not active anymore? Is there a way to get equivalent
functionality with another Solr-based framework today, if it doesn't
exist in SolrCloud yet?

Thank you.


RE: useFastVectorHighlighter doesn't work

2012-05-29 Thread Ahmet Arslan
> So for highlight, stored="true" is
> required in any circumstance, right?

Exactly. http://wiki.apache.org/solr/FieldOptionsByUseCase



RE: useFastVectorHighlighter doesn't work

2012-05-29 Thread ZHANG Liang F
So for highlight, stored="true" is required in any circumstance, right?

 

-Original Message-
From: Ahmet Arslan [mailto:iori...@yahoo.com] 
Sent: 2012年5月29日 16:04
To: solr-user@lucene.apache.org
Subject: RE: useFastVectorHighlighter doesn't work

> The reason why I use useFastVectorHighlighter is because I want to set 
> stored="false", and with more settings like  termVectors="true" 
> termPositions="true"
> termOffsets="true". If stored="true", what is the difference between 
> normal highlight and useFastVectorHighlighter? What is the right 
> situation for using useFastVectorHighlighter?

term*="true" makes sense only for stored="true". FastVectorHighlighter requires 
and makes use of term*="true" for speedup.


RE: useFastVectorHighlighter doesn't work

2012-05-29 Thread Ahmet Arslan
> The reason why I use useFastVectorHighlighter is because I
> want to set stored="false", and with more settings
> like  termVectors="true" termPositions="true"
> termOffsets="true". If stored="true", what is the difference
> between normal highlight and useFastVectorHighlighter? What
> is the right situation for using useFastVectorHighlighter?

term*="true" makes sense only for stored="true". FastVectorHighlighter requires 
and makes use of term*="true" for speedup.
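
Putting the thread together, a highlighting request that switches on the FastVectorHighlighter could be built as in this Python sketch. It assumes the highlighted field is declared with stored="true", termVectors="true", termPositions="true" and termOffsets="true", as discussed above; the field name `content` and the function name are hypothetical.

```python
from urllib.parse import urlencode, parse_qs


def highlight_params(query, field, use_fast_vector_highlighter=True):
    """Build request parameters for highlighting a single field."""
    params = {"q": query, "hl": "true", "hl.fl": field}
    if use_fast_vector_highlighter:
        # Only useful when the field stores term vectors with positions and offsets.
        params["hl.useFastVectorHighlighter"] = "true"
    return urlencode(params)


if __name__ == "__main__":
    print(highlight_params("content:solr", "content"))
```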