Re: Problem with AND clause in multi core search query

2012-05-14 Thread Tommaso Teofili
The latter is supposed to work:
http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=column1
:"A" OR column2:"B"

The first query cannot work: no document in either core0 or core1 has A in
field column1 AND B in field column2; there are only documents with B in
column2 (in core1) OR A in column1 (in core0).

Regards.
Tommaso
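Tommaso's point can be sketched with a toy model (hypothetical doc IDs; the field layout is the one described in the thread: column1 indexed only in core0, column2 only in core1). A distributed search unions the per-shard results, and no single document satisfies both clauses:

```python
# Toy model of the two cores; no document carries both fields.
core0 = [{"id": 1, "column1": "A"}]
core1 = [{"id": 2, "column2": "B"}]
all_docs = core0 + core1  # what the shards parameter searches across

and_hits = [d for d in all_docs
            if d.get("column1") == "A" and d.get("column2") == "B"]
or_hits = [d for d in all_docs
           if d.get("column1") == "A" or d.get("column2") == "B"]

print(len(and_hits), len(or_hits))  # 0 2
```

So AND is supported across shards; it just requires both clauses to match within the same document.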

2012/5/15 ravicv 

> Hi,
>
> I have 2 cores configured in my solr instance.
>
> Both cores are using same schema.
>
> I have indexed column1 in core0 and column2 in core1
>
> My search query is
>
>
> http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=column1
> :"A"
> AND column2:"B"
>
> No result found
>
>
>
> http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=column1
> :"A"
> OR column2:"B"
>
> Is AND supported in multi-core search?
>
> Thanks,
> ravi
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Problem-with-AND-clause-in-multi-core-search-query-tp3983800.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: problem with date searching.

2012-05-14 Thread ayyappan
select/?defType=dismax&q=+ibrahim.hamid+2012-02-02T04:00:52Z&qf=+userid+scanneddate&version=2.2&start=0&rows=50&indent=on&wt=json&&debugQuery=on

--
View this message in context: 
http://lucene.472066.n3.nabble.com/problem-with-date-searching-tp3961761p3983802.html


Re: problem with date searching.

2012-05-14 Thread ayyappan
In fact, I am able to see the "scanneddate" field when I add a query like this:

"responseHeader":{
-
-
  "q":" ibrahim.hamid 2012-02-02T04:00:52Z",
  "qf":" userid scanneddate",
  "wt":"json",
  "defType":"dismax",
  "version":"2.2",
  "rows":"50"}},
  "response":{"numFound":20,"start":0,"docs":[
  {
  ---
--
"scanneddate":["2012-02-02T04:00:52Z"],

},

--
View this message in context: 
http://lucene.472066.n3.nabble.com/problem-with-date-searching-tp3961761p3983801.html


Problem with AND clause in multi core search query

2012-05-14 Thread ravicv
Hi,

I have 2 cores configured in my solr instance.

Both cores are using same schema.

I have indexed column1 in core0 and column2 in core1

My search query is 

http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=column1:"A"
AND column2:"B"

No result found


http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=column1:"A"
OR column2:"B"

Is AND supported in multi-core search?

Thanks,
ravi

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Problem-with-AND-clause-in-multi-core-search-query-tp3983800.html


Re: Getting payloads for matching term in search result

2012-05-14 Thread s . hermann

Hello,


On 05/14/2012 10:33 PM, Mikhail Khludnev wrote:

It's not really out-of-the-box, but not a big deal
http://www.lucidimagination.com/blog/2010/04/18/refresh-getting-started-with-payloads/


Yeah, I know, but I do not know where to put/plug in the code on Solr's
server side. For testing purposes I have already done that with Lucene
directly, so I thought there must be a way with Solr as well.

Regards,

Silvio


document cache

2012-05-14 Thread shinkanze
Hi,

I want to know the internal mechanism of how the document cache works,
specifically its flushing cycle:

i.e. does it get flushed on every commit/replication?

regards 

Rajat Rastogi


--
View this message in context: 
http://lucene.472066.n3.nabble.com/document-cache-tp3983796.html


Re: socket timeout

2012-05-14 Thread Jon Kirton
Here is the stacktrace for the timeout:

2012-05-09 13:08:30,521 [http-8080-62] DEBUG solr.SolrService  -
org.apache.solr.client.solrj.SolrServerException:
java.net.SocketTimeoutException: Read timed out

org.apache.solr.client.solrj.SolrServerException:
java.net.SocketTimeoutException: Read timed out

at 
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:483)

at 
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)

at 
org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:89)

at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:122)


On Mon, May 14, 2012 at 9:29 PM, Jon Kirton  wrote:

> Intermittently, a socket timeout occurs from a search request.  Is there a
> config param I can set in solrconfig.xml to specify socket timeouts for
> version 1.4.1 ?
>


socket timeout

2012-05-14 Thread Jon Kirton
Intermittently, a socket timeout occurs from a search request.  Is there a
config param I can set in solrconfig.xml to specify socket timeouts for
version 1.4.1 ?
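In 1.4.x the read timeout is normally set on the client rather than in solrconfig.xml — with SolrJ, via CommonsHttpSolrServer.setSoTimeout(int) and setConnectionTimeout(int). As a hedged illustration of what the SO (read) timeout in the stack trace means at the socket level, here is a Python sketch with a stand-in server that accepts the connection but never responds:

```python
import socket

# Stand-in server: listens but never writes, so any read must time out.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
port = srv.getsockname()[1]

# The TCP handshake completes against the listen backlog even before
# accept(), so the client connects fine and then blocks on the read.
cli = socket.create_connection(("127.0.0.1", port), timeout=0.5)
cli.settimeout(0.2)  # read (SO) timeout -- analogous to setSoTimeout()

timed_out = False
try:
    cli.recv(1)  # no data ever arrives
except socket.timeout:
    timed_out = True  # this is the "Read timed out" in the stack trace

cli.close()
srv.close()
print(timed_out)
```

The fix is to raise the client's read timeout (or eliminate the slow query), not a solrconfig.xml setting.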


Re: Unexpected query rewrite from WordDelimiterFilterFactory and SynonymFilterFactory

2012-05-14 Thread Jack Krupansky
If it is important enough for you, you could expand multi-word and compound 
word synonyms as a preprocessing step and generate an "OR" expression in the 
query.
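A minimal sketch of that preprocessing step (the synonym map, and the choice to quote multi-word variants as phrases, are illustrative assumptions):

```python
# Hypothetical synonym map; multi-word entries get quoted as phrases.
SYNONYMS = {
    "wi-fi": ["wifi", "wi fi"],
}

def expand_term(term: str) -> str:
    """Rewrite a term into an OR group over its synonyms, quoting phrases."""
    variants = [term] + SYNONYMS.get(term.lower(), [])
    quoted = ['"%s"' % v if " " in v else v for v in variants]
    if len(quoted) == 1:
        return quoted[0]
    return "(" + " OR ".join(quoted) + ")"

print(expand_term("wi-fi"))   # (wi-fi OR wifi OR "wi fi")
print(expand_term("router"))  # router
```

The rewritten string is then sent as the q parameter, sidestepping the analyzer's multi-term synonym handling.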


-- Jack Krupansky

-Original Message- 
From: Chung Wu

Sent: Monday, May 14, 2012 8:25 PM
To: solr-user@lucene.apache.org
Subject: Re: Unexpected query rewrite from WordDelimiterFilterFactory and 
SynonymFilterFactory


Thanks Jack!  It's too bad I can't have catenate and generateParts both set
to "1" at query time.  If I set catenate to "0", then I miss the case where
"wifi" is indexed but "wi-fi" is queried.  If I set generateParts to "0",
then I miss the case where "wi fi" is queried but "wi-fi" is indexed.   I
guess I'll just have to pick one!

Chung

On Mon, May 14, 2012 at 4:50 PM, Jack Krupansky 
wrote:



The extra terms are okay at index time - they simply overlap the base
words and make composite terms more searchable, but you need to have a
separate query analyzer that sets the various catenate options to "0" 
since

the query generator doesn't know what to do with the extra terms. Synonyms
are a little more tricky - the simplest thing is to disable them in the
index analyzer and do them only in the query analyzer - and multi-term
synonyms don't work well, except for replacement synonyms at index time.

See the "text_en_splitting" field type in the example schema.

-- Jack Krupansky

-Original Message- From: Chung Wu
Sent: Monday, May 14, 2012 7:01 PM
To: solr-user@lucene.apache.org
Subject: Unexpected query rewrite from WordDelimiterFilterFactory and
SynonymFilterFactory


Hi all!

I'm using Solr 3.6, and I'm seeing unexpected query rewriting when either
using WordDelimiterFilterFactory with catenateWords="1", or with
SynonymFilterFactory with multi-word synonyms.

For example, in this type where a WordDelimiterFilterFactory is used for
the query analyzer, with catenateWords="1":

  

  
  

  

For the query "wi-fi", the term positions after the
WordDelimiterFilterFactory looks like this:

term   position   startOffset   endOffset   type
wi     1          0             2           word
fi     2          3             5           word
wifi   2          0             5           word
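That term layout can be mimicked with a rough sketch of generateWordParts=1 plus catenateWords=1 (a simplification for illustration, not the actual Lucene filter code):

```python
import re

def word_delimiter(token: str):
    """Rough sketch: split on delimiters and emit the parts, plus a
    catenated token that shares the last part's position (catenateWords=1)
    and spans the whole original token's offsets."""
    parts = [p for p in re.split(r"[^A-Za-z0-9]+", token) if p]
    out = []
    pos = 0
    offset = 0
    for p in parts:
        start = token.index(p, offset)
        pos += 1
        out.append((p, pos, start, start + len(p)))
        offset = start + len(p)
    if len(parts) > 1:
        out.append(("".join(parts), pos, 0, len(token)))
    return out

print(word_delimiter("wi-fi"))
# [('wi', 1, 0, 2), ('fi', 2, 3, 5), ('wifi', 2, 0, 5)]
```

The overlap of "fi" and "wifi" at the same position is what confuses the phrase-query generation described next.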


And looking at debug output, the parsed query looks like this, which is
surprising:

test1:"wi-fi"
test1:"wi-fi"
MultiPhraseQuery(test1:"wi (fi wifi)")
test1:"wi (fi wifi)"


I see similar things happening if I use SynonymFilterFactory with
multi-word synonyms (maybe related to this bug:
https://issues.apache.org/jira/browse/SOLR-3390.
I originally asked about
it here:
http://stackoverflow.com/questions/10218224/in-solr-expanding-multi-word-synonyms-and-term-positions
)

Any ideas on what I'm supposed to do to make this work as expected?

Thanks!

Chung





Re: facet range query question

2012-05-14 Thread andy
Thanks for your reply.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/facet-range-query-question-tp3976026p3983783.html


Re: Unexpected query rewrite from WordDelimiterFilterFactory and SynonymFilterFactory

2012-05-14 Thread Chung Wu
Thanks Jack!  It's too bad I can't have catenate and generateParts both set
to "1" at query time.  If I set catenate to "0", then I miss the case where
"wifi" is indexed but "wi-fi" is queried.  If I set generateParts to "0",
then I miss the case where "wi fi" is queried but "wi-fi" is canceled.   I
guess I'll just have to pick one!

Chung

On Mon, May 14, 2012 at 4:50 PM, Jack Krupansky wrote:

> The extra terms are okay at index time - they simply overlap the base
> words and make composite terms more searchable, but you need to have a
> separate query analyzer that sets the various catenate options to "0" since
> the query generator doesn't know what to do with the extra terms. Synonyms
> are a little more tricky - the simplest thing is to disable them in the
> index analyzer and do them only in the query analyzer - and multi-term
> synonyms don't work well, except for replacement synonyms at index time.
>
> See the "text_en_splitting" field type in the example schema.
>
> -- Jack Krupansky
>
> -Original Message- From: Chung Wu
> Sent: Monday, May 14, 2012 7:01 PM
> To: solr-user@lucene.apache.org
> Subject: Unexpected query rewrite from WordDelimiterFilterFactory and
> SynonymFilterFactory
>
>
> Hi all!
>
> I'm using Solr 3.6, and I'm seeing unexpected query rewriting when either
> using WordDelimiterFilterFactory with catenateWords="1", or with
> SynonymFilterFactory with multi-word synonyms.
>
> For example, in this type where a WordDelimiterFilterFactory is used for
> the query analyzer, with catenateWords="1":
>
> positionIncrementGap="100" autoGeneratePhraseQueries="true">
> 
>   
>generateWordParts="1" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/>
> 
>   
>
> For the query "wi-fi", the term positions after the
> WordDelimiterFilterFactory looks like this:
>
> term   position   startOffset   endOffset   type
> wi     1          0             2           word
> fi     2          3             5           word
> wifi   2          0             5           word
>
>
> And looking at debug output, the parsed query looks like this, which is
> surprising:
>
> test1:"wi-fi"
> test1:"wi-fi"
> MultiPhraseQuery(test1:"wi (fi wifi)")
> test1:"wi (fi wifi)"
>
>
> I see similar things happening if I use SynonymFilterFactory with
> multi-word synonyms (maybe related to this bug:
> https://issues.apache.org/jira/browse/SOLR-3390.
> I originally asked about
> it here:
> http://stackoverflow.com/questions/10218224/in-solr-expanding-multi-word-synonyms-and-term-positions
> )
>
> Any ideas on what I'm supposed to do to make this work as expected?
>
> Thanks!
>
> Chung
>


Re: adding an OR to a fq makes some doc that matched not match anymore

2012-05-14 Thread Jack Krupansky

Don't forget to URL-encode the spaces as "+" or "%20".

Playing around, I noticed that putting parens around the negative term 
changed the results:


I'm not sure whether that is a bug or not.

In any case, try:

/suggest?q=suggest_terms:lap*&fq=type:P&fq=((-type:B)+OR+name:aa)
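A plausible reading of the original numFound=0 (hedged; based on Solr 3.x behavior): a purely negative clause like -type:B matches nothing on its own, and Solr only rewrites pure negatives at the top level of a query, not when nested inside an OR, so (-type:B OR name:aa) effectively reduces to name:aa. A toy set-logic sketch:

```python
# Toy index: the one doc found by the original filter queries.
docs = [{"id": 1, "type": "P", "name": "laptop"}]

# fq=(-type:B): at the top level Solr rewrites the pure negative
# as "everything except type:B", so the doc matches.
fq1 = [d for d in docs if d.get("type") != "B"]

# fq=(-type:B OR name:aa): nested inside the OR, "-type:B" contributes
# no positive matches, so only the name:aa branch can select docs.
fq2 = [d for d in docs if d.get("name") == "aa"]

print(len(fq1), len(fq2))  # 1 0 -- mirrors numFound=1 vs numFound=0
```

Prefixing the OR group with *:* (i.e. (*:* -type:B OR name:aa)) gives the negative clause a positive match set to subtract from.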

-- Jack Krupansky

-Original Message- 
From: jmlucjav

Sent: Monday, May 14, 2012 7:22 PM
To: solr-user@lucene.apache.org
Subject: adding an OR to a fq makes some doc that matched not match anymore

Hi,

I am trying to understand this scenario (Solr3.6):
- /suggest?q=suggest_terms:lap*&fq=type:P&fq=(-type:B)
numFound=1

- I add an OR to the second fq. That fq is already satisfied by the found
doc, so the doc should still match after adding the OR clause, right?
/suggest?q=suggest_terms:lap*&fq=type:P&fq=(-type:B OR name:aa)
numFound=0

Is there a logical explanation?
thanks
xab


--
View this message in context: 
http://lucene.472066.n3.nabble.com/adding-an-OR-to-a-fq-makes-some-doc-that-matched-not-match-anymore-tp3983775.html



Re: Unexpected query rewrite from WordDelimiterFilterFactory and SynonymFilterFactory

2012-05-14 Thread Jack Krupansky
The extra terms are okay at index time - they simply overlap the base words 
and make composite terms more searchable, but you need to have a separate 
query analyzer that sets the various catenate options to "0" since the query 
generator doesn't know what to do with the extra terms. Synonyms are a 
little more tricky - the simplest thing is to disable them in the index 
analyzer and do them only in the query analyzer - and multi-term synonyms 
don't work well, except for replacement synonyms at index time.


See the "text_en_splitting" field type in the example schema.

-- Jack Krupansky

-Original Message- 
From: Chung Wu

Sent: Monday, May 14, 2012 7:01 PM
To: solr-user@lucene.apache.org
Subject: Unexpected query rewrite from WordDelimiterFilterFactory and 
SynonymFilterFactory


Hi all!

I'm using Solr 3.6, and I'm seeing unexpected query rewriting when either
using WordDelimiterFilterFactory with catenateWords="1", or with
SynonymFilterFactory with multi-word synonyms.

For example, in this type where a WordDelimiterFilterFactory is used for
the query analyzer, with catenateWords="1":

   
 
   
   
 
   

For the query "wi-fi", the term positions after the
WordDelimiterFilterFactory looks like this:

term   position   startOffset   endOffset   type
wi     1          0             2           word
fi     2          3             5           word
wifi   2          0             5           word


And looking at debug output, the parsed query looks like this, which is
surprising:

test1:"wi-fi"
test1:"wi-fi"
MultiPhraseQuery(test1:"wi (fi wifi)")
test1:"wi (fi wifi)"

I see similar things happening if I use SynonymFilterFactory with
multi-word synonyms (maybe related to this bug:
https://issues.apache.org/jira/browse/SOLR-3390. I originally asked about
it here:
http://stackoverflow.com/questions/10218224/in-solr-expanding-multi-word-synonyms-and-term-positions
)

Any ideas on what I'm supposed to do to make this work as expected?

Thanks!

Chung 



Re: Boosting on field empty or not

2012-05-14 Thread Jack Krupansky
Reading more closely, I see that there is a workaround: add a space after 
any left parenthesis.


So, try this:

q=chairs+AND+(+regularprice:*^5+OR+(+*:*+-regularprice:*)^5)

Here's an example of my own:

http://localhost:8983/solr/select/?q=the+AND+(+price:*^5+OR+(+*:*+-price:*)^0.5)&defType=edismax&debugQuery=true&qf=text
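If you are scripting the request, Python's standard library can produce the encoded q parameter; note that the space after each left parenthesis (the workaround described above) simply becomes another "+":

```python
from urllib.parse import quote_plus

# Raw q parameter, with a space after each left parenthesis
# (the SOLR-3377 workaround described above).
q = "chairs AND ( regularprice:*^5 OR ( *:* -regularprice:*)^0.5)"

# Keep Lucene query-syntax characters readable; spaces become "+".
encoded = quote_plus(q, safe="*():^-")
url = "http://localhost:8983/solr/select/?defType=edismax&q=" + encoded

print(encoded)  # chairs+AND+(+regularprice:*^5+OR+(+*:*+-regularprice:*)^0.5)
```

Letting a library do the encoding avoids the hand-encoding mistakes discussed earlier in the thread.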

-- Jack Krupansky

-Original Message- 
From: Jack Krupansky

Sent: Monday, May 14, 2012 7:21 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting on field empty or not

Oh well, it looks like my suggestion is running into "SOLR-3377 - eDismax: A
fielded query wrapped by parens is not recognized".

See:
https://issues.apache.org/jira/browse/SOLR-3377

That issue has a patch, but not yet committed.

That explains why it works in the traditional Solr/Lucene query parser, but
not in edismax.

http://localhost:8983/solr/select/?q=the+AND+(price:*^5+OR+(*:*+-price:*)^0.5)

-- Jack Krupansky

-Original Message- 
From: Jack Krupansky

Sent: Monday, May 14, 2012 4:49 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting on field empty or not

It may not matter, but the spaces in the query should be URL-encoded as "+".

I have the query working with Solr query, but it is giving me no docs for
edismax for some reason. But, it does seem to work if I reverse the order of
the query terms to be:

http://localhost:8983/solr/select/?q=*:*+AND+((*:*+-price:*)^0.5+OR+price:*^5)&defType=edismax&debugQuery=true&qf=text

Let me try a couple more things.

-- Jack Krupansky

-Original Message- 
From: Donald Organ

Sent: Monday, May 14, 2012 4:19 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting on field empty or not

&q=chairs AND (regularprice:*^5 OR (*:* -regularprice:*)^0.5)&sort=score
desc


Same effect.


On Mon, May 14, 2012 at 4:12 PM, Jack Krupansky
wrote:


Change the second boost to 0.5 to de-boost docs that are missing the field
value. You had them the same.

-- Jack Krupansky

-Original Message- From: Donald Organ
Sent: Monday, May 14, 2012 4:01 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting on field empty or not

OK, it looks like the query change is working, but it seems to be boosting
everything, even documents that have that field empty.

On Mon, May 14, 2012 at 3:41 PM, Donald Organ wrote:

 OK, I must be missing something:



defType=edismax&start=0&rows=24&facet=true&qf=nameSuggest^10 name^10
codeTXT^2 description^1 brand_search^0 cat_search^10&spellcheck=true&
spellcheck.collate=true&spellcheck.q=chairs&facet.
mincount=1&fl=code,score&q=chairs AND (regularprice:*^5 OR (*:*
-regularprice:*)^5)&sort=score desc


On Mon, May 14, 2012 at 3:36 PM, Jack Krupansky 
wrote:

 "(*:* -regularprice:*)5" should be "(*:* -regularprice:*)^0.5" - the

missing boost operator.

-- Jack Krupansky

-Original Message- From: Donald Organ
Sent: Monday, May 14, 2012 3:31 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting on field empty or not

Still doesn't appear to be working.  Here is the full query string:


defType=edismax&start=0&rows=24&facet=true&qf=nameSuggest^10
name^10
codeTXT^2 description^1 brand_search^0
cat_search^10&spellcheck=true&spellcheck.collate=true&
spellcheck.q=chairs&facet.mincount=1&fl=code,score&q=chairs
AND (regularprice:*^5 OR (*:* -regularprice:*)5)


On Mon, May 14, 2012 at 3:28 PM, Jack Krupansky 
wrote:

 Sorry, make that:



&q=chairs AND (regularprice:*^5 OR (*:* -regularprice:*)^0.5)

I forgot that pure negative queries are broken again, so you need the
*:*
in there.

I noticed that you second boost operator was missing as well.

-- Jack Krupansky

-Original Message- From: Donald Organ
Sent: Monday, May 14, 2012 3:24 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting on field empty or not

OK, I just tried:

&q=chairs AND (regularprice:*^5 OR (-regularprice:*)5)


And that gives me 0 results


On Mon, May 14, 2012 at 2:51 PM, Jack Krupansky <
j...@basetechnology.com
> wrote:

 foo AND (field:*^2.0 OR (-field:*)^0.5)


So, if a doc has anything in the field, it gets boosted, and if the 
doc

does not have anything in the field, de-boost it. Choose the boost
factors
to suit your desired boosting effect.

-- Jack Krupansky

-Original Message- From: Donald Organ
Sent: Monday, May 14, 2012 2:38 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting on field empty or not

OK, maybe I need to describe this a little more.

Basically, I want documents that have a given field populated to score
higher than the documents that don't. So if you search for foo, I want
documents that contain foo, but I want the documents that have the field
populated to have a higher score...

Is there a way to do this?



On Mon, May 14, 2012 at 2:22 PM, Jack Krupansky <
j...@basetechnology.com
> wrote:

 In a query or filter query you can write +field:* to require that a
field

 be populated or +(-field:*) to require that it not be populated



-- Jack Krupansky


adding an OR to a fq makes some doc that matched not match anymore

2012-05-14 Thread jmlucjav
Hi, 

I am trying to understand this scenario (Solr3.6):
- /suggest?q=suggest_terms:lap*&fq=type:P&fq=(-type:B)
numFound=1

- I add an OR to the second fq. That fq is already satisfied by the found
doc, so the doc should still match after adding the OR clause, right?
/suggest?q=suggest_terms:lap*&fq=type:P&fq=(-type:B OR name:aa)
numFound=0

Is there a logical explanation?
thanks 
xab


--
View this message in context: 
http://lucene.472066.n3.nabble.com/adding-an-OR-to-a-fq-makes-some-doc-that-matched-not-match-anymore-tp3983775.html


Re: Boosting on field empty or not

2012-05-14 Thread Jack Krupansky
Oh well, it looks like my suggestion is running into "SOLR-3377 - eDismax: A 
fielded query wrapped by parens is not recognized".


See:
https://issues.apache.org/jira/browse/SOLR-3377

That issue has a patch, but not yet committed.

That explains why it works in the traditional Solr/Lucene query parser, but 
not in edismax.


http://localhost:8983/solr/select/?q=the+AND+(price:*^5+OR+(*:*+-price:*)^0.5)

-- Jack Krupansky

-Original Message- 
From: Jack Krupansky

Sent: Monday, May 14, 2012 4:49 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting on field empty or not

It may not matter, but the spaces in the query should be URL-encoded as "+".

I have the query working with Solr query, but it is giving me no docs for
edismax for some reason. But, it does seem to work if I reverse the order of
the query terms to be:

http://localhost:8983/solr/select/?q=*:*+AND+((*:*+-price:*)^0.5+OR+price:*^5)&defType=edismax&debugQuery=true&qf=text

Let me try a couple more things.

-- Jack Krupansky

-Original Message- 
From: Donald Organ

Sent: Monday, May 14, 2012 4:19 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting on field empty or not

&q=chairs AND (regularprice:*^5 OR (*:* -regularprice:*)^0.5)&sort=score
desc


Same effect.


On Mon, May 14, 2012 at 4:12 PM, Jack Krupansky
wrote:


Change the second boost to 0.5 to de-boost docs that are missing the field
value. You had them the same.

-- Jack Krupansky

-Original Message- From: Donald Organ
Sent: Monday, May 14, 2012 4:01 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting on field empty or not

OK, it looks like the query change is working, but it seems to be boosting
everything, even documents that have that field empty.

On Mon, May 14, 2012 at 3:41 PM, Donald Organ wrote:

 OK, I must be missing something:



defType=edismax&start=0&rows=24&facet=true&qf=nameSuggest^10 name^10
codeTXT^2 description^1 brand_search^0 cat_search^10&spellcheck=true&
spellcheck.collate=true&spellcheck.q=chairs&facet.
mincount=1&fl=code,score&q=chairs AND (regularprice:*^5 OR (*:*
-regularprice:*)^5)&sort=score desc


On Mon, May 14, 2012 at 3:36 PM, Jack Krupansky 
wrote:

 "(*:* -regularprice:*)5" should be "(*:* -regularprice:*)^0.5" - the

missing boost operator.

-- Jack Krupansky

-Original Message- From: Donald Organ
Sent: Monday, May 14, 2012 3:31 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting on field empty or not

Still doesn't appear to be working.  Here is the full query string:


defType=edismax&start=0&rows=24&facet=true&qf=nameSuggest^10
name^10
codeTXT^2 description^1 brand_search^0
cat_search^10&spellcheck=true&spellcheck.collate=true&
spellcheck.q=chairs&facet.mincount=1&fl=code,score&q=chairs
AND (regularprice:*^5 OR (*:* -regularprice:*)5)


On Mon, May 14, 2012 at 3:28 PM, Jack Krupansky 
wrote:

 Sorry, make that:



&q=chairs AND (regularprice:*^5 OR (*:* -regularprice:*)^0.5)

I forgot that pure negative queries are broken again, so you need the
*:*
in there.

I noticed that you second boost operator was missing as well.

-- Jack Krupansky

-Original Message- From: Donald Organ
Sent: Monday, May 14, 2012 3:24 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting on field empty or not

OK, I just tried:

&q=chairs AND (regularprice:*^5 OR (-regularprice:*)5)


And that gives me 0 results


On Mon, May 14, 2012 at 2:51 PM, Jack Krupansky <
j...@basetechnology.com
> wrote:

 foo AND (field:*^2.0 OR (-field:*)^0.5)


So, if a doc has anything in the field, it gets boosted, and if the 
doc

does not have anything in the field, de-boost it. Choose the boost
factors
to suit your desired boosting effect.

-- Jack Krupansky

-Original Message- From: Donald Organ
Sent: Monday, May 14, 2012 2:38 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting on field empty or not

OK, maybe I need to describe this a little more.

Basically, I want documents that have a given field populated to score
higher than the documents that don't. So if you search for foo, I want
documents that contain foo, but I want the documents that have the field
populated to have a higher score...

Is there a way to do this?



On Mon, May 14, 2012 at 2:22 PM, Jack Krupansky <
j...@basetechnology.com
> wrote:

 In a query or filter query you can write +field:* to require that a
field

 be populated or +(-field:*) to require that it not be populated



-- Jack Krupansky

-Original Message- From: Donald Organ
Sent: Monday, May 14, 2012 2:10 PM
To: solr-user
Subject: Boosting on field empty or not

Is there a way to boost a document based on whether the field is 
empty

or
not.  I am looking to boost documents that have a specific field
populated.

















Re: Index an xml field that is saved in a database

2012-05-14 Thread Jack Krupansky
Granted, a proper XML parse of the input field is better. I didn't see an 
obvious solution at first, but I did run across this:


"Use a fielddatasource for reading field from database and then use 
xpathentityprocessor. Field datasource will give you the stream that is 
needed by xpathentity processor."


See:
http://osdir.com/ml/solr-user.lucene.apache.org/2011-02/msg00769.html
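If extracting the value up front (outside DIH) is acceptable, a proper XML parser makes it a few lines; the column value and element name below are invented for illustration:

```python
import xml.etree.ElementTree as ET

# Hypothetical value of the database column holding an XML document.
column_value = "<product><name>Thinkpad</name><sku>X220</sku></product>"

# Parse the string and pull out just the one element we want to index.
root = ET.fromstring(column_value)
name = root.findtext("name")

print(name)  # Thinkpad
```

Unlike a regex, this stays correct when attributes, nesting, or entity escapes appear in the stored XML.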

-- Jack Krupansky

-Original Message- 
From: Michael Della Bitta

Sent: Monday, May 14, 2012 5:51 PM
To: solr-user@lucene.apache.org
Subject: Re: Index an xml field that is saved in a database

That answer may serve the OP well, but I can't help but propagate this
link when the idea of parsing XML with regex comes up:

http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454

:)

Michael


On Mon, 2012-05-14 at 17:03 -0400, Jack Krupansky wrote:

A regex transformer should do the trick:

http://wiki.apache.org/solr/DataImportHandler#RegexTransformer

-- Jack Krupansky

-Original Message- 
From: Ramo Karahasan

Sent: Monday, May 14, 2012 4:54 PM
To: solr-user@lucene.apache.org
Subject: Index an xml field that is saved in a database

Hi,



I have an XML document saved in a column of a database table. Is it 
possible

to index just one part of that xml string, e.g. . with
the DIH handler or is it necessary to extract this information previously?



Thanks,

Ramo





Re: Index an xml field that is saved in a database

2012-05-14 Thread Michael Della Bitta
That answer may serve the OP well, but I can't help but propagate this
link when the idea of parsing XML with regex comes up:

http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454

:)

Michael 


On Mon, 2012-05-14 at 17:03 -0400, Jack Krupansky wrote:
> A regex transformer should do the trick:
> 
> http://wiki.apache.org/solr/DataImportHandler#RegexTransformer
> 
> -- Jack Krupansky
> 
> -Original Message- 
> From: Ramo Karahasan
> Sent: Monday, May 14, 2012 4:54 PM
> To: solr-user@lucene.apache.org
> Subject: Index an xml field that is saved in a database
> 
> Hi,
> 
> 
> 
> I have an XML document saved in a column of a database table. Is it possible
> to index just one part of that xml string, e.g. . with
> the DIH handler or is it necessary to extract this information previously?
> 
> 
> 
> Thanks,
> 
> Ramo
> 




- Solr 4.0 - How do I enable JSP support ? ...

2012-05-14 Thread Naga Vijayapuram
Hello,

How do I enable JSP support in Solr 4.0 ?

Thanks
Naga


Re: Urgent! Highlighting not working as expected

2012-05-14 Thread Jack Krupansky
The highlighting will be based only on the fields in which matching 
occurred. Are you using edismax and with multiple fields in qf, or the 
traditional Solr (Lucene) query parser that only matches in the default 
field or an explicit field?


-- Jack Krupansky

-Original Message- 
From: TJ Tong

Sent: Monday, May 14, 2012 4:44 PM
To: solr-user@lucene.apache.org
Subject: Urgent! Highlighting not working as expected

Dear all,

I queried Solr (3.5) with this: q=text:"G-Money"&hl=true&hl.fl=*, where text
is a "text" field and all the other fields were copied to it. I got three
records returned, however, only one field (also "text" field) was
highlighted:



G-MONEY HETZEL






But the other two also have matched fields (that is why they are returned);
those are "string" fields, and they were not highlighted. Also, in the same
record "cr_149107", the "string" field "cr_firstname" exactly matches the
string "G-Money", but it was not highlighted. Yet if I search on this field:
q=cr_firstname:"G-Money"&hl=true&hl.fl=*, it is highlighted. Any idea
what I should do to get both "text" and "string" fields highlighted?

Thanks in advance!

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Urgent-Highlighting-not-working-as-expected-tp3983755.html



Re: Solr Shards multi core slower then single big core

2012-05-14 Thread Otis Gospodnetic
Aha!  See, Kuli, I wasn't making it up! ;)

Otis 

Performance Monitoring for Solr / ElasticSearch / HBase - 
http://sematext.com/spm 



>
> From: Robert Stewart 
>To: solr-user@lucene.apache.org 
>Sent: Monday, May 14, 2012 11:23 AM
>Subject: Re: Solr Shards multi core slower then single big core
> 
>We used to have one large index - then moved to 10 shards (7 million docs 
>each) - parallel search across all shards, and we get better performance that 
>way.  We use a 40 core box with 128GB ram.  We do a lot of faceting so maybe 
>that is why since facets can be built in parallel on different threads/cores.  
>We also have indexes on fast local disks (6 15K RPM disks using raid stripes).
>
>
>On May 14, 2012, at 10:42 AM, Michael Della Bitta wrote:
>
>> Hi, all,
>> 
>> I've been running into murmurs about this idea elsewhere:
>> 
>> http://stackoverflow.com/questions/8698762/run-multiple-big-solr-shard-instances-on-one-physical-machine
>> 
>> http://java.dzone.com/articles/optimizing-solr-or-how-7x-your?mz=33057-solr_lucene
>> 
>> Michael
>> 
>> On Mon, May 14, 2012 at 10:29 AM, Otis Gospodnetic
>>  wrote:
>>> Hi Kuli,
>>> 
>>> As long as there are enough CPUs with spare cycles and disk IO is not a 
>>> bottleneck, this works faster.  This was 12+ months ago.
>>> 
>>> Otis
>>> 
>>> Performance Monitoring for Solr / ElasticSearch / HBase - 
>>> http://sematext.com/spm
>>> 
>>> 
>>> 
 
 From: Michael Kuhlmann 
 To: solr-user@lucene.apache.org
 Sent: Monday, May 14, 2012 10:21 AM
 Subject: Re: Solr Shards multi core slower then single big core
 
 Am 14.05.2012 16:18, schrieb Otis Gospodnetic:
> Hi Kuli,
> 
> In a client engagement, I did see this (N shards on 1 beefy box with lots 
> of RAM and CPU cores) be faster than 1 big index.
> 
 
 I want to believe you, but I also want to understand. Can you explain
 why? And did this only happen for single requests, or even under heavy 
 load?
 
 Greetings,
 Kuli
 
 
 
>
>
>
>

Re: Update JSON not working for me

2012-05-14 Thread rjain15
I haven't modified any schema or config. I am going to do it all over... a
clean install.

I tried with 3.6 and I have the same issue.

I am going to try with 4.x one more time; it's been painful. I am so excited
to use Solr for my project, and it seems I am stuck on the basics.

Thanks
Rajesh


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Update-JSON-not-working-for-me-tp3983709p3983759.html


Re: Index an xml field that is saved in a database

2012-05-14 Thread Jack Krupansky

A regex transformer should do the trick:

http://wiki.apache.org/solr/DataImportHandler#RegexTransformer

-- Jack Krupansky

-Original Message- 
From: Ramo Karahasan

Sent: Monday, May 14, 2012 4:54 PM
To: solr-user@lucene.apache.org
Subject: Index an xml field that is saved in a database

Hi,



I have an XML document saved in a column of a database table. Is it possible
to index just one part of that xml string, e.g. . with
the DIH handler or is it necessary to extract this information previously?



Thanks,

Ramo



Index an xml field that is saved in a database

2012-05-14 Thread Ramo Karahasan
Hi,

 

I have an XML document saved in a column of a database table. Is it possible
to index just one part of that xml string, e.g. . with
the DIH handler or is it necessary to extract this information previously?

 

Thanks,

Ramo



Urgent! Highlighting not working as expected

2012-05-14 Thread TJ Tong
Dear all,

I queried Solr (3.5) with this: q=text:"G-Money"&hl=true&hl.fl=*, where text
is a "text" field and all the other fields were copied to it. I got three
records returned, however, only one field (also "text" field) was
highlighted: 



<lst name="cr_149107">
  <arr name="text">
    <str><em>G-MONEY</em> HETZEL</str>
  </arr>
</lst>

But the other two also have matched fields (that is why they are returned);
however, they are "string" fields and were not highlighted. Also, in the same
record "cr_149107", the "string" field "cr_firstname" contains the exactly
matching string "G-Money", but it was not highlighted. Yet if I search on that
field: q=cr_firstname:"G-Money"&hl=true&hl.fl=*, it is highlighted. Any idea
what I should do to get both "text" and "string" fields highlighted? 

Thanks in advance!

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Urgent-Highlighting-not-working-as-expected-tp3983755.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Boosting on field empty or not

2012-05-14 Thread Jack Krupansky

It may not matter, but the spaces in the query should be URL-encoded as "+".

I have the query working with the standard Solr query parser, but for some 
reason it is giving me no docs with edismax. It does, however, seem to work if 
I reverse the order of the query terms to be:


http://localhost:8983/solr/select/?q=*:*+AND+((*:*+-price:*)^0.5+OR+price:*^5)&defType=edismax&debugQuery=true&qf=text

Let me try a couple more things.

-- Jack Krupansky

-Original Message- 
From: Donald Organ

Sent: Monday, May 14, 2012 4:19 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting on field empty or not

&q=chairs AND (regularprice:*^5 OR (*:* -regularprice:*)^0.5)&sort=score 
desc



Same effect.


On Mon, May 14, 2012 at 4:12 PM, Jack Krupansky 
wrote:



Change the second boost to 0.5 to de-boost doc that are missing the field
value. You had them the same.

-- Jack Krupansky

-Original Message- From: Donald Organ
Sent: Monday, May 14, 2012 4:01 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting on field empty or not

OK it looks like the query change is working but it looks like it boosting
everything even documents that have that field empty

On Mon, May 14, 2012 at 3:41 PM, Donald Organ wrote:

 OK i must be missing something:



defType=edismax&start=0&rows=24&facet=true&qf=nameSuggest^10 name^10
codeTXT^2 description^1 brand_search^0
cat_search^10&spellcheck=true&spellcheck.collate=true&
spellcheck.q=chairs&facet.mincount=1&fl=code,score&q=chairs AND
(regularprice:*^5 OR (*:* -regularprice:*)^5)&sort=score desc


On Mon, May 14, 2012 at 3:36 PM, Jack Krupansky 
wrote:

 "(*:* -regularprice:*)5" should be "(*:* -regularprice:*)^0.5" - the

missing boost operator.

-- Jack Krupansky

-Original Message- From: Donald Organ
Sent: Monday, May 14, 2012 3:31 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting on field empty or not

Still doesnt appear to be working.  Here is the full Query string:


defType=edismax&start=0&rows=24&facet=true&qf=nameSuggest^10
name^10
codeTXT^2 description^1 brand_search^0
cat_search^10&spellcheck=true&spellcheck.collate=true&
spellcheck.q=chairs&facet.mincount=1&fl=code,score&q=chairs
AND (regularprice:*^5 OR (*:* -regularprice:*)5)


On Mon, May 14, 2012 at 3:28 PM, Jack Krupansky 
wrote:

 Sorry, make that:



&q=chairs AND (regularprice:*^5 OR (*:* -regularprice:*)^0.5)

I forgot that pure negative queries are broken again, so you need the
*:*
in there.

I noticed that your second boost operator was missing as well.

-- Jack Krupansky

-Original Message- From: Donald Organ
Sent: Monday, May 14, 2012 3:24 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting on field empty or not

OK i just tried:

&q=chairs AND (regularprice:*^5 OR (-regularprice:*)5)


And that gives me 0 results


On Mon, May 14, 2012 at 2:51 PM, Jack Krupansky <
j...@basetechnology.com
>
wrote:

 foo AND (field:*^2.0 OR (-field:*)^0.5)


So, if a doc has anything in the field, it gets boosted, and if the 
doc

does not have anything in the field, de-boost it. Choose the boost
factors
to suit your desired boosting effect.

-- Jack Krupansky

-Original Message- From: Donald Organ
Sent: Monday, May 14, 2012 2:38 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting on field empty or not

OK maybe i need to describe this a little more.

Basically I want documents that have a given field populated to have a
higher score than the documents that dont.  So if you search for foo I
want
documents that contain foo, but i want the documents that have field a
populated to have a higher score...

Is there a way to do this?



On Mon, May 14, 2012 at 2:22 PM, Jack Krupansky <
j...@basetechnology.com
>*
*wrote:

 In a query or filter query you can write +field:* to require that a
field

 be populated or +(-field:*) to require that it not be populated



-- Jack Krupansky

-Original Message- From: Donald Organ
Sent: Monday, May 14, 2012 2:10 PM
To: solr-user
Subject: Boosting on field empty or not

Is there a way to boost a document based on whether the field is 
empty

or
not.  I am looking to boost documents that have a specific field
populated.
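
Pulling Jack's corrections from this thread together, the complete pattern
(boost factors illustrative) is:

```
q=chairs AND (regularprice:*^5 OR (*:* -regularprice:*)^0.5)&defType=edismax
```

Documents with a regularprice value match the first clause and get the ^5
boost; documents without one match the *:* minus clause and get ^0.5, so
populated documents rank higher.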



















Re: Getting payloads for matching term in search result

2012-05-14 Thread Mikhail Khludnev
It's not really out-of-the-box, but not a big deal
http://www.lucidimagination.com/blog/2010/04/18/refresh-getting-started-with-payloads/

On Mon, May 14, 2012 at 5:13 PM,  wrote:

> Good day
>
> currently I have a field defined as can be seen below:
>
> <fieldType name="..." class="solr.TextField">
>   <analyzer>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.DelimitedPayloadTokenFilterFactory"
>             delimiter="|" encoder="identity" />
>   </analyzer>
> </fieldType>
>
> Basically the content for that field has the following form:
>
>  "Wiedersehn|x1062y1755 macht|x1340y1758 Freude|x1502y1758"
>
> where the stuff after the pipe is the payload data (some coordinates).
> What I want is to get that payload data at query time.
> E.g. I search for "macht" and in the result document from solr there will
> be the payload data "x1340y1758".
>
> Is there a way to do this out of the box with Solr? I have done this once in
> plain Lucene with TermPositions, so I know it should be possible to adapt it
> to Solr.
>
> Silvio
>



-- 
Sincerely yours
Mikhail Khludnev
Tech Lead
Grid Dynamics
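
Client-side (outside Solr), the delimited form in Silvio's example maps each
token to its payload. A minimal Python sketch of that mapping, just to
illustrate the data shape — not Solr's internal payload handling:

```python
def parse_delimited_payloads(text, delimiter="|"):
    """Split whitespace-separated "token|payload" chunks into
    (token, payload) pairs, mirroring what a delimited-payload
    token filter sees at index time."""
    pairs = []
    for chunk in text.split():
        token, _, payload = chunk.partition(delimiter)
        pairs.append((token, payload))
    return pairs

field_value = "Wiedersehn|x1062y1755 macht|x1340y1758 Freude|x1502y1758"
payloads = dict(parse_delimited_payloads(field_value))
print(payloads["macht"])  # -> x1340y1758
```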


 


Re: Boosting on field empty or not

2012-05-14 Thread Donald Organ
OK, that's giving me only documents that have the field populated.


On Mon, May 14, 2012 at 4:28 PM, Donald Organ wrote:

> OK i think i found the proper way to do what i was trying to do:
>
> &q=chairs AND (regularprice:[0 TO *]^5)
>
>
> On Mon, May 14, 2012 at 4:25 PM, Donald Organ wrote:
>
>> I've even tried upping the boost to 10 and the de-boost to 1but yet
>> its still applying the boost to all the documents returned.  So it matter
>> if this is a money field?
>>
>> On Mon, May 14, 2012 at 4:19 PM, Donald Organ wrote:
>>
>>> &q=chairs AND (regularprice:*^5 OR (*:* -regularprice:*)^0.5)&sort=score 
>>> desc
>>>
>>>
>>> Same effect.
>>>
>>>
>>> On Mon, May 14, 2012 at 4:12 PM, Jack Krupansky >> > wrote:
>>>
 Change the second boost to 0.5 to de-boost doc that are missing the
 field value. You had them the same.

 -- Jack Krupansky

 -Original Message- From: Donald Organ
 Sent: Monday, May 14, 2012 4:01 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Boosting on field empty or not

 OK it looks like the query change is working but it looks like it
 boosting
 everything even documents that have that field empty

 On Mon, May 14, 2012 at 3:41 PM, Donald Organ >>> >wrote:

  OK i must be missing something:
>
>
> defType=edismax&start=0&rows=**24&facet=true&qf=nameSuggest^**10
> name^10 codeTXT^2 description^1 brand_search^0
> cat_search^10&spellcheck=true&**spellcheck.collate=true&**
> spellcheck.q=chairs&facet.**mincount=1&fl=code,score&q=**chairs AND
> (regularprice:*^5 OR (*:* -regularprice:*)^5)&sort=score desc
>
>
> On Mon, May 14, 2012 at 3:36 PM, Jack Krupansky <
> j...@basetechnology.com>**wrote:
>
>  "(*:* -regularprice:*)5" should be "(*:* -regularprice:*)^0.5" - the
>> missing boost operator.
>>
>> -- Jack Krupansky
>>
>> -Original Message- From: Donald Organ
>> Sent: Monday, May 14, 2012 3:31 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Boosting on field empty or not
>>
>> Still doesnt appear to be working.  Here is the full Query string:
>>
>>
>> defType=edismax&start=0&rows=24&facet=true&qf=nameSuggest^10
>> name^10
>> codeTXT^2 description^1 brand_search^0
>> cat_search^10&spellcheck=true&spellcheck.collate=true&**
>> spellcheck.q=chairs&facet.mincount=1&fl=code,score&q=chairs
>> AND (regularprice:*^5 OR (*:* -regularprice:*)5)
>>
>>
>> On Mon, May 14, 2012 at 3:28 PM, Jack Krupansky <
>> j...@basetechnology.com>
>> **wrote:
>>
>>  Sorry, make that:
>>
>>>
>>> &q=chairs AND (regularprice:*^5 OR (*:* -regularprice:*)^0.5)
>>>
>>> I forgot that pure negative queries are broken again, so you need
>>> the *:*
>>> in there.
>>>
>>> I noticed that you second boost operator was missing as well.
>>>
>>> -- Jack Krupansky
>>>
>>> -Original Message- From: Donald Organ
>>> Sent: Monday, May 14, 2012 3:24 PM
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Boosting on field empty or not
>>>
>>> OK i just tried:
>>>
>>> &q=chairs AND (regularprice:*^5 OR (-regularprice:*)5)
>>>
>>>
>>> And that gives me 0 results
>>>
>>>
>>> On Mon, May 14, 2012 at 2:51 PM, Jack Krupansky <
>>> j...@basetechnology.com
>>> >*
>>> *wrote:
>>>
>>>  foo AND (field:*^2.0 OR (-field:*)^0.5)
>>>
>>>
 So, if a doc has anything in the field, it gets boosted, and if the
 doc
 does not have anything in the field, de-boost it. Choose the boost
 factors
 to suit your desired boosting effect.

 -- Jack Krupansky

 -Original Message- From: Donald Organ
 Sent: Monday, May 14, 2012 2:38 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Boosting on field empty or not

 OK maybe i need to describe this a little more.

 Basically I want documents that have a given field populated to
 have a
 higher score than the documents that dont.  So if you search for
 foo I
 want
 documents that contain foo, but i want the documents that have
 field a
 populated to have a higher score...

 Is there a way to do this?



 On Mon, May 14, 2012 at 2:22 PM, Jack Krupansky <
 j...@basetechnology.com
 >*
 *wrote:

  In a query or filter query you can write +field:* to require that a
 field

  be populated or +(-field:*) to require that it not be populated

>
> -- Jack Krupansky
>
> -Original Message- From: Donald Organ
> Sent: Monday, May 14, 2012 2:10 PM
> To: solr-user
> Subject:

Re: Boosting on field empty or not

2012-05-14 Thread Donald Organ
OK, I think I found the proper way to do what I was trying to do:

&q=chairs AND (regularprice:[0 TO *]^5)


On Mon, May 14, 2012 at 4:25 PM, Donald Organ wrote:

> I've even tried upping the boost to 10 and the de-boost to 1but yet
> its still applying the boost to all the documents returned.  So it matter
> if this is a money field?
>
> On Mon, May 14, 2012 at 4:19 PM, Donald Organ wrote:
>
>> &q=chairs AND (regularprice:*^5 OR (*:* -regularprice:*)^0.5)&sort=score desc
>>
>>
>> Same effect.
>>
>>
>> On Mon, May 14, 2012 at 4:12 PM, Jack Krupansky 
>> wrote:
>>
>>> Change the second boost to 0.5 to de-boost doc that are missing the
>>> field value. You had them the same.
>>>
>>> -- Jack Krupansky
>>>
>>> -Original Message- From: Donald Organ
>>> Sent: Monday, May 14, 2012 4:01 PM
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Boosting on field empty or not
>>>
>>> OK it looks like the query change is working but it looks like it
>>> boosting
>>> everything even documents that have that field empty
>>>
>>> On Mon, May 14, 2012 at 3:41 PM, Donald Organ >> >wrote:
>>>
>>>  OK i must be missing something:


 defType=edismax&start=0&rows=**24&facet=true&qf=nameSuggest^**10
 name^10 codeTXT^2 description^1 brand_search^0
 cat_search^10&spellcheck=true&**spellcheck.collate=true&**
 spellcheck.q=chairs&facet.**mincount=1&fl=code,score&q=**chairs AND
 (regularprice:*^5 OR (*:* -regularprice:*)^5)&sort=score desc


 On Mon, May 14, 2012 at 3:36 PM, Jack Krupansky <
 j...@basetechnology.com>**wrote:

  "(*:* -regularprice:*)5" should be "(*:* -regularprice:*)^0.5" - the
> missing boost operator.
>
> -- Jack Krupansky
>
> -Original Message- From: Donald Organ
> Sent: Monday, May 14, 2012 3:31 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Boosting on field empty or not
>
> Still doesnt appear to be working.  Here is the full Query string:
>
>
> defType=edismax&start=0&rows=24&facet=true&qf=nameSuggest^10
> name^10
> codeTXT^2 description^1 brand_search^0
> cat_search^10&spellcheck=true&spellcheck.collate=true&**
> spellcheck.q=chairs&facet.mincount=1&fl=code,score&q=chairs
> AND (regularprice:*^5 OR (*:* -regularprice:*)5)
>
>
> On Mon, May 14, 2012 at 3:28 PM, Jack Krupansky <
> j...@basetechnology.com>
> **wrote:
>
>  Sorry, make that:
>
>>
>> &q=chairs AND (regularprice:*^5 OR (*:* -regularprice:*)^0.5)
>>
>> I forgot that pure negative queries are broken again, so you need the
>> *:*
>> in there.
>>
>> I noticed that you second boost operator was missing as well.
>>
>> -- Jack Krupansky
>>
>> -Original Message- From: Donald Organ
>> Sent: Monday, May 14, 2012 3:24 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Boosting on field empty or not
>>
>> OK i just tried:
>>
>> &q=chairs AND (regularprice:*^5 OR (-regularprice:*)5)
>>
>>
>> And that gives me 0 results
>>
>>
>> On Mon, May 14, 2012 at 2:51 PM, Jack Krupansky <
>> j...@basetechnology.com
>> >*
>> *wrote:
>>
>>  foo AND (field:*^2.0 OR (-field:*)^0.5)
>>
>>
>>> So, if a doc has anything in the field, it gets boosted, and if the
>>> doc
>>> does not have anything in the field, de-boost it. Choose the boost
>>> factors
>>> to suit your desired boosting effect.
>>>
>>> -- Jack Krupansky
>>>
>>> -Original Message- From: Donald Organ
>>> Sent: Monday, May 14, 2012 2:38 PM
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Boosting on field empty or not
>>>
>>> OK maybe i need to describe this a little more.
>>>
>>> Basically I want documents that have a given field populated to have
>>> a
>>> higher score than the documents that dont.  So if you search for foo
>>> I
>>> want
>>> documents that contain foo, but i want the documents that have field
>>> a
>>> populated to have a higher score...
>>>
>>> Is there a way to do this?
>>>
>>>
>>>
>>> On Mon, May 14, 2012 at 2:22 PM, Jack Krupansky <
>>> j...@basetechnology.com
>>> >*
>>> *wrote:
>>>
>>>  In a query or filter query you can write +field:* to require that a
>>> field
>>>
>>>  be populated or +(-field:*) to require that it not be populated
>>>

 -- Jack Krupansky

 -Original Message- From: Donald Organ
 Sent: Monday, May 14, 2012 2:10 PM
 To: solr-user
 Subject: Boosting on field empty or not

 Is there a way to boost a document based on whether the field is
 empty
 or
 not.  I am looking to boost documents that have a specific field
 populated.





>>>

Re: Boosting on field empty or not

2012-05-14 Thread Donald Organ
I've even tried upping the boost to 10 and the de-boost to 1, but it's still
applying the boost to all the documents returned. Does it matter that this is
a money field?

On Mon, May 14, 2012 at 4:19 PM, Donald Organ wrote:

> &q=chairs AND (regularprice:*^5 OR (*:* -regularprice:*)^0.5)&sort=score desc
>
>
> Same effect.
>
>
> On Mon, May 14, 2012 at 4:12 PM, Jack Krupansky 
> wrote:
>
>> Change the second boost to 0.5 to de-boost doc that are missing the field
>> value. You had them the same.
>>
>> -- Jack Krupansky
>>
>> -Original Message- From: Donald Organ
>> Sent: Monday, May 14, 2012 4:01 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Boosting on field empty or not
>>
>> OK it looks like the query change is working but it looks like it boosting
>> everything even documents that have that field empty
>>
>> On Mon, May 14, 2012 at 3:41 PM, Donald Organ > >wrote:
>>
>>  OK i must be missing something:
>>>
>>>
>>> defType=edismax&start=0&rows=**24&facet=true&qf=nameSuggest^**10
>>> name^10 codeTXT^2 description^1 brand_search^0
>>> cat_search^10&spellcheck=true&**spellcheck.collate=true&**
>>> spellcheck.q=chairs&facet.**mincount=1&fl=code,score&q=**chairs AND
>>> (regularprice:*^5 OR (*:* -regularprice:*)^5)&sort=score desc
>>>
>>>
>>> On Mon, May 14, 2012 at 3:36 PM, Jack Krupansky >> >**wrote:
>>>
>>>  "(*:* -regularprice:*)5" should be "(*:* -regularprice:*)^0.5" - the
 missing boost operator.

 -- Jack Krupansky

 -Original Message- From: Donald Organ
 Sent: Monday, May 14, 2012 3:31 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Boosting on field empty or not

 Still doesnt appear to be working.  Here is the full Query string:


 defType=edismax&start=0&rows=24&facet=true&qf=nameSuggest^10
 name^10
 codeTXT^2 description^1 brand_search^0
 cat_search^10&spellcheck=true&spellcheck.collate=true&**
 spellcheck.q=chairs&facet.mincount=1&fl=code,score&q=chairs
 AND (regularprice:*^5 OR (*:* -regularprice:*)5)


 On Mon, May 14, 2012 at 3:28 PM, Jack Krupansky <
 j...@basetechnology.com>
 **wrote:

  Sorry, make that:

>
> &q=chairs AND (regularprice:*^5 OR (*:* -regularprice:*)^0.5)
>
> I forgot that pure negative queries are broken again, so you need the
> *:*
> in there.
>
> I noticed that you second boost operator was missing as well.
>
> -- Jack Krupansky
>
> -Original Message- From: Donald Organ
> Sent: Monday, May 14, 2012 3:24 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Boosting on field empty or not
>
> OK i just tried:
>
> &q=chairs AND (regularprice:*^5 OR (-regularprice:*)5)
>
>
> And that gives me 0 results
>
>
> On Mon, May 14, 2012 at 2:51 PM, Jack Krupansky <
> j...@basetechnology.com
> >*
> *wrote:
>
>  foo AND (field:*^2.0 OR (-field:*)^0.5)
>
>
>> So, if a doc has anything in the field, it gets boosted, and if the
>> doc
>> does not have anything in the field, de-boost it. Choose the boost
>> factors
>> to suit your desired boosting effect.
>>
>> -- Jack Krupansky
>>
>> -Original Message- From: Donald Organ
>> Sent: Monday, May 14, 2012 2:38 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Boosting on field empty or not
>>
>> OK maybe i need to describe this a little more.
>>
>> Basically I want documents that have a given field populated to have a
>> higher score than the documents that dont.  So if you search for foo I
>> want
>> documents that contain foo, but i want the documents that have field a
>> populated to have a higher score...
>>
>> Is there a way to do this?
>>
>>
>>
>> On Mon, May 14, 2012 at 2:22 PM, Jack Krupansky <
>> j...@basetechnology.com
>> >*
>> *wrote:
>>
>>  In a query or filter query you can write +field:* to require that a
>> field
>>
>>  be populated or +(-field:*) to require that it not be populated
>>
>>>
>>> -- Jack Krupansky
>>>
>>> -Original Message- From: Donald Organ
>>> Sent: Monday, May 14, 2012 2:10 PM
>>> To: solr-user
>>> Subject: Boosting on field empty or not
>>>
>>> Is there a way to boost a document based on whether the field is
>>> empty
>>> or
>>> not.  I am looking to boost documents that have a specific field
>>> populated.
>>>
>>>
>>>
>>>
>>>
>>
>

>>>
>>
>


Re: Boosting on field empty or not

2012-05-14 Thread Donald Organ
&q=chairs AND (regularprice:*^5 OR (*:* -regularprice:*)^0.5)&sort=score desc


Same effect.


On Mon, May 14, 2012 at 4:12 PM, Jack Krupansky wrote:

> Change the second boost to 0.5 to de-boost doc that are missing the field
> value. You had them the same.
>
> -- Jack Krupansky
>
> -Original Message- From: Donald Organ
> Sent: Monday, May 14, 2012 4:01 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Boosting on field empty or not
>
> OK it looks like the query change is working but it looks like it boosting
> everything even documents that have that field empty
>
> On Mon, May 14, 2012 at 3:41 PM, Donald Organ  >wrote:
>
>  OK i must be missing something:
>>
>>
>> defType=edismax&start=0&rows=24&facet=true&qf=nameSuggest^10 name^10
>> codeTXT^2 description^1 brand_search^0
>> cat_search^10&spellcheck=true&spellcheck.collate=true&
>> spellcheck.q=chairs&facet.mincount=1&fl=code,score&q=chairs AND (regularprice:*^5 OR (*:*
>> -regularprice:*)^5)&sort=score desc
>>
>>
>> On Mon, May 14, 2012 at 3:36 PM, Jack Krupansky 
>> wrote:
>>
>>  "(*:* -regularprice:*)5" should be "(*:* -regularprice:*)^0.5" - the
>>> missing boost operator.
>>>
>>> -- Jack Krupansky
>>>
>>> -Original Message- From: Donald Organ
>>> Sent: Monday, May 14, 2012 3:31 PM
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Boosting on field empty or not
>>>
>>> Still doesnt appear to be working.  Here is the full Query string:
>>>
>>>
>>> defType=edismax&start=0&rows=24&facet=true&qf=nameSuggest^10
>>> name^10
>>> codeTXT^2 description^1 brand_search^0
>>> cat_search^10&spellcheck=true&spellcheck.collate=true&**
>>> spellcheck.q=chairs&facet.mincount=1&fl=code,score&q=chairs
>>> AND (regularprice:*^5 OR (*:* -regularprice:*)5)
>>>
>>>
>>> On Mon, May 14, 2012 at 3:28 PM, Jack Krupansky >> >
>>> wrote:
>>>
>>>  Sorry, make that:
>>>

 &q=chairs AND (regularprice:*^5 OR (*:* -regularprice:*)^0.5)

 I forgot that pure negative queries are broken again, so you need the
 *:*
 in there.

 I noticed that you second boost operator was missing as well.

 -- Jack Krupansky

 -Original Message- From: Donald Organ
 Sent: Monday, May 14, 2012 3:24 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Boosting on field empty or not

 OK i just tried:

 &q=chairs AND (regularprice:*^5 OR (-regularprice:*)5)


 And that gives me 0 results


 On Mon, May 14, 2012 at 2:51 PM, Jack Krupansky <
 j...@basetechnology.com
 >*
 *wrote:

  foo AND (field:*^2.0 OR (-field:*)^0.5)


> So, if a doc has anything in the field, it gets boosted, and if the doc
> does not have anything in the field, de-boost it. Choose the boost
> factors
> to suit your desired boosting effect.
>
> -- Jack Krupansky
>
> -Original Message- From: Donald Organ
> Sent: Monday, May 14, 2012 2:38 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Boosting on field empty or not
>
> OK maybe i need to describe this a little more.
>
> Basically I want documents that have a given field populated to have a
> higher score than the documents that dont.  So if you search for foo I
> want
> documents that contain foo, but i want the documents that have field a
> populated to have a higher score...
>
> Is there a way to do this?
>
>
>
> On Mon, May 14, 2012 at 2:22 PM, Jack Krupansky <
> j...@basetechnology.com
> >*
> *wrote:
>
>  In a query or filter query you can write +field:* to require that a
> field
>
>  be populated or +(-field:*) to require that it not be populated
>
>>
>> -- Jack Krupansky
>>
>> -Original Message- From: Donald Organ
>> Sent: Monday, May 14, 2012 2:10 PM
>> To: solr-user
>> Subject: Boosting on field empty or not
>>
>> Is there a way to boost a document based on whether the field is empty
>> or
>> not.  I am looking to boost documents that have a specific field
>> populated.
>>
>>
>>
>>
>>
>

>>>
>>
>


Re: Boosting on field empty or not

2012-05-14 Thread Jack Krupansky
Change the second boost to 0.5 to de-boost docs that are missing the field 
value. You had them the same.


-- Jack Krupansky

-Original Message- 
From: Donald Organ

Sent: Monday, May 14, 2012 4:01 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting on field empty or not

OK it looks like the query change is working but it looks like it boosting
everything even documents that have that field empty

On Mon, May 14, 2012 at 3:41 PM, Donald Organ wrote:


OK i must be missing something:


defType=edismax&start=0&rows=24&facet=true&qf=nameSuggest^10 name^10 
codeTXT^2 description^1 brand_search^0 
cat_search^10&spellcheck=true&spellcheck.collate=true&spellcheck.q=chairs&facet.mincount=1&fl=code,score&q=chairs 
AND (regularprice:*^5 OR (*:* -regularprice:*)^5)&sort=score desc



On Mon, May 14, 2012 at 3:36 PM, Jack Krupansky 
wrote:



"(*:* -regularprice:*)5" should be "(*:* -regularprice:*)^0.5" - the
missing boost operator.

-- Jack Krupansky

-Original Message- From: Donald Organ
Sent: Monday, May 14, 2012 3:31 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting on field empty or not

Still doesnt appear to be working.  Here is the full Query string:


defType=edismax&start=0&rows=24&facet=true&qf=nameSuggest^10 name^10
codeTXT^2 description^1 brand_search^0
cat_search^10&spellcheck=true&spellcheck.collate=true&
spellcheck.q=chairs&facet.mincount=1&fl=code,score&q=chairs
AND (regularprice:*^5 OR (*:* -regularprice:*)5)


On Mon, May 14, 2012 at 3:28 PM, Jack Krupansky 
wrote:

 Sorry, make that:


&q=chairs AND (regularprice:*^5 OR (*:* -regularprice:*)^0.5)

I forgot that pure negative queries are broken again, so you need the 
*:*

in there.

I noticed that your second boost operator was missing as well.

-- Jack Krupansky

-Original Message- From: Donald Organ
Sent: Monday, May 14, 2012 3:24 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting on field empty or not

OK i just tried:

&q=chairs AND (regularprice:*^5 OR (-regularprice:*)5)


And that gives me 0 results


On Mon, May 14, 2012 at 2:51 PM, Jack Krupansky
wrote:

 foo AND (field:*^2.0 OR (-field:*)^0.5)



So, if a doc has anything in the field, it gets boosted, and if the doc
does not have anything in the field, de-boost it. Choose the boost
factors
to suit your desired boosting effect.

-- Jack Krupansky

-Original Message- From: Donald Organ
Sent: Monday, May 14, 2012 2:38 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting on field empty or not

OK maybe i need to describe this a little more.

Basically I want documents that have a given field populated to have a
higher score than the documents that dont.  So if you search for foo I
want
documents that contain foo, but i want the documents that have field a
populated to have a higher score...

Is there a way to do this?



On Mon, May 14, 2012 at 2:22 PM, Jack Krupansky <
j...@basetechnology.com
>*
*wrote:

 In a query or filter query you can write +field:* to require that a
field

 be populated or +(-field:*) to require that it not be populated


-- Jack Krupansky

-Original Message- From: Donald Organ
Sent: Monday, May 14, 2012 2:10 PM
To: solr-user
Subject: Boosting on field empty or not

Is there a way to boost a document based on whether the field is empty
or
not.  I am looking to boost documents that have a specific field
populated.
















Re: Update JSON not working for me

2012-05-14 Thread Yonik Seeley
On Mon, May 14, 2012 at 3:11 PM, Rajesh Jain  wrote:
> Hi Yonik
>
> i tried without the json in the URL, the result was same but in XML format

Interesting... the XML response is fine (just not ideal).

When I tried it, I did get a JSON response (perhaps I'm running a
later version of trunk... the unified update handler is very new)

$ curl 'http://localhost:8983/solr/update?commit=true' --data-binary
@books.json -H 'Content-type:application/json'
{"responseHeader":{"status":0,"QTime":133}}

-Yonik
http://lucidimagination.com


>
> C:\Tools\Solr\apache-solr-4.0-2012-05-04_08-23-31\example\exampledocs>C:\tools\curl\curl
> http://localhost:8983/solr/update?commit=true --data-binary @money.json -H
> 'Content-type:application/json'
> <?xml version="1.0" encoding="UTF-8"?>
> <response>
> <lst name="responseHeader"><int name="status">0</int><int name="QTime">45</int></lst>
> </response>
>
>
>
>
> On Mon, May 14, 2012 at 2:58 PM, Yonik Seeley 
> wrote:
>>
>> I think this may be due to https://issues.apache.org/jira/browse/SOLR-2857
>> JIRA is down right now so I can't check, but I thought the intent was
>> to have some back compat.
>>
>> Try changing the URL from /update/json to just /update in the meantime
>>
>> -Yonik
>> http://lucidimagination.com
>>
>>
>> On Mon, May 14, 2012 at 2:42 PM, Rajesh Jain  wrote:
>> > Hi Jack
>> >
>> > I am following the http://wiki.apache.org/solr/UpdateJSON tutorials.
>> >
>> > The first example is of books.json, which  I executed, but I dont see
>> > any
>> > books
>> >
>> > http://localhost:8983/solr/collection1/browse?q=cat%3Dbooks
>> >
>> > 0 results found in 26 ms Page 0 of 0
>> >
>> > I modified the books.json to add my own book, but still no result. The
>> > money.xml works, so I converted the money.xml to money.json and added an
>> > extra currency. I don't see the new currency.
>> >
>> > My question is, how do I know if the UpdateJSON action was valid, if I
>> > don't see them in the
>> > http://localhost:8983/solr/collection1/browse?q=cat%3Dbooks
>> >
>> > Is there a way to find what is happening - maybe through log files?
>> >
>> > I am new to Solr, please help
>> >
>> > Thanks
>> > Rajesh
>> >
>> >
>> >
>> >
>> > On Mon, May 14, 2012 at 2:33 PM, Jack Krupansky
>> > wrote:
>> >
>> >> Check the examples of update/json here:
>> >>
>> >>
>> >> http://wiki.apache.org/solr/UpdateJSON
>> >>
>> >> In your case, either leave out the "add" level or add a "doc" level
>> >> below
>> >> it.
>> >>
>> >> For example:
>> >>
>> >> curl http://localhost:8983/solr/update/json -H 'Content-type:application/json' -d '
>> >> {
>> >> "add": {"doc": {"id" : "TestDoc1", "title" : "test1"} },
>> >> "add": {"doc": {"id" : "TestDoc2", "title" : "another test"} }
>> >> }'
>> >>
>> >> -- Jack Krupansky
>> >>
>> >> -Original Message- From: Rajesh Jain
>> >> Sent: Monday, May 14, 2012 1:27 PM
>> >> To: solr-user@lucene.apache.org
>> >> Cc: Rajesh Jain
>> >> Subject: Update JSON not working for me
>> >>
>> >>
>> >> Hi,
>> >>
>> >> I am using the 4.x version of Solr, and following the UpdateJSON Solr
>> >> Wiki
>> >>
>> >> 1. When I try to update using :
>> >>
>> >> curl 'http://localhost:8983/solr/update/json?commit=true' --data-binary @books.json -H 'Content-type:application/json'
>> >>
>> >> I don't see any "books" category in the Velocity-based Solr browser at
>> >>
>> >> http://localhost:8983/solr/collection1/browse/
>> >>
>> >> I see the following message on the startup window when I run this
>> >> command
>> >> C:\Tools\Solr\apache-solr-4.0-2012-05-04_08-23-31\example\exampledocs>C:\tools\curl\curl
>> >>
>> >> http://localhost:8983/solr/update/json?commit=true --data-binary @books.json -H 'Content-type:application/json'
>> >> {
>> >>  "responseHeader":{
>> >>   "status":0,
>> >>   "QTime":47}}
>> >>
>> >> 2. I wrote my own JSON file where I added an extra "add" directive
>> >>
>> >> My JSON File
>> >> [
>> >>  {
>> >> "add":{
>> >> "id" : "MXN",
>> >> "cat" : ["currency"],
>> >> "name" : "One Peso",
>> >> "inStock" : true,
>> >> "price_c" : "1,MXN",
>> >> "manu" : "384",
>> >> "manu_id_s" : "Bank Mexico",
>> >> "features":"Coins and notes"
>> >>     }
>> >>   }
>> >> ]
>> >>
>> >> I still don't see the addition in the existing Currency Categories.
>> >>
>> >>
>> >> Please let me know if the UPDATEJSON works in 4.x or is this only for
>> >> 3.6?
>> >>
>> >> Thanks
>> >> Rajesh
>> >>
>
>
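
For reference, /update/json (per the UpdateJSON wiki) accepts two shapes, and
mixing them — an "add" wrapper inside a top-level array, as in the money.json
above — may be why the documents never showed up. The bare-array form carries
documents directly:

```json
[
  { "id": "MXN", "cat": ["currency"], "name": "One Peso" }
]
```

The command-object form instead nests each document under "doc":
{"add": {"doc": {"id": "MXN", ...}}}. Use one shape or the other, and remember
commit=true (or a separate commit) before the documents become searchable.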


Re: Boosting on field empty or not

2012-05-14 Thread Donald Organ
OK, it looks like the query change is working, but it appears to be boosting
everything, even documents that have that field empty.

On Mon, May 14, 2012 at 3:41 PM, Donald Organ wrote:

> OK i must be missing something:
>
>
> defType=edismax&start=0&rows=24&facet=true&qf=nameSuggest^10 name^10 
> codeTXT^2 description^1 brand_search^0 
> cat_search^10&spellcheck=true&spellcheck.collate=true&spellcheck.q=chairs&facet.mincount=1&fl=code,score&q=chairs
>  AND (regularprice:*^5 OR (*:* -regularprice:*)^5)&sort=score desc
>
>
> On Mon, May 14, 2012 at 3:36 PM, Jack Krupansky 
> wrote:
>
>> "(*:* -regularprice:*)5" should be "(*:* -regularprice:*)^0.5" - the
>> missing boost operator.
>>
>> -- Jack Krupansky
>>
>> -Original Message- From: Donald Organ
>> Sent: Monday, May 14, 2012 3:31 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Boosting on field empty or not
>>
>> Still doesnt appear to be working.  Here is the full Query string:
>>
>>
>> defType=edismax&start=0&rows=24&facet=true&qf=nameSuggest^10 name^10
>> codeTXT^2 description^1 brand_search^0
>> cat_search^10&spellcheck=true&spellcheck.collate=true&
>> spellcheck.q=chairs&facet.mincount=1&fl=code,score&q=chairs
>> AND (regularprice:*^5 OR (*:* -regularprice:*)5)
>>
>>
>> On Mon, May 14, 2012 at 3:28 PM, Jack Krupansky 
>> **wrote:
>>
>>  Sorry, make that:
>>>
>>> &q=chairs AND (regularprice:*^5 OR (*:* -regularprice:*)^0.5)
>>>
>>> I forgot that pure negative queries are broken again, so you need the *:*
>>> in there.
>>>
>>> I noticed that you second boost operator was missing as well.
>>>
>>> -- Jack Krupansky
>>>
>>> -Original Message- From: Donald Organ
>>> Sent: Monday, May 14, 2012 3:24 PM
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Boosting on field empty or not
>>>
>>> OK i just tried:
>>>
>>> &q=chairs AND (regularprice:*^5 OR (-regularprice:*)5)
>>>
>>>
>>> And that gives me 0 results
>>>
>>>
>> On Mon, May 14, 2012 at 2:51 PM, Jack Krupansky wrote:
>>>
>>>  foo AND (field:*^2.0 OR (-field:*)^0.5)
>>>

 So, if a doc has anything in the field, it gets boosted, and if the doc
 does not have anything in the field, de-boost it. Choose the boost
 factors
 to suit your desired boosting effect.

 -- Jack Krupansky

 -Original Message- From: Donald Organ
 Sent: Monday, May 14, 2012 2:38 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Boosting on field empty or not

 OK maybe i need to describe this a little more.

 Basically I want documents that have a given field populated to have a
 higher score than the documents that dont.  So if you search for foo I
 want
 documents that contain foo, but i want the documents that have field a
 populated to have a higher score...

 Is there a way to do this?



 On Mon, May 14, 2012 at 2:22 PM, Jack Krupansky <j...@basetechnology.com> wrote:

  In a query or filter query you can write +field:* to require that a
 field

  be populated or +(-field:*) to require that it not be populated
>
> -- Jack Krupansky
>
> -Original Message- From: Donald Organ
> Sent: Monday, May 14, 2012 2:10 PM
> To: solr-user
> Subject: Boosting on field empty or not
>
> Is there a way to boost a document based on whether the field is empty
> or
> not.  I am looking to boost documents that have a specific field
> populated.
>
>
>
>

>>>
>>
>
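The fix this thread converges on — boost documents where `regularprice` is populated, de-boost the rest — is easy to mangle by hand because of the `^`, `*`, and `:` characters. A hypothetical sketch of building the corrected query string programmatically (parameter values mirror the thread; the handler URL and remaining parameters are omitted):

```python
from urllib.parse import urlencode

# Sketch of the corrected presence-boost query from the thread above.
# "*:*" is needed alongside the negative clause because a purely negative
# subquery matches nothing in the Solr version being discussed.
params = {
    "defType": "edismax",
    "qf": "nameSuggest^10 name^10 codeTXT^2 description^1 cat_search^10",
    "q": "chairs AND (regularprice:*^5 OR (*:* -regularprice:*)^0.5)",
}
query_string = urlencode(params)
print(query_string)
```

Appending `query_string` to the `/select?` URL reproduces the corrected request; `urlencode` percent-escapes the boost operators so nothing is silently dropped.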


Re: Update JSON not working for me

2012-05-14 Thread Jack Krupansky
I just tried update/json myself with example (no changes) for both 3.6 and 
the same trunk build as you used, and it works fine for me - I get 4 docs 
for cat:book.


Did you modify the schema or config?

-- Jack Krupansky

-Original Message- 
From: Rajesh Jain

Sent: Monday, May 14, 2012 3:11 PM
To: solr-user@lucene.apache.org ; yo...@lucidimagination.com
Subject: Re: Update JSON not working for me

Hi Yonik

i tried without the json in the URL, the result was same but in XML format

C:\Tools\Solr\apache-solr-4.0-2012-05-04_08-23-31\example\exampledocs>C:\tools\curl\curl
http://localhost:8983/solr/update?commit=true --data-binary @money.json -H
'Content-type:application/json'


<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">45</int></lst>
</response>





On Mon, May 14, 2012 at 2:58 PM, Yonik Seeley 
wrote:



I think this may be due to https://issues.apache.org/jira/browse/SOLR-2857
JIRA is down right now so I can't check, but I thought the intent was
to have some back compat.

Try changing the URL from /update/json to just /update in the meantime

-Yonik
http://lucidimagination.com


On Mon, May 14, 2012 at 2:42 PM, Rajesh Jain  wrote:
> Hi Jack
>
> I am following the http://wiki.apache.org/solr/UpdateJSON tutorials.
>
> The first example is of books.json, which  I executed, but I dont see 
> any

> books
>
> http://localhost:8983/solr/collection1/browse?q=cat%3Dbooks
>
> 0 results found in 26 ms Page 0 of 0
>
> I modified the books.json to add my own book, but still no result. The
> money.xml works, so I converted the money.xml to money.json and added an
> extra currency. I don't see the new currency.
>
> My question is, how do I know if the UpdateJSON action was valid, if I
> don't see them in the
> http://localhost:8983/solr/collection1/browse?q=cat%3Dbooks
>
> Is there a way to find what is happening - maybe through log files?
>
> I am new to Solr, please help
>
> Thanks
> Rajesh
>
>
>
>
> On Mon, May 14, 2012 at 2:33 PM, Jack Krupansky wrote:
>
>> Check the examples of update/json here:
>>
>> http://wiki.apache.org/solr/UpdateJSON
>>
>> In your case, either leave out the "add" level or add a "doc" level
below
>> it.
>>
>> For example:
>>
>> curl http://localhost:8983/solr/update/json -H 'Content-type:application/json' -d '
>> {
>> "add": {"doc": {"id" : "TestDoc1", "title" : "test1"} },
>> "add": {"doc": {"id" : "TestDoc2", "title" : "another test"} }
>> }'
>>
>> -- Jack Krupansky
>>
>> -Original Message- From: Rajesh Jain
>> Sent: Monday, May 14, 2012 1:27 PM
>> To: solr-user@lucene.apache.org
>> Cc: Rajesh Jain
>> Subject: Update JSON not working for me
>>
>>
>> Hi,
>>
>> I am using the 4.x version of Solr, and following the UpdateJSON Solr
Wiki
>>
>> 1. When I try to update using :
>>
>> curl 'http://localhost:8983/solr/update/json?commit=true'
>> --data-binary @books.json -H 'Content-type:application/json'
>>
>> I don't see any Category as Books in Velocity based Solr Browser the
>> http://localhost:8983/solr/collection1/browse/
>> ?
>>
>> I see the following message on the startup window when I run this
command
>> C:\Tools\Solr\apache-solr-4.0-2012-05-04_08-23-31\example\
>> exampledocs>C:\tools\curl\curl
>> http://localhost:8983/solr/update/json?commit=true --data-binary
>> @books
>> .json -H 'Content-type:application/json'
>> {
>>  "responseHeader":{
>>   "status":0,
>>   "QTime":47}}
>>
>> 2. I wrote my own JSON file where I added an extra "add" directive
>>
>> My JSON File
>> [
>>  {
>> "add":{
>> "id" : "MXN",
>> "cat" : ["currency"],
>> "name" : "One Peso",
>> "inStock" : true,
>> "price_c" : "1,MXN",
>> "manu" : "384",
>> "manu_id_s" : "Bank Mexico",
>> "features":"Coins and notes"
>> }
>>   }
>> ]
>>
>> I still don't see the addition in the existing Currency Categories.
>>
>>
>> Please let me know if the UPDATEJSON works in 4.x or is this only for
3.6?
>>
>> Thanks
>> Rajesh
>>
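For reference, the two payload shapes being contrasted in this thread — the failing array of bare `"add"` objects versus the accepted `"add"`/`"doc"` nesting shown by Jack — can be sketched side by side (field values are the thread's example currency document; nothing here assumes a real schema):

```python
import json

# The rejected shape: a JSON array whose "add" entries lack a "doc" level.
broken = [{"add": {"id": "MXN", "cat": ["currency"], "name": "One Peso"}}]

# The accepted shape per http://wiki.apache.org/solr/UpdateJSON: each "add"
# wraps its fields inside an explicit "doc" object.
working = {
    "add": {"doc": {"id": "MXN", "cat": ["currency"], "name": "One Peso"}}
}

payload = json.dumps(working)
print(payload)
```

The resulting payload would then be posted exactly as in the commands above: `curl http://localhost:8983/solr/update/json?commit=true --data-binary @file.json -H 'Content-type:application/json'`.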





Re: Boosting on field empty or not

2012-05-14 Thread Donald Organ
OK, I must be missing something:


defType=edismax&start=0&rows=24&facet=true&qf=nameSuggest^10 name^10
codeTXT^2 description^1 brand_search^0
cat_search^10&spellcheck=true&spellcheck.collate=true&spellcheck.q=chairs&facet.mincount=1&fl=code,score&q=chairs
AND (regularprice:*^5 OR (*:* -regularprice:*)^5)&sort=score desc


On Mon, May 14, 2012 at 3:36 PM, Jack Krupansky wrote:

> "(*:* -regularprice:*)5" should be "(*:* -regularprice:*)^0.5" - the
> missing boost operator.
>
> -- Jack Krupansky
>
> -Original Message- From: Donald Organ
> Sent: Monday, May 14, 2012 3:31 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Boosting on field empty or not
>
> Still doesnt appear to be working.  Here is the full Query string:
>
>
> defType=edismax&start=0&rows=24&facet=true&qf=nameSuggest^10 name^10
> codeTXT^2 description^1 brand_search^0
> cat_search^10&spellcheck=true&spellcheck.collate=true&
> spellcheck.q=chairs&facet.mincount=1&fl=code,score&q=chairs
> AND (regularprice:*^5 OR (*:* -regularprice:*)5)
>
>
> On Mon, May 14, 2012 at 3:28 PM, Jack Krupansky wrote:
>
>  Sorry, make that:
>>
>> &q=chairs AND (regularprice:*^5 OR (*:* -regularprice:*)^0.5)
>>
>> I forgot that pure negative queries are broken again, so you need the *:*
>> in there.
>>
>> I noticed that you second boost operator was missing as well.
>>
>> -- Jack Krupansky
>>
>> -Original Message- From: Donald Organ
>> Sent: Monday, May 14, 2012 3:24 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Boosting on field empty or not
>>
>> OK i just tried:
>>
>> &q=chairs AND (regularprice:*^5 OR (-regularprice:*)5)
>>
>>
>> And that gives me 0 results
>>
>>
>> On Mon, May 14, 2012 at 2:51 PM, Jack Krupansky wrote:
>>
>>  foo AND (field:*^2.0 OR (-field:*)^0.5)
>>
>>>
>>> So, if a doc has anything in the field, it gets boosted, and if the doc
>>> does not have anything in the field, de-boost it. Choose the boost
>>> factors
>>> to suit your desired boosting effect.
>>>
>>> -- Jack Krupansky
>>>
>>> -Original Message- From: Donald Organ
>>> Sent: Monday, May 14, 2012 2:38 PM
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Boosting on field empty or not
>>>
>>> OK maybe i need to describe this a little more.
>>>
>>> Basically I want documents that have a given field populated to have a
>>> higher score than the documents that dont.  So if you search for foo I
>>> want
>>> documents that contain foo, but i want the documents that have field a
>>> populated to have a higher score...
>>>
>>> Is there a way to do this?
>>>
>>>
>>>
>>> On Mon, May 14, 2012 at 2:22 PM, Jack Krupansky wrote:
>>>
>>>  In a query or filter query you can write +field:* to require that a
>>> field
>>>
>>>  be populated or +(-field:*) to require that it not be populated

 -- Jack Krupansky

 -Original Message- From: Donald Organ
 Sent: Monday, May 14, 2012 2:10 PM
 To: solr-user
 Subject: Boosting on field empty or not

 Is there a way to boost a document based on whether the field is empty
 or
 not.  I am looking to boost documents that have a specific field
 populated.




>>>
>>
>


Re: Boosting on field empty or not

2012-05-14 Thread Jack Krupansky
"(*:* -regularprice:*)5" should be "(*:* -regularprice:*)^0.5" - the missing 
boost operator.


-- Jack Krupansky

-Original Message- 
From: Donald Organ

Sent: Monday, May 14, 2012 3:31 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting on field empty or not

Still doesnt appear to be working.  Here is the full Query string:


defType=edismax&start=0&rows=24&facet=true&qf=nameSuggest^10 name^10
codeTXT^2 description^1 brand_search^0
cat_search^10&spellcheck=true&spellcheck.collate=true&spellcheck.q=chairs&facet.mincount=1&fl=code,score&q=chairs
AND (regularprice:*^5 OR (*:* -regularprice:*)5)


On Mon, May 14, 2012 at 3:28 PM, Jack Krupansky 
wrote:



Sorry, make that:

&q=chairs AND (regularprice:*^5 OR (*:* -regularprice:*)^0.5)

I forgot that pure negative queries are broken again, so you need the *:*
in there.

I noticed that you second boost operator was missing as well.

-- Jack Krupansky

-Original Message- From: Donald Organ
Sent: Monday, May 14, 2012 3:24 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting on field empty or not

OK i just tried:

&q=chairs AND (regularprice:*^5 OR (-regularprice:*)5)


And that gives me 0 results


On Mon, May 14, 2012 at 2:51 PM, Jack Krupansky wrote:

 foo AND (field:*^2.0 OR (-field:*)^0.5)


So, if a doc has anything in the field, it gets boosted, and if the doc
does not have anything in the field, de-boost it. Choose the boost 
factors

to suit your desired boosting effect.

-- Jack Krupansky

-Original Message- From: Donald Organ
Sent: Monday, May 14, 2012 2:38 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting on field empty or not

OK maybe i need to describe this a little more.

Basically I want documents that have a given field populated to have a
higher score than the documents that dont.  So if you search for foo I
want
documents that contain foo, but i want the documents that have field a
populated to have a higher score...

Is there a way to do this?



On Mon, May 14, 2012 at 2:22 PM, Jack Krupansky wrote:

 In a query or filter query you can write +field:* to require that a 
field



be populated or +(-field:*) to require that it not be populated

-- Jack Krupansky

-Original Message- From: Donald Organ
Sent: Monday, May 14, 2012 2:10 PM
To: solr-user
Subject: Boosting on field empty or not

Is there a way to boost a document based on whether the field is empty 
or

not.  I am looking to boost documents that have a specific field
populated.











Re: Boosting on field empty or not

2012-05-14 Thread Donald Organ
Still doesn't appear to be working.  Here is the full query string:


defType=edismax&start=0&rows=24&facet=true&qf=nameSuggest^10 name^10
codeTXT^2 description^1 brand_search^0
cat_search^10&spellcheck=true&spellcheck.collate=true&spellcheck.q=chairs&facet.mincount=1&fl=code,score&q=chairs
AND (regularprice:*^5 OR (*:* -regularprice:*)5)


On Mon, May 14, 2012 at 3:28 PM, Jack Krupansky wrote:

> Sorry, make that:
>
> &q=chairs AND (regularprice:*^5 OR (*:* -regularprice:*)^0.5)
>
> I forgot that pure negative queries are broken again, so you need the *:*
> in there.
>
> I noticed that you second boost operator was missing as well.
>
> -- Jack Krupansky
>
> -Original Message- From: Donald Organ
> Sent: Monday, May 14, 2012 3:24 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Boosting on field empty or not
>
> OK i just tried:
>
> &q=chairs AND (regularprice:*^5 OR (-regularprice:*)5)
>
>
> And that gives me 0 results
>
>
> On Mon, May 14, 2012 at 2:51 PM, Jack Krupansky wrote:
>
>  foo AND (field:*^2.0 OR (-field:*)^0.5)
>>
>> So, if a doc has anything in the field, it gets boosted, and if the doc
>> does not have anything in the field, de-boost it. Choose the boost factors
>> to suit your desired boosting effect.
>>
>> -- Jack Krupansky
>>
>> -Original Message- From: Donald Organ
>> Sent: Monday, May 14, 2012 2:38 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Boosting on field empty or not
>>
>> OK maybe i need to describe this a little more.
>>
>> Basically I want documents that have a given field populated to have a
>> higher score than the documents that dont.  So if you search for foo I
>> want
>> documents that contain foo, but i want the documents that have field a
>> populated to have a higher score...
>>
>> Is there a way to do this?
>>
>>
>>
>> On Mon, May 14, 2012 at 2:22 PM, Jack Krupansky wrote:
>>
>>  In a query or filter query you can write +field:* to require that a field
>>
>>> be populated or +(-field:*) to require that it not be populated
>>>
>>> -- Jack Krupansky
>>>
>>> -Original Message- From: Donald Organ
>>> Sent: Monday, May 14, 2012 2:10 PM
>>> To: solr-user
>>> Subject: Boosting on field empty or not
>>>
>>> Is there a way to boost a document based on whether the field is empty or
>>> not.  I am looking to boost documents that have a specific field
>>> populated.
>>>
>>>
>>>
>>
>


Re: Boosting on field empty or not

2012-05-14 Thread Jack Krupansky

Sorry, make that:

&q=chairs AND (regularprice:*^5 OR (*:* -regularprice:*)^0.5)

I forgot that pure negative queries are broken again, so you need the *:* in 
there.


I noticed that your second boost operator was missing as well.

-- Jack Krupansky

-Original Message- 
From: Donald Organ

Sent: Monday, May 14, 2012 3:24 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting on field empty or not

OK i just tried:

&q=chairs AND (regularprice:*^5 OR (-regularprice:*)5)


And that gives me 0 results


On Mon, May 14, 2012 at 2:51 PM, Jack Krupansky 
wrote:



foo AND (field:*^2.0 OR (-field:*)^0.5)

So, if a doc has anything in the field, it gets boosted, and if the doc
does not have anything in the field, de-boost it. Choose the boost factors
to suit your desired boosting effect.

-- Jack Krupansky

-Original Message- From: Donald Organ
Sent: Monday, May 14, 2012 2:38 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting on field empty or not

OK maybe i need to describe this a little more.

Basically I want documents that have a given field populated to have a
higher score than the documents that dont.  So if you search for foo I 
want

documents that contain foo, but i want the documents that have field a
populated to have a higher score...

Is there a way to do this?



On Mon, May 14, 2012 at 2:22 PM, Jack Krupansky wrote:

 In a query or filter query you can write +field:* to require that a field

be populated or +(-field:*) to require that it not be populated

-- Jack Krupansky

-Original Message- From: Donald Organ
Sent: Monday, May 14, 2012 2:10 PM
To: solr-user
Subject: Boosting on field empty or not

Is there a way to boost a document based on whether the field is empty or
not.  I am looking to boost documents that have a specific field
populated.








Re: Boosting on field empty or not

2012-05-14 Thread Donald Organ
OK, I just tried:

&q=chairs AND (regularprice:*^5 OR (-regularprice:*)5)


And that gives me 0 results


On Mon, May 14, 2012 at 2:51 PM, Jack Krupansky wrote:

> foo AND (field:*^2.0 OR (-field:*)^0.5)
>
> So, if a doc has anything in the field, it gets boosted, and if the doc
> does not have anything in the field, de-boost it. Choose the boost factors
> to suit your desired boosting effect.
>
> -- Jack Krupansky
>
> -Original Message- From: Donald Organ
> Sent: Monday, May 14, 2012 2:38 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Boosting on field empty or not
>
> OK maybe i need to describe this a little more.
>
> Basically I want documents that have a given field populated to have a
> higher score than the documents that dont.  So if you search for foo I want
> documents that contain foo, but i want the documents that have field a
> populated to have a higher score...
>
> Is there a way to do this?
>
>
>
> On Mon, May 14, 2012 at 2:22 PM, Jack Krupansky wrote:
>
>  In a query or filter query you can write +field:* to require that a field
>> be populated or +(-field:*) to require that it not be populated
>>
>> -- Jack Krupansky
>>
>> -Original Message- From: Donald Organ
>> Sent: Monday, May 14, 2012 2:10 PM
>> To: solr-user
>> Subject: Boosting on field empty or not
>>
>> Is there a way to boost a document based on whether the field is empty or
>> not.  I am looking to boost documents that have a specific field
>> populated.
>>
>>
>


Re: Kernel methods in SOLR

2012-05-14 Thread Lance Norskog
Lucene provides these vectors as 'term vectors' or 'term frequency
vectors'. The MoreLikeThis feature does queries against these (I
think).

http://www.lucidimagination.com/search/?q=term+vectors
http://www.lucidimagination.com/search/?q=MoreLikeThis

On Mon, May 14, 2012 at 11:07 AM, Dmitry Kan  wrote:
> Peyman,
>
> Did you have a look at this?
>
> https://issues.apache.org/jira/browse/LUCENE-2959
>
> the pluggable ranking functions. Can be a good starting point for you.
>
> Dmitry
>
> On Mon, Apr 23, 2012 at 7:29 PM, Peyman Faratin wrote:
>
>> Hi
>>
>> Has there been any work that tries to integrate Kernel methods [1] with
>> SOLR? I am interested in using kernel methods to solve synonym, hyponym and
>> polysemous (disambiguation) problems which SOLR's Vector space model ("bag
>> of words") does not capture.
>>
>> For example, imagine we have only 3 words in our corpus, "puma", "cougar"
>> and "feline". The 3 words have obviously interdependencies (puma
>> disambiguates to cougar, cougar and puma are instances of felines -
>> hyponyms). Now, imagine 2 docs, d1 and d2, that have the following TF-IDF
>> vectors.
>>
>>                 puma, cougar, feline
>> d1       =   [  2,        0,         0]
>> d2       =   [  0,        1,         0]
>>
>> i.e. d1 has no mention of term cougar or feline and conversely, d2 has no
>> mention of terms puma or feline. Hence under the vector approach d1 and d2
>> are not related at all (and each interpretation of the terms have a unique
>> vector). Which is not what we want to conclude.
>>
>> What I need is to include a kernel matrix (as data) such as the following
>> that captures these relationships:
>>
>>                       puma, cougar, feline
>> puma    =   [  1,        1,         0.4]
>> cougar  =   [  1,        1,         0.4]
>> feline  =   [  0.4,     0.4,         1]
>>
>> then recompute the TF-IDF vector as a product of (1) the original vector
>> and (2) the kernel matrix, resulting in
>>
>>                 puma, cougar, feline
>> d1       =   [  2,        2,         0.8]
>> d2       =   [  1,        1,         0.4]
>>
>> (note, the new vectors are much less sparse).
>>
>> I can solve this problem (inefficiently) at the application layer but I
>> was wondering if there has been any attempts within the community to solve
>> similar problems, efficiently without paying a hefty response time price?
>>
>> thank you
>>
>> Peyman
>>
>> [1] http://en.wikipedia.org/wiki/Kernel_methods
>
>
>
>
> --
> Regards,
>
> Dmitry Kan



-- 
Lance Norskog
goks...@gmail.com
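The kernel re-weighting Peyman describes reduces to a plain vector–matrix product. A minimal sketch using the numbers from his message (illustrative Python only, not Solr/Lucene API code):

```python
# Term-similarity (kernel) matrix over the 3-term corpus from the message:
# rows/columns are (puma, cougar, feline).
K = [
    [1.0, 1.0, 0.4],   # puma
    [1.0, 1.0, 0.4],   # cougar
    [0.4, 0.4, 1.0],   # feline
]

def reweight(doc, kernel):
    """Multiply a TF-IDF row vector by the kernel matrix: doc @ K."""
    n = len(kernel)
    return [sum(doc[i] * kernel[i][j] for i in range(n)) for j in range(n)]

d1 = [2, 0, 0]
d2 = [0, 1, 0]
print(reweight(d1, K))  # [2.0, 2.0, 0.8]
print(reweight(d2, K))  # [1.0, 1.0, 0.4]
```

This reproduces the densified vectors in the message: d1 and d2, orthogonal under the plain bag-of-words model, now have a nonzero dot product.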


Re: Update JSON not working for me

2012-05-14 Thread Rajesh Jain
Hi Yonik

i tried without the json in the URL, the result was same but in XML format

C:\Tools\Solr\apache-solr-4.0-2012-05-04_08-23-31\example\exampledocs>C:\tools\curl\curl
http://localhost:8983/solr/update?commit=true --data-binary @money.json -H
'Content-type:application/json'


<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">45</int></lst>
</response>





On Mon, May 14, 2012 at 2:58 PM, Yonik Seeley wrote:

> I think this may be due to https://issues.apache.org/jira/browse/SOLR-2857
> JIRA is down right now so I can't check, but I thought the intent was
> to have some back compat.
>
> Try changing the URL from /update/json to just /update in the meantime
>
> -Yonik
> http://lucidimagination.com
>
>
> On Mon, May 14, 2012 at 2:42 PM, Rajesh Jain  wrote:
> > Hi Jack
> >
> > I am following the http://wiki.apache.org/solr/UpdateJSON tutorials.
> >
> > The first example is of books.json, which  I executed, but I dont see any
> > books
> >
> > http://localhost:8983/solr/collection1/browse?q=cat%3Dbooks
> >
> > 0 results found in 26 ms Page 0 of 0
> >
> > I modified the books.json to add my own book, but still no result. The
> > money.xml works, so I converted the money.xml to money.json and added an
> > extra currency. I don't see the new currency.
> >
> > My question is, how do I know if the UpdateJSON action was valid, if I
> > don't see them in the
> > http://localhost:8983/solr/collection1/browse?q=cat%3Dbooks
> >
> > Is there a way to find what is happening - maybe through log files?
> >
> > I am new to Solr, please help
> >
> > Thanks
> > Rajesh
> >
> >
> >
> >
> > On Mon, May 14, 2012 at 2:33 PM, Jack Krupansky wrote:
> >
> >> Check the examples of update/json here:
> >>
> >> http://wiki.apache.org/solr/UpdateJSON
> >>
> >> In your case, either leave out the "add" level or add a "doc" level
> below
> >> it.
> >>
> >> For example:
> >>
> >> curl http://localhost:8983/solr/update/json -H 'Content-type:application/json' -d '
> >> {
> >> "add": {"doc": {"id" : "TestDoc1", "title" : "test1"} },
> >> "add": {"doc": {"id" : "TestDoc2", "title" : "another test"} }
> >> }'
> >>
> >> -- Jack Krupansky
> >>
> >> -Original Message- From: Rajesh Jain
> >> Sent: Monday, May 14, 2012 1:27 PM
> >> To: solr-user@lucene.apache.org
> >> Cc: Rajesh Jain
> >> Subject: Update JSON not working for me
> >>
> >>
> >> Hi,
> >>
> >> I am using the 4.x version of Solr, and following the UpdateJSON Solr
> Wiki
> >>
> >> 1. When I try to update using :
> >>
> >> curl 'http://localhost:8983/solr/update/json?commit=true'
> >> --data-binary @books.json -H 'Content-type:application/json'
> >>
> >> I don't see any Category as Books in Velocity based Solr Browser the
> >> http://localhost:8983/solr/collection1/browse/
> >> ?
> >>
> >> I see the following message on the startup window when I run this
> command
> >> C:\Tools\Solr\apache-solr-4.0-2012-05-04_08-23-31\example\
> >> exampledocs>C:\tools\curl\curl
> >> http://localhost:8983/solr/update/json?commit=true --data-binary
> >> @books
> >> .json -H 'Content-type:application/json'
> >> {
> >>  "responseHeader":{
> >>   "status":0,
> >>   "QTime":47}}
> >>
> >> 2. I wrote my own JSON file where I added an extra "add" directive
> >>
> >> My JSON File
> >> [
> >>  {
> >> "add":{
> >> "id" : "MXN",
> >> "cat" : ["currency"],
> >> "name" : "One Peso",
> >> "inStock" : true,
> >> "price_c" : "1,MXN",
> >> "manu" : "384",
> >> "manu_id_s" : "Bank Mexico",
> >> "features":"Coins and notes"
> >> }
> >>   }
> >> ]
> >>
> >> I still don't see the addition in the existing Currency Categories.
> >>
> >>
> >> Please let me know if the UPDATEJSON works in 4.x or is this only for
> 3.6?
> >>
> >> Thanks
> >> Rajesh
> >>
>


Re: Update JSON not working for me

2012-05-14 Thread Rajesh Jain
Jack

I tried with cat=book which was in books.json and in my smaller demo file,
I had it as books, in either case it doesn't seem to work.

From the example on http://wiki.apache.org/solr/UpdateJSON
http://localhost:8983/solr/select?q=title:monsters&wt=json&indent=true
This should result in our book, but

My output is

{
  "responseHeader":{
"status":0,
"QTime":1,
"params":{
  "indent":"true",
  "wt":"json",
  "q":"title:monsters"}},
  "response":{"numFound":0,"start":0,"docs":[]
  }}


Where is the json data stored ? Is it in the data folder, I can remove
everything from the folder and see what happens?

Thanks
Rajesh


On Mon, May 14, 2012 at 3:03 PM, Jack Krupansky wrote:

> The books.json in example/exampledocs has:
>
> "cat" : ["book","hardcover"],
>
> That is "book" singular, not "books" plural as in your query. There is no
> stemming since it is a string field, not text.
>
>
> -- Jack Krupansky
>
> -Original Message- From: Rajesh Jain
> Sent: Monday, May 14, 2012 2:42 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Update JSON not working for me
>
>
> Hi Jack
>
> I am following the http://wiki.apache.org/solr/UpdateJSON tutorials.
>
> The first example is of books.json, which  I executed, but I dont see any
> books
>
> http://localhost:8983/solr/collection1/browse?q=cat%3Dbooks
>
> 0 results found in 26 ms Page 0 of 0
>
> I modified the books.json to add my own book, but still no result. The
> money.xml works, so I converted the money.xml to money.json and added an
> extra currency. I don't see the new currency.
>
> My question is, how do I know if the UpdateJSON action was valid, if I
> don't see them in the
> http://localhost:8983/solr/collection1/browse?q=cat%3Dbooks
>
> Is there a way to find what is happening - maybe through log files?
>
> I am new to Solr, please help
>
> Thanks
> Rajesh
>
>
>
>
> On Mon, May 14, 2012 at 2:33 PM, Jack Krupansky wrote:
>
>  Check the examples of update/json here:
>>
>> http://wiki.apache.org/solr/UpdateJSON
>>
>>
>> In your case, either leave out the "add" level or add a "doc" level below
>> it.
>>
>> For example:
>>
>> curl http://localhost:8983/solr/update/json -H 'Content-type:application/json' -d '
>>
>> {
>> "add": {"doc": {"id" : "TestDoc1", "title" : "test1"} },
>> "add": {"doc": {"id" : "TestDoc2", "title" : "another test"} }
>> }'
>>
>> -- Jack Krupansky
>>
>> -Original Message- From: Rajesh Jain
>> Sent: Monday, May 14, 2012 1:27 PM
>> To: solr-user@lucene.apache.org
>> Cc: Rajesh Jain
>> Subject: Update JSON not working for me
>>
>>
>> Hi,
>>
>> I am using the 4.x version of Solr, and following the UpdateJSON Solr Wiki
>>
>> 1. When I try to update using :
>>
>> curl 'http://localhost:8983/solr/update/json?commit=true'
>> --data-binary @books.json -H 'Content-type:application/json'
>>
>>
>> I don't see any Category as Books in Velocity based Solr Browser the
>> http://localhost:8983/solr/collection1/browse/
>> ?
>>
>> I see the following message on the startup window when I run this command
>> C:\Tools\Solr\apache-solr-4.0-2012-05-04_08-23-31\example\
>>
>> exampledocs>C:\tools\curl\curl
>> http://localhost:8983/solr/update/json?commit=true --data-binary
>> @books
>> .json -H 'Content-type:application/json'
>>
>> {
>>  "responseHeader":{
>>  "status":0,
>>  "QTime":47}}
>>
>> 2. I wrote my own JSON file where I added an extra "add" directive
>>
>> My JSON File
>> [
>>  {
>> "add":{
>> "id" : "MXN",
>> "cat" : ["currency"],
>> "name" : "One Peso",
>> "inStock" : true,
>> "price_c" : "1,MXN",
>> "manu" : "384",
>> "manu_id_s" : "Bank Mexico",
>> "features":"Coins and notes"
>>}
>>  }
>> ]
>>
>> I still don't see the addition in the existing Currency Categories.
>>
>>
>> Please let me know if the UPDATEJSON works in 4.x or is this only for 3.6?
>>
>> Thanks
>> Rajesh
>>
>>
>


Re: Update JSON not working for me

2012-05-14 Thread Jack Krupansky

The books.json in example/exampledocs has:

"cat" : ["book","hardcover"],

That is "book" singular, not "books" plural as in your query. There is no 
stemming since it is a string field, not text.


-- Jack Krupansky

-Original Message- 
From: Rajesh Jain

Sent: Monday, May 14, 2012 2:42 PM
To: solr-user@lucene.apache.org
Subject: Re: Update JSON not working for me

Hi Jack

I am following the http://wiki.apache.org/solr/UpdateJSON tutorials.

The first example is of books.json, which  I executed, but I dont see any
books

http://localhost:8983/solr/collection1/browse?q=cat%3Dbooks

0 results found in 26 ms Page 0 of 0

I modified the books.json to add my own book, but still no result. The
money.xml works, so I converted the money.xml to money.json and added an
extra currency. I don't see the new currency.

My question is, how do I know if the UpdateJSON action was valid, if I
don't see them in the
http://localhost:8983/solr/collection1/browse?q=cat%3Dbooks

Is there a way to find what is happening - maybe through log files?

I am new to Solr, please help

Thanks
Rajesh




On Mon, May 14, 2012 at 2:33 PM, Jack Krupansky 
wrote:



Check the examples of update/json here:

http://wiki.apache.org/solr/UpdateJSON

In your case, either leave out the "add" level or add a "doc" level below
it.

For example:

curl http://localhost:8983/solr/update/json -H 'Content-type:application/json' -d '
{
"add": {"doc": {"id" : "TestDoc1", "title" : "test1"} },
"add": {"doc": {"id" : "TestDoc2", "title" : "another test"} }
}'

-- Jack Krupansky

-Original Message- From: Rajesh Jain
Sent: Monday, May 14, 2012 1:27 PM
To: solr-user@lucene.apache.org
Cc: Rajesh Jain
Subject: Update JSON not working for me


Hi,

I am using the 4.x version of Solr, and following the UpdateJSON Solr Wiki

1. When I try to update using :

curl 'http://localhost:8983/solr/update/json?commit=true'
--data-binary @books.json -H 'Content-type:application/json'

I don't see any Category as Books in Velocity based Solr Browser the
http://localhost:8983/solr/collection1/browse/
?

I see the following message on the startup window when I run this command
C:\Tools\Solr\apache-solr-4.0-2012-05-04_08-23-31\example\
exampledocs>C:\tools\curl\curl
http://localhost:8983/solr/update/json?commit=true --data-binary
@books
.json -H 'Content-type:application/json'
{
 "responseHeader":{
  "status":0,
  "QTime":47}}

2. I wrote my own JSON file where I added an extra "add" directive

My JSON File
[
 {
"add":{
"id" : "MXN",
"cat" : ["currency"],
"name" : "One Peso",
"inStock" : true,
"price_c" : "1,MXN",
"manu" : "384",
"manu_id_s" : "Bank Mexico",
"features":"Coins and notes"
}
  }
]

I still don't see the addition in the existing Currency Categories.


Please let me know if the UPDATEJSON works in 4.x or is this only for 3.6?

Thanks
Rajesh
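Jack's point can be made concrete: `q=cat%3Dbooks` decodes to `cat=books`, which is not a field query at all (the field-query separator is a colon), and the stored value is the singular string `book`. A small illustrative sketch of the decoding and the correct encoding (nothing here is specific to a real deployment):

```python
from urllib.parse import unquote, urlencode

# What the original browse URL actually asked for: "cat=books" — no field
# query, and the plural "books" which the string field never stored.
print(unquote("cat%3Dbooks"))  # cat=books

# A proper field query on the singular stored value, percent-encoded.
params = {"q": "cat:book", "wt": "json", "indent": "true"}
encoded = urlencode(params)
print(encoded)  # q=cat%3Abook&wt=json&indent=true
```

Note that `%3D` is `=` while `%3A` is `:` — an easy mix-up when building query strings by hand.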





Re: Update JSON not working for me

2012-05-14 Thread Yonik Seeley
I think this may be due to https://issues.apache.org/jira/browse/SOLR-2857
JIRA is down right now so I can't check, but I thought the intent was
to have some back compat.

Try changing the URL from /update/json to just /update in the meantime

-Yonik
http://lucidimagination.com


On Mon, May 14, 2012 at 2:42 PM, Rajesh Jain  wrote:
> Hi Jack
>
> I am following the http://wiki.apache.org/solr/UpdateJSON tutorials.
>
> The first example is of books.json, which  I executed, but I dont see any
> books
>
> http://localhost:8983/solr/collection1/browse?q=cat%3Dbooks
>
> 0 results found in 26 ms Page 0 of 0
>
> I modified the books.json to add my own book, but still no result. The
> money.xml works, so I converted the money.xml to money.json and added an
> extra currency. I don't see the new currency.
>
> My question is, how do I know if the UpdateJSON action was valid, if I
> don't see them in the
> http://localhost:8983/solr/collection1/browse?q=cat%3Dbooks
>
> Is there a way to find what is happening - maybe through log files?
>
> I am new to Solr, please help
>
> Thanks
> Rajesh
>
>
>
>
> On Mon, May 14, 2012 at 2:33 PM, Jack Krupansky 
> wrote:
>
>> Check the examples of update/json here:
>>
>> http://wiki.apache.org/solr/UpdateJSON
>>
>> In your case, either leave out the "add" level or add a "doc" level below
>> it.
>>
>> For example:
>>
>> curl http://localhost:8983/solr/update/json -H 'Content-type:application/json' -d '
>> {
>> "add": {"doc": {"id" : "TestDoc1", "title" : "test1"} },
>> "add": {"doc": {"id" : "TestDoc2", "title" : "another test"} }
>> }'
>>
>> -- Jack Krupansky
>>
>> -Original Message- From: Rajesh Jain
>> Sent: Monday, May 14, 2012 1:27 PM
>> To: solr-user@lucene.apache.org
>> Cc: Rajesh Jain
>> Subject: Update JSON not working for me
>>
>>
>> Hi,
>>
>> I am using the 4.x version of Solr, and following the UpdateJSON Solr Wiki
>>
>> 1. When I try to update using :
>>
>> curl 'http://localhost:8983/solr/update/json?commit=true'
>> --data-binary @books.json -H 'Content-type:application/json'
>>
>> I don't see any Category as Books in Velocity based Solr Browser the
>> http://localhost:8983/solr/collection1/browse/?
>>
>> I see the following message on the startup window when I run this command
>> C:\Tools\Solr\apache-solr-4.0-2012-05-04_08-23-31\example\exampledocs>C:\tools\curl\curl
>> http://localhost:8983/solr/update/json?commit=true --data-binary
>> @books
>> .json -H 'Content-type:application/json'
>> {
>>  "responseHeader":{
>>   "status":0,
>>   "QTime":47}}
>>
>> 2. I wrote my own JSON file where I added an extra "add" directive
>>
>> My JSON File
>> [
>>  {
>> "add":{
>> "id" : "MXN",
>> "cat" : ["currency"],
>> "name" : "One Peso",
>> "inStock" : true,
>> "price_c" : "1,MXN",
>> "manu" : "384",
>> "manu_id_s" : "Bank Mexico",
>> "features":"Coins and notes"
>>     }
>>   }
>> ]
>>
>> I still don't see the addition in the existing Currency Categories.
>>
>>
>> Please let me know if the UPDATEJSON works in 4.x or is this only for 3.6?
>>
>> Thanks
>> Rajesh
>>


Re: Boosting on field empty or not

2012-05-14 Thread Jack Krupansky

foo AND (field:*^2.0 OR (-field:*)^0.5)

So, if a doc has anything in the field, it gets boosted, and if the doc does 
not have anything in the field, de-boost it. Choose the boost factors to 
suit your desired boosting effect.
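As an editorial aside (not part of the original thread), the boost clause Jack describes can be assembled programmatically; the field name and boost factors below are placeholders to be replaced with your own:

```python
def boosted_query(user_query, field, populated_boost=2.0, empty_boost=0.5):
    """Build a query that boosts docs with `field` populated and
    de-boosts docs where it is empty (placeholder boost values)."""
    return (f"{user_query} AND ({field}:*^{populated_boost} "
            f"OR (-{field}:*)^{empty_boost})")

print(boosted_query("foo", "field"))
# foo AND (field:*^2.0 OR (-field:*)^0.5)
```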


-- Jack Krupansky

-Original Message- 
From: Donald Organ

Sent: Monday, May 14, 2012 2:38 PM
To: solr-user@lucene.apache.org
Subject: Re: Boosting on field empty or not

OK maybe i need to describe this a little more.

Basically I want documents that have a given field populated to have a
higher score than the documents that dont.  So if you search for foo I want
documents that contain foo, but i want the documents that have field a
populated to have a higher score...

Is there a way to do this?



On Mon, May 14, 2012 at 2:22 PM, Jack Krupansky 
wrote:



In a query or filter query you can write +field:* to require that a field
be populated or +(-field:*) to require that it not be populated

-- Jack Krupansky

-Original Message- From: Donald Organ
Sent: Monday, May 14, 2012 2:10 PM
To: solr-user
Subject: Boosting on field empty or not

Is there a way to boost a document based on whether the field is empty or
not.  I am looking to boost documents that have a specific field
populated.





Re: Problems with field names in solr functions

2012-05-14 Thread Yonik Seeley
In trunk, see:
* SOLR-2335: New 'field("...")' function syntax for refering to complex
  field names (containing whitespace or special characters) in functions.

The schema in trunk also specifies:
   

-Yonik
http://lucidimagination.com


On Thu, May 10, 2012 at 11:28 AM, Iker Huerga  wrote:
> Hi all,
>
> I am having problems when sorting solr documents using solr functions due
> to the field names.
>
>
> Imagine we want to sort the solr documents based on the sum of the scores
> of the matching fields. These field are created as follows
>
>
> 
>
>
> The idea is that these fields store float values as in this example: <field name="foo/bar-1234">50.45</field>
>
>
>
> The examples below illustrate the issue
>
>
> This query - http://URL/solr/select/?q=(foo/bar-1234:*)+AND+(foo/bar-2345:*)&version=2.2&start=0&rows=10&indent=on&sort=sum(foo/bar-1234,foo/bar-2345)+desc&wt=json
>
>
>
> it gives me the following exception
>
> The request sent by the client was syntactically incorrect (sort param
> could not be parsed as a query, and is not a field that exists in the
> index: sum(foo/bar-1234,foo/bar-2345)).
>
>
> Whereas if I rename the field removing the "/" and "-" the following query
> will work -
>
> http://URL/solr/select/?q=(bar1234:*)+AND+(bar2345:*)&version=2.2&start=0&rows=10&indent=on&sort=sum(bar1234,bar2345)+desc&wt=json
>
>
>
>  "response":{"numFound":2,"start":0,"docs":[
>      {
>        "primaryDescRes":"DescRes2",
>        "bar1234":45.54,
>        "bar2345":100.0},
>      {
>        "primaryDescRes":"DescRes1",
>        "bar1234":100.5,
>        "bar2345":25.22}]
>  }}
>
>
>
> I tried escaping the character as indicated in solr documentation [1], i.e.
> foo%2Fbar-12345 instead of foo/bar-12345, without success
>
>
>
> Could this be caused by the query parser?
>
>
> I would be extremely grateful if you could let me know any workaround for
> this
>
>
>
> Best
>
> Iker
>
>
>
> [1]
> http://wiki.apache.org/solr/SolrQuerySyntax#NOTE:_URL_Escaping_Special_Characters
>
> --
> Iker Huerga
> http://www.ikerhuerga.com/


Re: Update JSON not working for me

2012-05-14 Thread Rajesh Jain
Hi Jack

I am following the http://wiki.apache.org/solr/UpdateJSON tutorials.

The first example is of books.json, which  I executed, but I dont see any
books

http://localhost:8983/solr/collection1/browse?q=cat%3Dbooks

0 results found in 26 ms Page 0 of 0

I modified the books.json to add my own book, but still no result. The
money.xml works, so I converted the money.xml to money.json and added an
extra currency. I don't see the new currency.

My question is, how do I know if the UpdateJSON action was valid, if I
don't see them in the
http://localhost:8983/solr/collection1/browse?q=cat%3Dbooks

Is there a way to find what is happening - maybe through log files?

I am new to Solr, please help

Thanks
Rajesh




On Mon, May 14, 2012 at 2:33 PM, Jack Krupansky wrote:

> Check the examples of update/json here:
>
> http://wiki.apache.org/solr/UpdateJSON
>
> In your case, either leave out the "add" level or add a "doc" level below
> it.
>
> For example:
>
> curl http://localhost:8983/solr/update/json -H 'Content-type:application/json' -d '
> {
> "add": {"doc": {"id" : "TestDoc1", "title" : "test1"} },
> "add": {"doc": {"id" : "TestDoc2", "title" : "another test"} }
> }'
>
> -- Jack Krupansky
>
> -Original Message- From: Rajesh Jain
> Sent: Monday, May 14, 2012 1:27 PM
> To: solr-user@lucene.apache.org
> Cc: Rajesh Jain
> Subject: Update JSON not working for me
>
>
> Hi,
>
> I am using the 4.x version of Solr, and following the UpdateJSON Solr Wiki
>
> 1. When I try to update using :
>
> curl 'http://localhost:8983/solr/update/json?commit=true'
> --data-binary @books.json -H 'Content-type:application/json'
>
> I don't see any Category as Books in Velocity based Solr Browser the
> http://localhost:8983/solr/collection1/browse/?
>
> I see the following message on the startup window when I run this command
> C:\Tools\Solr\apache-solr-4.0-2012-05-04_08-23-31\example\exampledocs>C:\tools\curl\curl
> http://localhost:8983/solr/update/json?commit=true --data-binary
> @books
> .json -H 'Content-type:application/json'
> {
>  "responseHeader":{
>   "status":0,
>   "QTime":47}}
>
> 2. I wrote my own JSON file where I added an extra "add" directive
>
> My JSON File
> [
>  {
> "add":{
> "id" : "MXN",
> "cat" : ["currency"],
> "name" : "One Peso",
> "inStock" : true,
> "price_c" : "1,MXN",
> "manu" : "384",
> "manu_id_s" : "Bank Mexico",
> "features":"Coins and notes"
> }
>   }
> ]
>
> I still don't see the addition in the existing Currency Categories.
>
>
> Please let me know if the UPDATEJSON works in 4.x or is this only for 3.6?
>
> Thanks
> Rajesh
>


Re: Boosting on field empty or not

2012-05-14 Thread Donald Organ
OK maybe i need to describe this a little more.

Basically I want documents that have a given field populated to have a
higher score than the documents that dont.  So if you search for foo I want
documents that contain foo, but i want the documents that have field a
populated to have a higher score...

Is there a way to do this?



On Mon, May 14, 2012 at 2:22 PM, Jack Krupansky wrote:

> In a query or filter query you can write +field:* to require that a field
> be populated or +(-field:*) to require that it not be populated
>
> -- Jack Krupansky
>
> -Original Message- From: Donald Organ
> Sent: Monday, May 14, 2012 2:10 PM
> To: solr-user
> Subject: Boosting on field empty or not
>
> Is there a way to boost a document based on whether the field is empty or
> not.  I am looking to boost documents that have a specific field
> populated.
>


Re: Update JSON not working for me

2012-05-14 Thread Jack Krupansky

Check the examples of update/json here:

http://wiki.apache.org/solr/UpdateJSON

In your case, either leave out the "add" level or add a "doc" level below 
it.


For example:

curl http://localhost:8983/solr/update/json -H 
'Content-type:application/json' -d '

{
"add": {"doc": {"id" : "TestDoc1", "title" : "test1"} },
"add": {"doc": {"id" : "TestDoc2", "title" : "another test"} }
}'
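As an editorial aside (not from the thread), the request body above can be sanity-checked before sending. Note that the repeated "add" keys are fine for Solr's streaming update parser, which processes each command in order, but a strict JSON parser keeps only the last occurrence of a duplicated key:

```python
import json

body = '''
{
"add": {"doc": {"id" : "TestDoc1", "title" : "test1"} },
"add": {"doc": {"id" : "TestDoc2", "title" : "another test"} }
}'''

# A strict parser collapses the repeated "add" keys to the last one;
# Solr's own streaming parser handles each "add" command in turn.
parsed = json.loads(body)
print(parsed["add"]["doc"]["id"])  # TestDoc2 -- only the last key survives
```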

-- Jack Krupansky

-Original Message- 
From: Rajesh Jain

Sent: Monday, May 14, 2012 1:27 PM
To: solr-user@lucene.apache.org
Cc: Rajesh Jain
Subject: Update JSON not working for me

Hi,

I am using the 4.x version of Solr, and following the UpdateJSON Solr Wiki

1. When I try to update using :

curl 'http://localhost:8983/solr/update/json?commit=true'
--data-binary @books.json -H 'Content-type:application/json'

I don't see any Category as Books in Velocity based Solr Browser the
http://localhost:8983/solr/collection1/browse/?

I see the following message on the startup window when I run this command
C:\Tools\Solr\apache-solr-4.0-2012-05-04_08-23-31\example\exampledocs>C:\tools\curl\curl
http://localhost:8983/solr/update/json?commit=true --data-binary
@books
.json -H 'Content-type:application/json'
{
 "responseHeader":{
   "status":0,
   "QTime":47}}

2. I wrote my own JSON file where I added an extra "add" directive

My JSON File
[
 {
"add":{
"id" : "MXN",
"cat" : ["currency"],
"name" : "One Peso",
"inStock" : true,
"price_c" : "1,MXN",
"manu" : "384",
"manu_id_s" : "Bank Mexico",
"features":"Coins and notes"
 }
   }
]

I still don't see the addition in the existing Currency Categories.


Please let me know if the UPDATEJSON works in 4.x or is this only for 3.6?

Thanks
Rajesh 



Re: Boosting on field empty or not

2012-05-14 Thread Jack Krupansky
In a query or filter query you can write +field:* to require that a field be 
populated or +(-field:*) to require that it not be populated


-- Jack Krupansky

-Original Message- 
From: Donald Organ

Sent: Monday, May 14, 2012 2:10 PM
To: solr-user
Subject: Boosting on field empty or not

Is there a way to boost a document based on whether the field is empty or
not.  I am looking to boost documents that have a specific field populated. 



Boosting on field empty or not

2012-05-14 Thread Donald Organ
Is there a way to boost a document based on whether the field is empty or
not.  I am looking to boost documents that have a specific field populated.


Re: Kernel methods in SOLR

2012-05-14 Thread Dmitry Kan
Peyman,

Did you have a look at this?

https://issues.apache.org/jira/browse/LUCENE-2959

the pluggable ranking functions. Can be a good starting point for you.

Dmitry

On Mon, Apr 23, 2012 at 7:29 PM, Peyman Faratin wrote:

> Hi
>
> Has there been any work that tries to integrate Kernel methods [1] with
> SOLR? I am interested in using kernel methods to solve synonym, hyponym and
> polysemous (disambiguation) problems which SOLR's Vector space model ("bag
> of words") does not capture.
>
> For example, imagine we have only 3 words in our corpus, "puma", "cougar"
> and "feline". The 3 words have obviously interdependencies (puma
> disambiguates to cougar, cougar and puma are instances of felines -
> hyponyms). Now, imagine 2 docs, d1 and d2, that have the following TF-IDF
> vectors.
>
> puma, cougar, feline
> d1   =   [  2,0, 0]
> d2   =   [  0,1, 0]
>
> i.e. d1 has no mention of term cougar or feline and conversely, d2 has no
> mention of terms puma or feline. Hence under the vector approach d1 and d2
> are not related at all (and each interpretation of the terms have a unique
> vector). Which is not what we want to conclude.
>
> What I need is to include a kernel matrix (as data) such as the following
> that captures these relationships:
>
>   puma, cougar, feline
> puma=   [  1,1, 0.4]
> cougar  =   [  1,1, 0.4]
> feline  =   [  0.4, 0.4, 1]
>
> then recompute the TF-IDF vector as a product of (1) the original vector
> and (2) the kernel matrix, resulting in
>
> puma, cougar, feline
> d1   =   [  2,2, 0.8]
> d2   =   [  1,1, 0.4]
>
> (note, the new vectors are much less sparse).
>
> I can solve this problem (inefficiently) at the application layer but I
> was wondering if there has been any attempts within the community to solve
> similar problems, efficiently without paying a hefty response time price?
>
> thank you
>
> Peyman
>
> [1] http://en.wikipedia.org/wiki/Kernel_methods




-- 
Regards,

Dmitry Kan
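As an editorial illustration of the kernel recomputation Peyman describes (a pure-Python sketch, not part of the thread), multiplying each TF-IDF row vector by the term-similarity kernel matrix reproduces the numbers from his example:

```python
# Term order: puma, cougar, feline
K = [
    [1.0, 1.0, 0.4],  # puma
    [1.0, 1.0, 0.4],  # cougar
    [0.4, 0.4, 1.0],  # feline
]

def apply_kernel(vec, kernel):
    """Multiply a TF-IDF row vector by the term-similarity kernel matrix."""
    n = len(kernel)
    return [sum(vec[i] * kernel[i][j] for i in range(n)) for j in range(n)]

d1 = [2, 0, 0]
d2 = [0, 1, 0]
print(apply_kernel(d1, K))  # [2.0, 2.0, 0.8]
print(apply_kernel(d2, K))  # [1.0, 1.0, 0.4]
```

The transformed vectors are much less sparse, so d1 and d2 now overlap on every term even though the raw vectors shared none.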


Re: not getting expected results when doing a delta import via full import

2012-05-14 Thread geeky2
update on this:

i also tried manipulating the timestamps in the dataimport.properties file
to advance the date so that no records could be older than last_index_time

example:

#Mon May 14 12:42:49 CDT 2012
core1-model.last_index_time=2012-05-15 14\:38\:55
last_index_time=2012-05-15 14\:38\:55
~

this leads me to believe that date comparisons are not being done correctly
or have not been configured correctly.

so what does something need to be configured for the date comparison to
work?

example from wiki:

OR last_modified > '${dataimporter.last_index_time}'">



--
View this message in context: 
http://lucene.472066.n3.nabble.com/not-getting-expected-results-when-doing-a-delta-import-via-full-import-tp3983711p3983715.html
Sent from the Solr - User mailing list archive at Nabble.com.


relative path for xsl:include failing in Solr 3.6

2012-05-14 Thread pramila_tha...@ontla.ola.org
Hi Everyone,

I am sure some one might have encountered this problem.

using xsl:include works when the file is in conf\xslt directory

But fails if the file is at different location.

The same thing worked for solr 1.4.

Can someone share their experience, if they have encounteres this please.

the error is basically 

getTransformer fails in getContentType java.lang.RuntimeException:
getTransformer fails in getContentType at
org.apache.solr.response.XSLTResponseWriter.getContentType(XSLTResponseWriter.java:72)
at
org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:338)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:273)

--Thanks

--
View this message in context: 
http://lucene.472066.n3.nabble.com/relative-path-for-xsl-include-failing-in-Solr-3-6-tp3983714.html
Sent from the Solr - User mailing list archive at Nabble.com.


not getting expected results when doing a delta import via full import

2012-05-14 Thread geeky2
hello all,


i am not getting the expected results when trying to set up delta imports
according to the wiki documentation here:

http://wiki.apache.org/solr/DataImportHandlerDeltaQueryViaFullImport?highlight=%28delta%29|%28import%29



i have the following set up in my DIH,

query="select [complicated sql goes here] and
('${dataimporter.request.clean}' != 'false' OR some_table.upd_by_ts  >
'${dataimporter.last_index_time}')"> 
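As an editorial aside, the combined clean/delta predicate in the query above reduces to the following logic (a sketch with hypothetical timestamp values, not part of the thread):

```python
def should_import(clean, upd_by_ts, last_index_time):
    """Mirror the DIH where-clause: a full import (clean != 'false')
    takes every row; a delta (clean == 'false') takes only rows whose
    upd_by_ts is newer than the last index time.  ISO-8601 timestamp
    strings compare correctly as plain strings."""
    return clean != "false" or upd_by_ts > last_index_time

print(should_import("false", "2012-05-14T12:00:00", "2012-05-01T00:00:00"))  # True
```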

i have the following set up in the shell script to invoke my import process
(either a full w/clean or delta)

# change clean=true for full, clean=false for delta

SERVER="http://some_server:port/some_core/dataimport -F command=full-import
-F clean=false"

curl $SERVER


when i do a full import (clean=true) i see all of the documents (via the
stats page) show up in the core.

when i do a delta import (clean=false) i see ~900 fewer records in the
import, but i should see much fewer (~84,000) records less, based on the
fact that i am updating the upd_by_ts field to the current timestamp on
84,000 records!

can someone tell me what i am missing?

thank you,




--
View this message in context: 
http://lucene.472066.n3.nabble.com/not-getting-expected-results-when-doing-a-delta-import-via-full-import-tp3983711.html
Sent from the Solr - User mailing list archive at Nabble.com.


Update JSON not working for me

2012-05-14 Thread Rajesh Jain
Hi,

I am using the 4.x version of Solr, and following the UpdateJSON Solr Wiki

1. When I try to update using :

curl 'http://localhost:8983/solr/update/json?commit=true'
--data-binary @books.json -H 'Content-type:application/json'

I don't see any Category as Books in Velocity based Solr Browser the
http://localhost:8983/solr/collection1/browse/?

I see the following message on the startup window when I run this command
C:\Tools\Solr\apache-solr-4.0-2012-05-04_08-23-31\example\exampledocs>C:\tools\curl\curl
http://localhost:8983/solr/update/json?commit=true --data-binary
@books
.json -H 'Content-type:application/json'
{
  "responseHeader":{
"status":0,
"QTime":47}}

2. I wrote my own JSON file where I added an extra "add" directive

My JSON File
[
  {
"add":{
"id" : "MXN",
"cat" : ["currency"],
"name" : "One Peso",
"inStock" : true,
"price_c" : "1,MXN",
"manu" : "384",
"manu_id_s" : "Bank Mexico",
"features":"Coins and notes"
  }
}
]

I still don't see the addition in the existing Currency Categories.


Please let me know if the UPDATEJSON works in 4.x or is this only for 3.6?

Thanks
Rajesh


Re: Solr Import Handler Custom Transformer not working

2012-05-14 Thread dboychuck
Thank you for your input. With your help I was able to solve my problem.
Although I could find no good example of how to handle multivalued fields
with a custom transformer online your comments helped me to find a solution.

Here is the code that handles both multi-valued and single valued fields.


package org.build.com.solr;

/**
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

import org.apache.solr.handler.dataimport.Context;
import org.apache.solr.handler.dataimport.Transformer;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class FacetsTransformer extends Transformer {
  
  public Object transformRow(Map<String, Object> row, Context context) {
    Object tf = row.get("facets");
    if (tf != null) {
      if (tf instanceof List) {
        List<?> list = (List<?>) tf;
        String tempKey = "";
        for (Object o : list) {
          // facet strings have the form name=suffix=value
          String[] arr = ((String) o).split("=");
          if (arr.length == 3) {
            tempKey = arr[0].replaceAll("[^A-Za-z0-9]", "") + "_" + arr[1];
            if (row.containsKey(tempKey)) {
              @SuppressWarnings("unchecked")
              List<String> tempArrayList = (List<String>) row.get(tempKey);
              tempArrayList.add(arr[2]);
              row.put(tempKey, tempArrayList);
            } else {
              List<String> tempArrayList = new ArrayList<String>();
              tempArrayList.add(arr[2]);
              row.put(tempKey, tempArrayList);
            }
          }
        }
      } else {
        String[] arr = ((String) tf).split("=");
        if (arr.length == 3) {
          row.put(arr[0].replaceAll("[^A-Za-z0-9]", "") + "_" + arr[1], arr[2]);
        }
      }
      row.remove("facets");
    }
    return row;
  }
}


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Import-Handler-Custom-Transformer-not-working-tp3978746p3983704.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr Shards multi core slower then single big core

2012-05-14 Thread arjit
Robert can you tell what you mean when you say "We do a lot of faceting so
maybe that is why since facets can be built in parallel on different
threads/cores". I am novice in solr. Can you tell me where Can i read about
it ?
Thanks ,
Arjit



On Mon, May 14, 2012 at 8:54 PM, Robert Stewart [via Lucene] <
ml-node+s472066n3983692...@n3.nabble.com> wrote:

> We used to have one large index - then moved to 10 shards (7 million docs
> each) - parallel search across all shards, and we get better performance
> that way.  We use a 40 core box with 128GB ram.  We do a lot of faceting so
> maybe that is why since facets can be built in parallel on different
> threads/cores.  We also have indexes on fast local disks (6 15K RPM disks
> using raid stripes).
>
>
> On May 14, 2012, at 10:42 AM, Michael Della Bitta wrote:
>
> > Hi, all,
> >
> > I've been running into murmurs about this idea elsewhere:
> >
> >
> http://stackoverflow.com/questions/8698762/run-multiple-big-solr-shard-instances-on-one-physical-machine
> >
> >
> http://java.dzone.com/articles/optimizing-solr-or-how-7x-your?mz=33057-solr_lucene
> >
> > Michael
> >
> > On Mon, May 14, 2012 at 10:29 AM, Otis Gospodnetic
> > <[hidden email] >
> wrote:
> >> Hi Kuli,
> >>
> >> As long as there are enough CPUs with spare cycles and disk IO is not a
> bottleneck, this works faster.  This was 12+ months ago.
> >>
> >> Otis
> >> 
> >> Performance Monitoring for Solr / ElasticSearch / HBase -
> http://sematext.com/spm
> >>
> >>
> >>
> >>> 
> >>> From: Michael Kuhlmann <[hidden 
> >>> email]>
>
> >>> To: [hidden email]
> >>> Sent: Monday, May 14, 2012 10:21 AM
> >>> Subject: Re: Solr Shards multi core slower then single big core
> >>>
> >>> Am 14.05.2012 16:18, schrieb Otis Gospodnetic:
>  Hi Kuli,
> 
>  In a client engagement, I did see this (N shards on 1 beefy box with
> lots of RAM and CPU cores) be faster than 1 big index.
> 
> >>>
> >>> I want to believe you, but I also want to understand. Can you explain
> >>> why? And did this only happen for single requests, or even under heavy
> load?
> >>>
> >>> Greetings,
> >>> Kuli
> >>>
> >>>
> >>>
>
>
>
> --
>


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Shards-multi-core-slower-then-single-big-core-tp3979115p3983697.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Getting payloads for matching term in search result

2012-05-14 Thread Jack Krupansky
The "stored" value of your payload field does in fact have the original 
payload value, albeit formatted as you have shown. Is that not sufficient?


There doesn't appear to be any Solr support for returning term payload 
directly.


I see a Jira issue for adding query support, but I don’t see a Jira for 
returning the payload(s) for matched terms.


-- Jack Krupansky

-Original Message- 
From: s.herm...@uni-jena.de

Sent: Monday, May 14, 2012 9:13 AM
To: solr-user@lucene.apache.org
Subject: Getting payloads for matching term in search result

Good day

currently I have a field defined as can be seen below:

<fieldtype name="..." class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.DelimitedPayloadTokenFilterFactory" delimiter="|" encoder="identity" />
  </analyzer>
</fieldtype>


Basically the content for that field has the following form:

  "Wiedersehn|x1062y1755 macht|x1340y1758 Freude|x1502y1758"

where the stuff after the pipe is the payload data (some coordinates). What 
I want is to get that payload data at query time.
E.g. I search for "macht" and in the result document from solr there will be 
the payload data "x1340y1758".


Is there a way out of the box with solr. I have done this in plain lucene 
once with the TermPositions, so I know it might be possible to adopt this to 
solr.


Silvio 
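As Jack notes in his reply, the stored value of the field still carries the payloads in their "term|payload" form, so the coordinates can be recovered client-side from the stored value (an editorial sketch, not an out-of-the-box Solr feature):

```python
def payload_for(stored_value, term):
    """Extract the payload for a matching term from the stored
    'term|payload' form of the field."""
    for token in stored_value.split():
        word, _, payload = token.partition("|")
        if word == term:
            return payload
    return None

doc = "Wiedersehn|x1062y1755 macht|x1340y1758 Freude|x1502y1758"
print(payload_for(doc, "macht"))  # x1340y1758
```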



Re: Solr Shards multi core slower then single big core

2012-05-14 Thread Robert Stewart
We used to have one large index - then moved to 10 shards (7 million docs each) 
- parallel search across all shards, and we get better performance that way.  
We use a 40 core box with 128GB ram.  We do a lot of faceting so maybe that is 
why since facets can be built in parallel on different threads/cores.  We also 
have indexes on fast local disks (6 15K RPM disks using raid stripes).


On May 14, 2012, at 10:42 AM, Michael Della Bitta wrote:

> Hi, all,
> 
> I've been running into murmurs about this idea elsewhere:
> 
> http://stackoverflow.com/questions/8698762/run-multiple-big-solr-shard-instances-on-one-physical-machine
> 
> http://java.dzone.com/articles/optimizing-solr-or-how-7x-your?mz=33057-solr_lucene
> 
> Michael
> 
> On Mon, May 14, 2012 at 10:29 AM, Otis Gospodnetic
>  wrote:
>> Hi Kuli,
>> 
>> As long as there are enough CPUs with spare cycles and disk IO is not a 
>> bottleneck, this works faster.  This was 12+ months ago.
>> 
>> Otis
>> 
>> Performance Monitoring for Solr / ElasticSearch / HBase - 
>> http://sematext.com/spm
>> 
>> 
>> 
>>> 
>>> From: Michael Kuhlmann 
>>> To: solr-user@lucene.apache.org
>>> Sent: Monday, May 14, 2012 10:21 AM
>>> Subject: Re: Solr Shards multi core slower then single big core
>>> 
>>> Am 14.05.2012 16:18, schrieb Otis Gospodnetic:
 Hi Kuli,
 
 In a client engagement, I did see this (N shards on 1 beefy box with lots 
 of RAM and CPU cores) be faster than 1 big index.
 
>>> 
>>> I want to believe you, but I also want to understand. Can you explain
>>> why? And did this only happen for single requests, or even under heavy load?
>>> 
>>> Greetings,
>>> Kuli
>>> 
>>> 
>>> 



Re: Date format in the schema.xml

2012-05-14 Thread Jack Krupansky
At least in this case where dates have a precision of day, the total number 
of unique values should be relatively low (3,650 for a 10-year period or 
even 18,250 for a 50-year period), so precision step probably won't matter 
in this case much at all. The big benefit with tdate over old date here is 
the fact that the string is stored as a long integer, which speeds 
comparisons.


A precisionStep of zero ("0") means use the maximum value 
(Integer.MAX_VALUE). The default (in trunk; not sure about 3.6 or earlier) 
is "8", not 4 as the Javadoc indicates. Zero or high step values means more 
precision, less index space consumed, but slower searching. Lower step 
values, such as 8 or 4 or even 1, mean less precision, more index space 
consumed, but faster searching.


I'm still struggling to figure out how to map values of precisionStep to 
"precision" of the input data. For example, seconds, minutes, hours, days, 
years for date values.


I'm also not sure what the implications, if any, are for faceting and the 
FieldCache for Trie precision step.


Here's more detail:
http://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/search/NumericRangeQuery.html

-- Jack Krupansky

-Original Message- 
From: Ahmet Arslan

Sent: Monday, May 14, 2012 10:16 AM
To: solr-user@lucene.apache.org
Subject: Re: Date format in the schema.xml


is it mandatory to use the date format yyyy-mm-ddThh:mm:ssZ
?


Yes.


I have a date with this format:
yyyymmdd
 in my xml source file.

Where can I find more information, I found only these
definitions in the schema.xml


In schema.xml there is a xml comment about dates, starting with

RE: Relicating a large solr index

2012-05-14 Thread Rohit
Hi Erick,

Yes I have enabled the following setting,

internal
   
   5000
   1

Will try with higher timeouts. I tried scp command and the link didn’t break
once, I was able to copy the entire 300Gb files, so am not too sure if this
is a network problem.

Regards,
Rohit
Mobile: +91-9901768202
About Me: http://about.me/rohitg


-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: 14 May 2012 20:22
To: solr-user@lucene.apache.org
Subject: Re: Relicating a large solr index

Have you tried modifying the timeout parameters? See:
http://wiki.apache.org/solr/SolrReplication,
the "Slave" section..

Best
Erick

On Mon, May 14, 2012 at 10:30 AM, Rohit  wrote:
> The size of index is about 300GB, I am seeing the following error in 
> the logs,
>
> java.net.SocketTimeoutException: Read timed out
>        at java.net.SocketInputStream.socketRead0(Native Method)
>        at java.net.SocketInputStream.read(SocketInputStream.java:129)
>        at 
> java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>        at 
> java.io.BufferedInputStream.read(BufferedInputStream.java:237)
>        at
> org.apache.commons.httpclient.ChunkedInputStream.getChunkSizeFromInput
> Stream
> (ChunkedInputStream.java:250)
>        at
> org.apache.commons.httpclient.ChunkedInputStream.nextChunk(ChunkedInputStream.java:221)
>        at org.apache.commons.httpclient.ChunkedInputStream.read(ChunkedInputStream.java:176)
>        at java.io.FilterInputStream.read(FilterInputStream.java:116)
>        at org.apache.commons.httpclient.AutoCloseInputStream.read(AutoCloseInputStream.java:108)
>        at org.apache.solr.common.util.FastInputStream.refill(FastInputStream.java:68)
>        at org.apache.solr.common.util.FastInputStream.read(FastInputStream.java:97)
>        at org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:122)
>        at org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:117)
>        at org.apache.solr.handler.SnapPuller$FileFetcher.fetchPackets(SnapPuller.java:943)
>        at org.apache.solr.handler.SnapPuller$FileFetcher.fetchFile(SnapPuller.java:904)
>        at org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:545)
>        at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:295)
>        at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:268)
>        at org.apache.solr.handler.ReplicationHandler$1.run(ReplicationHandler.java:149)
> May 14, 2012 1:45:46 PM org.apache.solr.handler.ReplicationHandler doFetch
> SEVERE: SnapPull failed
> org.apache.solr.common.SolrException: Unable to download _vvyv.fdt completely. Downloaded 200278016!=208644265
>        at org.apache.solr.handler.SnapPuller$FileFetcher.cleanup(SnapPuller.java:1038)
>        at org.apache.solr.handler.SnapPuller$FileFetcher.fetchFile(SnapPuller.java:918)
>        at org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:545)
>        at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:295)
>        at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:268)
>        at org.apache.solr.handler.ReplicationHandler$1.run(ReplicationHandler.java:149)
>
>
> Actually the replication starts, but is never able to complete and then restarts again.
>
> Regards,
> Rohit
> Mobile: +91-9901768202
> About Me: http://about.me/rohitg
>
>
> -Original Message-
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: 14 May 2012 18:00
> To: solr-user@lucene.apache.org
> Subject: Re: Replicating a large solr index
>
> What do your logs show? Solr replication should be robust.
> How large is "large"?
>
> You might review:
> http://wiki.apache.org/solr/UsingMailingLists
>
> Best
> Erick
>
> On Mon, May 14, 2012 at 3:11 AM, Rohit  wrote:
>> Hi,
>>
>>
>>
>> I have a large solr index which needs to be replicated, solr 
>> replication start but then keeps breaking and starting from 0. Is 
>> there another way to achieve this,          I was thinking of using 
>> scp to copy the index from master to slave and then enable 
>> replication,
> will this work?
>>
>>
>>
>>
>> Regards,
>>
>> Rohit
>>
>>
>>
>
>




Re: Date format in the schema.xml

2012-05-14 Thread Bruno Mannina

Ok Thanks !

Le 14/05/2012 16:16, Ahmet Arslan a écrit :

is it mandatory to use the date format yyyy-mm-ddThh:mm:ssZ?

Yes.


I have a date with this format:
yyyymmdd
  in my xml source file.

Where can I find more information? I found only these
definitions in the schema.xml:

In schema.xml there is an XML comment about dates, starting with

Re: Relicating a large solr index

2012-05-14 Thread Erick Erickson
Have you tried modifying the timeout parameters? See:
http://wiki.apache.org/solr/SolrReplication,
the "Slave" section..

Best
Erick

On Mon, May 14, 2012 at 10:30 AM, Rohit  wrote:
> The size of index is about 300GB, I am seeing the following error in the
> logs,
>
> java.net.SocketTimeoutException: Read timed out
>        at java.net.SocketInputStream.socketRead0(Native Method)
>        at java.net.SocketInputStream.read(SocketInputStream.java:129)
>        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>        at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
>        at org.apache.commons.httpclient.ChunkedInputStream.getChunkSizeFromInputStream(ChunkedInputStream.java:250)
>        at org.apache.commons.httpclient.ChunkedInputStream.nextChunk(ChunkedInputStream.java:221)
>        at org.apache.commons.httpclient.ChunkedInputStream.read(ChunkedInputStream.java:176)
>        at java.io.FilterInputStream.read(FilterInputStream.java:116)
>        at org.apache.commons.httpclient.AutoCloseInputStream.read(AutoCloseInputStream.java:108)
>        at org.apache.solr.common.util.FastInputStream.refill(FastInputStream.java:68)
>        at org.apache.solr.common.util.FastInputStream.read(FastInputStream.java:97)
>        at org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:122)
>        at org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:117)
>        at org.apache.solr.handler.SnapPuller$FileFetcher.fetchPackets(SnapPuller.java:943)
>        at org.apache.solr.handler.SnapPuller$FileFetcher.fetchFile(SnapPuller.java:904)
>        at org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:545)
>        at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:295)
>        at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:268)
>        at org.apache.solr.handler.ReplicationHandler$1.run(ReplicationHandler.java:149)
> May 14, 2012 1:45:46 PM org.apache.solr.handler.ReplicationHandler doFetch
> SEVERE: SnapPull failed
> org.apache.solr.common.SolrException: Unable to download _vvyv.fdt completely. Downloaded 200278016!=208644265
>        at org.apache.solr.handler.SnapPuller$FileFetcher.cleanup(SnapPuller.java:1038)
>        at org.apache.solr.handler.SnapPuller$FileFetcher.fetchFile(SnapPuller.java:918)
>        at org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:545)
>        at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:295)
>        at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:268)
>        at org.apache.solr.handler.ReplicationHandler$1.run(ReplicationHandler.java:149)
>
>
> Actually the replication starts, but is never able to complete and then restarts again.
>
> Regards,
> Rohit
> Mobile: +91-9901768202
> About Me: http://about.me/rohitg
>
>
> -Original Message-
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: 14 May 2012 18:00
> To: solr-user@lucene.apache.org
> Subject: Re: Replicating a large solr index
>
> What do your logs show? Solr replication should be robust.
> How large is "large"?
>
> You might review:
> http://wiki.apache.org/solr/UsingMailingLists
>
> Best
> Erick
>
> On Mon, May 14, 2012 at 3:11 AM, Rohit  wrote:
>> Hi,
>>
>>
>>
>> I have a large solr index which needs to be replicated, solr
>> replication start but then keeps breaking and starting from 0. Is
>> there another way to achieve this,          I was thinking of using
>> scp to copy the index from master to slave and then enable replication,
> will this work?
>>
>>
>>
>>
>> Regards,
>>
>> Rohit
>>
>>
>>
>
>


Re: Documents With large number of fields

2012-05-14 Thread Jack Krupansky
Indexing should be fine - depending on your total document count. I think the 
potential issue is the FieldCache at query time. It should grow roughly 
linearly with the number of documents, fields, and unique terms per field for 
string values. So do two tests: index 1,000 docs and then 2,000 docs. Check 
Java memory usage after a simple query, then after a query with a significant 
number of these faceted fields, and then after a couple more queries with a 
high number of distinct faceted fields. Multiply those memory-use increments 
up to your expected range of documents; that should give you a semi-decent 
estimate of the memory the JVM will need. Estimating the CPU requirement is 
more complex, but memory has to work out first. And the delta in index size 
between 1,000 and 2,000 docs should give you a number to scale up to total 
index size, roughly, though it depends on the relative uniqueness of field 
values.
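In code, that two-point extrapolation is just a line fit; a minimal sketch 
(the readings below are hypothetical, in MB):

```python
def extrapolate_memory(mem_at_1k, mem_at_2k, target_docs):
    """Scale two FieldCache memory readings (taken at 1,000 and 2,000
    indexed docs) up to target_docs, assuming linear growth."""
    per_doc = (mem_at_2k - mem_at_1k) / 1000.0  # MB added per extra document
    fixed = mem_at_1k - per_doc * 1000.0        # baseline JVM overhead
    return fixed + per_doc * target_docs

# hypothetical readings: 120 MB after faceted queries at 1k docs, 150 MB at 2k
print(extrapolate_memory(120.0, 150.0, 1_000_000))  # 30090.0 MB, i.e. ~30 GB
```

If the projected number is far beyond one JVM heap, that argues for sharding 
before the CPU question even comes up.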


-- Jack Krupansky

-Original Message- 
From: Keswani, Nitin - BLS CTR

Sent: Monday, May 14, 2012 10:27 AM
To: solr-user@lucene.apache.org
Subject: RE: Documents With large number of fields

Unfortunately I never got any response. However I did a POC with a Document 
containing 400 fields and loaded around 1000 docs to my local machine. I 
didn’t see any issue but then again the document set was very small. 
Hopefully as mentioned below providing enough memory should help alleviate 
any performance issues.


Thanks.

Regards,

Nitin Keswani


-Original Message-
From: Jack Krupansky [mailto:j...@basetechnology.com]
Sent: Sunday, May 13, 2012 10:42 PM
To: solr-user@lucene.apache.org
Subject: Re: Documents With large number of fields

I didn't see any response. There was a similar issue recently, where someone 
had 400 faceted fields with 50-70 facets per query and they were running out 
of memory due to accumulation of the FieldCache for these faceted fields, 
but that was on a 3 GB system.


It probably could be done, assuming a fair number of 64-bit sharded 
machines.


-- Jack Krupansky

-Original Message-
From: Darren Govoni
Sent: Sunday, May 13, 2012 7:56 PM
To: solr-user@lucene.apache.org
Subject: Re: Documents With large number of fields

Was there a response to this?

On Fri, 2012-05-04 at 10:27 -0400, Keswani, Nitin - BLS CTR wrote:

Hi,

My data model consist of different types of data. Each data type has
its own characteristics

If I include the unique characteristics of each type of data, my
single Solr Document could end up containing 300-400 fields.

In order to drill down to this data set I would have to provide
faceting on most of these fields so that I can drilldown to very small
set of Documents.

Here are some of the questions :

1) What's the best approach when dealing with documents with large
number of fields .
Should I keep a single document with large number of fields or
split my
document into a number of smaller  documents where each document
would consist of some fields

2) From an operational point of view, what's the drawback of having a
single document with a very large number of fields.
Can Solr support documents with large number of fields (say 300 to
400).


Thanks.

Regards,

Nitin Keswani





Re: Solr Shards multi core slower then single big core

2012-05-14 Thread Michael Della Bitta
Hi, all,

I've been running into murmurs about this idea elsewhere:

http://stackoverflow.com/questions/8698762/run-multiple-big-solr-shard-instances-on-one-physical-machine

http://java.dzone.com/articles/optimizing-solr-or-how-7x-your?mz=33057-solr_lucene

Michael

On Mon, May 14, 2012 at 10:29 AM, Otis Gospodnetic
 wrote:
> Hi Kuli,
>
> As long as there are enough CPUs with spare cycles and disk IO is not a 
> bottleneck, this works faster.  This was 12+ months ago.
>
> Otis
> 
> Performance Monitoring for Solr / ElasticSearch / HBase - 
> http://sematext.com/spm
>
>
>
>>
>> From: Michael Kuhlmann 
>>To: solr-user@lucene.apache.org
>>Sent: Monday, May 14, 2012 10:21 AM
>>Subject: Re: Solr Shards multi core slower then single big core
>>
>>Am 14.05.2012 16:18, schrieb Otis Gospodnetic:
>>> Hi Kuli,
>>>
>>> In a client engagement, I did see this (N shards on 1 beefy box with lots 
>>> of RAM and CPU cores) be faster than 1 big index.
>>>
>>
>>I want to believe you, but I also want to understand. Can you explain
>>why? And did this only happen for single requests, or even under heavy load?
>>
>>Greetings,
>>Kuli
>>
>>
>>


Re: slave index not cleaned

2012-05-14 Thread Bill Bell
This is a known issue in 1.4, especially on Windows. Some of it was resolved in
3.x.

Bill Bell
Sent from mobile


On May 14, 2012, at 5:54 AM, Erick Erickson  wrote:

> Hmmm, replication will require up to twice the space of the
> index _temporarily_, just checking if that's what you're seeing
> But that should go away reasonably soon. Out of curiosity, what
> happens if you restart your server, do the extra files go away?
> 
> But it sounds like your index is growing over a longer period of time
> than just a single replication, is that true?
> 
> Best
> Erick
> 
> On Fri, May 11, 2012 at 6:03 AM, Jasper Floor  wrote:
>> Hi,
>> 
>> On Thu, May 10, 2012 at 5:59 PM, Otis Gospodnetic
>>  wrote:
>>> Hi Jasper,
>> 
>> Sorry, I should've added more technical info without being prompted.
>> 
>>> Solr does handle that for you.  Some more stuff to share:
>>> 
>>> * Solr version?
>> 
>> 1.4
>> 
>>> * JVM version?
>> 1.7 update 2
>> 
>>> * OS?
>> Debian (2.6.32-5-xen-amd64)
>> 
>>> * Java replication?
>> yes
>> 
>>> * Errors in Solr logs?
>> no
>> 
>>> * deletion policy section in solrconfig.xml?
>> missing I would say, but I don't see this on the replication wiki page.
>> 
>> This is what we have configured for replication:
>> 
>> <requestHandler name="/replication" class="solr.ReplicationHandler">
>>   <lst name="slave">
>>     <str name="masterUrl">${solr.master.url}/df-stream-store/replication</str>
>>     <str name="pollInterval">00:20:00</str>
>>     <str name="compression">internal</str>
>>     <str name="httpConnTimeout">5000</str>
>>     <str name="httpReadTimeout">1</str>
>>   </lst>
>> </requestHandler>
>> 
>> We will be updating to 3.6 fairly soon however. To be honest, from
>> what I've read, the Solr cloud is what we really want in the future
>> but we will have to be patient for that.
>> 
>> thanks in advance
>> 
>> mvg,
>> Jasper
>> 
>>> You may also want to look at your Index report in SPM 
>>> (http://sematext.com/spm) before/during/after replication and share what 
>>> you see.
>>> 
>>> Otis
>>> 
>>> Performance Monitoring for Solr / ElasticSearch / HBase - 
>>> http://sematext.com/spm
>>> 
>>> 
>>> 
>>> - Original Message -
 From: Jasper Floor 
 To: solr-user@lucene.apache.org
 Cc:
 Sent: Thursday, May 10, 2012 9:08 AM
 Subject: slave index not cleaned
 
 Perhaps I am missing the obvious but our slaves tend to run out of
 disk space. The index sizes grow to multiple times the size of the
 master. So I just toss all the data and trigger a replication.
 However, can't solr handle this for me?
 
 I'm sorry if I've missed a simple setting which does this for me, but
 if its there then I have missed it.
 
 mvg
 Jasper
 


Re: Kernel methods in SOLR

2012-05-14 Thread Otis Gospodnetic
Hi Peyman,

I never saw this mentioned on Lucene/Solr MLs, so if anyone has done any work 
on this, I don't think it was shared.

Otis 

Performance Monitoring for Solr / ElasticSearch / HBase - 
http://sematext.com/spm 



>
> From: Peyman Faratin 
>To: solr-user@lucene.apache.org 
>Sent: Monday, April 23, 2012 12:29 PM
>Subject: Kernel methods in SOLR
> 
>Hi
>
>Has there been any work that tries to integrate Kernel methods [1] with SOLR? 
>I am interested in using kernel methods to solve synonym, hyponym and 
>polysemous (disambiguation) problems which SOLR's Vector space model ("bag of 
>words") does not capture. 
>
>For example, imagine we have only 3 words in our corpus, "puma", "cougar" and 
>"feline". The 3 words have obviously interdependencies (puma disambiguates to 
>cougar, cougar and puma are instances of felines - hyponyms). Now, imagine 2 
>docs, d1 and d2, that have the following TF-IDF vectors. 
>
>                 puma, cougar, feline
>d1       =   [  2,        0,         0]
>d2       =   [  0,        1,         0]
>
>i.e. d1 has no mention of term cougar or feline and conversely, d2 has no 
>mention of terms puma or feline. Hence under the vector approach d1 and d2 are 
>not related at all (and each interpretation of the terms have a unique 
>vector). Which is not what we want to conclude. 
>
>What I need is to include a kernel matrix (as data) such as the following that 
>captures these relationships:
>
>                       puma, cougar, feline
>puma    =   [  1,        1,         0.4]
>cougar    =   [  1,        1,         0.4]
>feline    =   [  0.4,     0.4,         1]
>
>then recompute the TF-IDF vector as a product of (1) the original vector and 
>(2) the kernel matrix, resulting in
>
>                 puma, cougar, feline
>d1       =   [  2,        2,         0.8]
>d2       =   [  1,        1,         0.4]
>
>(note, the new vectors are much less sparse). 
>
>I can solve this problem (inefficiently) at the application layer but I was 
>wondering if there has been any attempts within the community to solve similar 
>problems, efficiently without paying a hefty response time price?
>
>thank you 
>
>Peyman
>
>[1] http://en.wikipedia.org/wiki/Kernel_methods
>
>
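The recomputation described above is a plain vector-matrix product, so the 
example numbers are easy to check with a few lines (pure Python, no Solr 
involved):

```python
def apply_kernel(doc_vec, kernel):
    """Multiply a TF-IDF row vector by a symmetric term-term kernel matrix."""
    n = len(kernel)
    return [sum(doc_vec[i] * kernel[i][j] for i in range(n)) for j in range(n)]

#     puma cougar feline
K = [[1.0, 1.0, 0.4],   # puma
     [1.0, 1.0, 0.4],   # cougar
     [0.4, 0.4, 1.0]]   # feline

d1 = [2, 0, 0]  # mentions only "puma"
d2 = [0, 1, 0]  # mentions only "cougar"
print(apply_kernel(d1, K))  # [2.0, 2.0, 0.8]
print(apply_kernel(d2, K))  # [1.0, 1.0, 0.4]
```

The transformed vectors are no longer orthogonal, which is exactly the 
relatedness the plain bag-of-words model misses; doing this efficiently at 
scale is the open question in the post.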

Re: Documents With large number of fields

2012-05-14 Thread Otis Gospodnetic
Nitin,

I meant to reply, but

I think the thing to watch out for are Lucene segment merges.  I think this is 
another thing I saw in a client engagement where the client had a crazy number 
of fields.  If I recall correctly, it was segment merges that were painfully 
slow.

So try creating a non-trivial index where you have some big Lucene index 
segments and see how that goes.  This was a while back before various Lucene 
improvements around indexing (different merge policies, non-blocking flushing 
to disk, etc.) were implemented, so things may be different now.

Otis 

Performance Monitoring for Solr / ElasticSearch / HBase - 
http://sematext.com/spm 



>
> From: "Keswani, Nitin - BLS CTR" 
>To: "solr-user@lucene.apache.org"  
>Sent: Monday, May 14, 2012 10:27 AM
>Subject: RE: Documents With large number of fields
> 
>Unfortunately I never got any response. However I did a POC with a Document 
>containing 400 fields and loaded around 1000 docs to my local machine. I 
>didn’t see any issue but then again the document set was very small. Hopefully 
>as mentioned below providing enough memory should help alleviate any 
>performance issues.
>
>Thanks.
>
>Regards,
>
>Nitin Keswani
>
>
>-Original Message-
>From: Jack Krupansky [mailto:j...@basetechnology.com] 
>Sent: Sunday, May 13, 2012 10:42 PM
>To: solr-user@lucene.apache.org
>Subject: Re: Documents With large number of fields
>
>I didn't see any response. There was a similar issue recently, where someone 
>had 400 faceted fields with 50-70 facets per query and they were running out 
>of memory due to accumulation of the FieldCache for these faceted fields, but 
>that was on a 3 GB system.
>
>It probably could be done, assuming a fair number of 64-bit sharded machines.
>
>-- Jack Krupansky
>
>-Original Message-
>From: Darren Govoni
>Sent: Sunday, May 13, 2012 7:56 PM
>To: solr-user@lucene.apache.org
>Subject: Re: Documents With large number of fields
>
>Was there a response to this?
>
>On Fri, 2012-05-04 at 10:27 -0400, Keswani, Nitin - BLS CTR wrote:
>> Hi,
>>
>> My data model consist of different types of data. Each data type has 
>> its own characteristics
>>
>> If I include the unique characteristics of each type of data, my 
>> single Solr Document could end up containing 300-400 fields.
>>
>> In order to drill down to this data set I would have to provide 
>> faceting on most of these fields so that I can drilldown to very small 
>> set of Documents.
>>
>> Here are some of the questions :
>>
>> 1) What's the best approach when dealing with documents with large 
>> number of fields .
>>     Should I keep a single document with large number of fields or 
>> split my
>>     document into a number of smaller  documents where each document 
>> would consist of some fields
>>
>> 2) From an operational point of view, what's the drawback of having a 
>> single document with a very large number of fields.
>>     Can Solr support documents with large number of fields (say 300 to 
>> 400).
>>
>>
>> Thanks.
>>
>> Regards,
>>
>> Nitin Keswani
>>
>
>
>
>

RE: Replicating a large solr index

2012-05-14 Thread Rohit
The size of index is about 300GB, I am seeing the following error in the
logs,

java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:129)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
        at org.apache.commons.httpclient.ChunkedInputStream.getChunkSizeFromInputStream(ChunkedInputStream.java:250)
        at org.apache.commons.httpclient.ChunkedInputStream.nextChunk(ChunkedInputStream.java:221)
        at org.apache.commons.httpclient.ChunkedInputStream.read(ChunkedInputStream.java:176)
        at java.io.FilterInputStream.read(FilterInputStream.java:116)
        at org.apache.commons.httpclient.AutoCloseInputStream.read(AutoCloseInputStream.java:108)
        at org.apache.solr.common.util.FastInputStream.refill(FastInputStream.java:68)
        at org.apache.solr.common.util.FastInputStream.read(FastInputStream.java:97)
        at org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:122)
        at org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:117)
        at org.apache.solr.handler.SnapPuller$FileFetcher.fetchPackets(SnapPuller.java:943)
        at org.apache.solr.handler.SnapPuller$FileFetcher.fetchFile(SnapPuller.java:904)
        at org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:545)
        at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:295)
        at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:268)
        at org.apache.solr.handler.ReplicationHandler$1.run(ReplicationHandler.java:149)
May 14, 2012 1:45:46 PM org.apache.solr.handler.ReplicationHandler doFetch
SEVERE: SnapPull failed
org.apache.solr.common.SolrException: Unable to download _vvyv.fdt completely. Downloaded 200278016!=208644265
        at org.apache.solr.handler.SnapPuller$FileFetcher.cleanup(SnapPuller.java:1038)
        at org.apache.solr.handler.SnapPuller$FileFetcher.fetchFile(SnapPuller.java:918)
        at org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:545)
        at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:295)
        at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:268)
        at org.apache.solr.handler.ReplicationHandler$1.run(ReplicationHandler.java:149)


Actually the replication starts, but is never able to complete and then restarts again.

Regards,
Rohit
Mobile: +91-9901768202
About Me: http://about.me/rohitg


-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: 14 May 2012 18:00
To: solr-user@lucene.apache.org
Subject: Re: Replicating a large solr index

What do your logs show? Solr replication should be robust.
How large is "large"?

You might review:
http://wiki.apache.org/solr/UsingMailingLists

Best
Erick

On Mon, May 14, 2012 at 3:11 AM, Rohit  wrote:
> Hi,
>
>
>
> I have a large solr index which needs to be replicated, solr 
> replication start but then keeps breaking and starting from 0. Is 
> there another way to achieve this,          I was thinking of using 
> scp to copy the index from master to slave and then enable replication,
will this work?
>
>
>
>
> Regards,
>
> Rohit
>
>
>




Re: Solr Shards multi core slower then single big core

2012-05-14 Thread Otis Gospodnetic
Hi Kuli,

As long as there are enough CPUs with spare cycles and disk IO is not a 
bottleneck, this works faster.  This was 12+ months ago.

Otis 

Performance Monitoring for Solr / ElasticSearch / HBase - 
http://sematext.com/spm 



>
> From: Michael Kuhlmann 
>To: solr-user@lucene.apache.org 
>Sent: Monday, May 14, 2012 10:21 AM
>Subject: Re: Solr Shards multi core slower then single big core
> 
>Am 14.05.2012 16:18, schrieb Otis Gospodnetic:
>> Hi Kuli,
>>
>> In a client engagement, I did see this (N shards on 1 beefy box with lots of 
>> RAM and CPU cores) be faster than 1 big index.
>>
>
>I want to believe you, but I also want to understand. Can you explain 
>why? And did this only happen for single requests, or even under heavy load?
>
>Greetings,
>Kuli
>
>
>

RE: Documents With large number of fields

2012-05-14 Thread Keswani, Nitin - BLS CTR
Unfortunately I never got any response. However I did a POC with a Document 
containing 400 fields and loaded around 1000 docs to my local machine. I didn’t 
see any issue but then again the document set was very small. Hopefully as 
mentioned below providing enough memory should help alleviate any performance 
issues.

Thanks.

Regards,

Nitin Keswani


-Original Message-
From: Jack Krupansky [mailto:j...@basetechnology.com] 
Sent: Sunday, May 13, 2012 10:42 PM
To: solr-user@lucene.apache.org
Subject: Re: Documents With large number of fields

I didn't see any response. There was a similar issue recently, where someone 
had 400 faceted fields with 50-70 facets per query and they were running out of 
memory due to accumulation of the FieldCache for these faceted fields, but that 
was on a 3 GB system.

It probably could be done, assuming a fair number of 64-bit sharded machines.

-- Jack Krupansky

-Original Message-
From: Darren Govoni
Sent: Sunday, May 13, 2012 7:56 PM
To: solr-user@lucene.apache.org
Subject: Re: Documents With large number of fields

Was there a response to this?

On Fri, 2012-05-04 at 10:27 -0400, Keswani, Nitin - BLS CTR wrote:
> Hi,
>
> My data model consist of different types of data. Each data type has 
> its own characteristics
>
> If I include the unique characteristics of each type of data, my 
> single Solr Document could end up containing 300-400 fields.
>
> In order to drill down to this data set I would have to provide 
> faceting on most of these fields so that I can drilldown to very small 
> set of Documents.
>
> Here are some of the questions :
>
> 1) What's the best approach when dealing with documents with large 
> number of fields .
> Should I keep a single document with large number of fields or 
> split my
> document into a number of smaller  documents where each document 
> would consist of some fields
>
> 2) From an operational point of view, what's the drawback of having a 
> single document with a very large number of fields.
> Can Solr support documents with large number of fields (say 300 to 
> 400).
>
>
> Thanks.
>
> Regards,
>
> Nitin Keswani
>



Re: Solr Shards multi core slower then single big core

2012-05-14 Thread Michael Kuhlmann

Am 14.05.2012 16:18, schrieb Otis Gospodnetic:

Hi Kuli,

In a client engagement, I did see this (N shards on 1 beefy box with lots of 
RAM and CPU cores) be faster than 1 big index.



I want to believe you, but I also want to understand. Can you explain 
why? And did this only happen for single requests, or even under heavy load?


Greetings,
Kuli


Re: Solr Shards multi core slower then single big core

2012-05-14 Thread Otis Gospodnetic
Hi Kuli,

In a client engagement, I did see this (N shards on 1 beefy box with lots of 
RAM and CPU cores) be faster than 1 big index.

Otis 

Performance Monitoring for Solr / ElasticSearch / HBase - 
http://sematext.com/spm 



>
> From: Michael Kuhlmann 
>To: solr-user@lucene.apache.org 
>Sent: Monday, May 14, 2012 7:56 AM
>Subject: Re: Solr Shards multi core slower then single big core
> 
>Am 14.05.2012 13:22, schrieb Sami Siren:
>>> Sharding is (nearly) always slower than using one big index with sufficient
>>> hardware resources. Only use sharding when your index is too huge to fit
>>> into one single machine.
>> 
>> If you're not constrained by CPU or IO, in other words have plenty of
>> CPU cores available together with for example separate hard discs for
>> each shard splitting your index into smaller shards can in some cases
>> make a huge difference in one box too.
>
>Do you have an example?
>
>This is hard to believe. If you have several shards on the same machine, you'll 
>need so much memory that each shard has enough for all its caches and such. 
>With that lot of memory, a single Solr core should be really fast.
>
>If dividing the index is the reason, then a software RAID 0 (striping) should 
>be much better.
>
>The only point I see is the concurrent search for one request. Maybe, for 
>large requests, this might outweigh the sharding overhead, but only for 
>long-running requests without disk I/O. I only see the case when using very 
>complicated query functions. And, this only stays true as long as you don't 
>run multiple concurrent requests.
>
>Greetings,
>Kuli
>
>
>

Re: Replicating a large solr index

2012-05-14 Thread Otis Gospodnetic
Rohit,

Sure.  You can set up old style Solr index replication (uses rsync - all 
documented on the Wiki) and then enable Java replication.

Of course, you could also try sharing more details about how your current 
replication breaks and maybe people can help you fix that so you don't have to 
use the above workaround.

Otis 

Performance Monitoring for Solr / ElasticSearch / HBase - 
http://sematext.com/spm 



>
> From: Rohit 
>To: solr-user@lucene.apache.org 
>Sent: Monday, May 14, 2012 3:11 AM
>Subject: Replicating a large solr index
> 
>Hi,
>
>
>
>I have a large solr index which needs to be replicated, solr replication
>start but then keeps breaking and starting from 0. Is there another way to
>achieve this,          I was thinking of using scp to copy the index from
>master to slave and then enable replication, will this work?
>
>
>
>
>Regards,
>
>Rohit
>
>
>
>
>
>

Re: Date format in the schema.xml

2012-05-14 Thread Ahmet Arslan
> is it mandatory to use the date format yyyy-mm-ddThh:mm:ssZ?

Yes.

> I have a date with this format:
> yyyymmdd
>  in my xml source file.
> 
> Where can I find more information? I found only these
> definitions in the schema.xml:

In schema.xml there is an XML comment about dates, starting with

Date format in the schema.xml

2012-05-14 Thread Bruno Mannina

Dear,

is it mandatory to use the date format yyyy-mm-ddThh:mm:ssZ ?

I have a date with this format:
yyyymmdd
 in my xml source file.

Where can I find more information? I found only these definitions in the 
schema.xml:


<fieldType name="date" class="solr.TrieDateField" omitNorms="true" precisionStep="0" positionIncrementGap="0"/>
<fieldType name="tdate" class="solr.TrieDateField" omitNorms="true" precisionStep="6" positionIncrementGap="0"/>


Could you also explain the precisionStep param?

Thanks and sorry for this newbie question,
Bruno
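The yyyymmdd source values can be converted to the required form in a small 
preprocessing step before indexing; a sketch (defaulting the time of day to 
midnight UTC is an assumption):

```python
from datetime import datetime

def to_solr_date(yyyymmdd):
    """Turn a 'yyyymmdd' string into Solr's yyyy-MM-ddTHH:mm:ssZ form,
    with the time of day defaulted to midnight."""
    parsed = datetime.strptime(yyyymmdd, "%Y%m%d")
    return parsed.strftime("%Y-%m-%dT%H:%M:%SZ")

print(to_solr_date("20120514"))  # 2012-05-14T00:00:00Z
```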


Getting payloads for matching term in search result

2012-05-14 Thread s . hermann

Good day

currently I have a field defined for payloads (the fieldtype XML did not 
survive the list archive):
Basically the content for that field has the following form:

  "Wiedersehn|x1062y1755 macht|x1340y1758 Freude|x1502y1758"

where the stuff after the pipe is the payload data (some coordinates). What I 
want is to get that payload data at query time.
E.g. I search for "macht" and in the result document from solr there will be the payload 
data "x1340y1758".

Is there a way to do this out of the box with Solr? I have done this in plain 
Lucene once with TermPositions, so I know it should be possible to adapt this to Solr.

Silvio
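For illustration, the term-to-payload pairing in that field content can be 
modeled outside Solr with a simple split on the pipe delimiter (inside Solr 
the lookup would go through term positions/payloads, as noted):

```python
def parse_payloads(field_value, delimiter="|"):
    """Split whitespace tokens of the form 'term|payload' into a dict,
    mimicking what a delimited-payload token filter sees."""
    pairs = (token.split(delimiter, 1) for token in field_value.split())
    return {term: payload for term, payload in pairs}

text = "Wiedersehn|x1062y1755 macht|x1340y1758 Freude|x1502y1758"
coords = parse_payloads(text)
print(coords["macht"])  # x1340y1758
```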


Re: New scoring models in LUCENE/SOLR (LUCENE-2959)

2012-05-14 Thread Erick Erickson
LUCENE-2959 is the relevant issue, but I can't link there right now; the site is under
maintenance.

Bottom line: This has already been done in trunk although I confess I
haven't made
use of it. It is NOT in any of the 3.x code however.

Best
Erick

On Mon, May 14, 2012 at 8:40 AM, ilay raja  wrote:
> Hi
>
>
>
>  I was going through the flexscoring implementations of Lucene to experiment
> with the new scoring models.
>
>  I am into structured search where I feel BM25F is more relevant to
> use/experiment with.
>
>  But there is no implementation of BM25FSimilarity .. I have also come
> across a BM25F implementation in http://nlp.uned.es/~jperezi/Lucene-BM25/
>
>
>
>  Has anyone used this or integrated it with Solr?
>
>
>
> Ilay


Re: New scoring models in LUCENE/SOLR (LUCENE-2959)

2012-05-14 Thread ilay raja
Hi



  I was going through the flexscoring implementations of Lucene to experiment
with the new scoring models.

  I am into structured search where I feel BM25F is more relevant to
use/experiment with.

  But there is no implementation of BM25FSimilarity .. I have also come
across a BM25F implementation in http://nlp.uned.es/~jperezi/Lucene-BM25/



  Has anyone used this or integrated it with Solr?



Ilay


Re: Lucene FieldCache doesn' get cleaned up and OOM occurs

2012-05-14 Thread Erick Erickson
Patches welcome ...

But yeah, the fieldCache is pretty much controlled by Lucene. I rather doubt
there's a lot of interest in flushing it as you're asking, just because of the
concentration on making Solr/Lucene _fast_. Especially if you think about
running 10 simultaneous queries sorting on the 10 fields in your example. At
least one of them would have to block until enough of the others had completed
that it would be OK to flush some values. It just seems like a nightmare, both
code-wise and speed-wise.

But then I don't really understand the low-level details well enough to say
for sure.

Best
Erick

On Mon, May 14, 2012 at 8:18 AM, Mathias Hodler
 wrote:
> Hi Erick,
>
> I'm sorting on 10 different fields (string, date and floats) with 90%
> - 100% unique values and 50k indexed documents.
>
> You're right - cleaning the cache would slow down the next queries
> until the fields are cached again. But it would be nice to run Solr on
> systems with limited memory and accept the longer query time.
>
> I can't tell you an exact amount of memory to use for Solr, because
> it should be runnable on different systems. That's the reason why I
> analyzed the behavior of Solr on systems with little memory.
>
> So reducing the number of available sort fields is the only
> possibility if you can't increase memory.
>
> Thanks.
>
> Mathias
>
> 2012/5/14 Erick Erickson :
>> But consider what would happen if the cache was cleaned up: the next
>> query in would require that the terms be re-loaded. I guess it's possible
>> that some people would be willing to sacrifice speed in constrained
>> situations...
>>
>> Meanwhile, you have two options
>> 1> increase memory
>> 2> sort on fewer unique values. Have you examined your index to see
>>     how many _unique_ values are in these fields? And is it possible to
>>     reduce that number?
>>
>> Best
>> Erick
>>
>> P.S. How much memory are you working with, and how many unique values
>> of what types are you sorting on? And what version of Solr are you using?
>>
>> On Fri, May 11, 2012 at 5:38 AM, Mathias Hodler
>>  wrote:
>>> Hi,
>>>
>>> sorting on a field increases the Lucene FieldCache. If I'm starting 10
>>> queries and each query sorting on a different field, 9 queries could
>>> be executed but then the Lucene FieldCache exceeds max memory and OOM
>>> occurs.
>>> In my opinion Lucene Field Cache should be cleaned up if there is not
>>> enough memory left. But instead of that, Field Cache will always
>>> remains in "Old Generation GC".
>>>
>>> Could this be fixed or is the only way out to get more memory?
>>>
>>> Thanks.
>>>
>>> Mathias
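
A rough back-of-the-envelope sketch of the memory pressure discussed in this
thread. The per-entry sizes (a 32-bit ord per document, ~40 bytes of JVM
overhead plus 2 bytes per character for each cached string) are illustrative
assumptions, not measured values:

```python
# Rough estimate of Lucene FieldCache cost for sorting on string fields.
# Assumed sizes (hypothetical, for illustration only): one 32-bit ord per
# document, plus ~40 bytes of overhead and UTF-16 chars per unique value.
def fieldcache_string_field_bytes(num_docs, num_unique, avg_chars=20):
    ords = num_docs * 4                          # ord array: one int per doc
    values = num_unique * (40 + 2 * avg_chars)   # the unique values themselves
    return ords + values

docs, unique = 50_000, 45_000   # ~90% unique values, as in the thread
per_field = fieldcache_string_field_bytes(docs, unique)
print(f"~{per_field / 1024 / 1024:.1f} MB per string sort field")
print(f"~{10 * per_field / 1024 / 1024:.1f} MB for 10 such fields")
```

Since the cache is keyed per reader and never evicted, this memory stays in
the old generation until the reader is closed, which matches the OOM behavior
described above.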


Re: Relicating a large solr index

2012-05-14 Thread Erick Erickson
What do your logs show? Solr replication should be robust.
How large is "large"?

You might review:
http://wiki.apache.org/solr/UsingMailingLists

Best
Erick

On Mon, May 14, 2012 at 3:11 AM, Rohit  wrote:
> Hi,
>
>
>
> I have a large Solr index which needs to be replicated. Solr replication
> starts but then keeps breaking and starting again from 0. Is there another
> way to achieve this? I was thinking of using scp to copy the index from
> master to slave and then enabling replication; will this work?
>
>
>
>
> Regards,
>
> Rohit
>
>
>


Re: searching when in a solr-component?

2012-05-14 Thread Erick Erickson
I think something like
SolrQueryRequest.getCore().getSearcher()
does what you want

Best
Erick

On Fri, May 11, 2012 at 5:46 PM, Paul Libbrecht  wrote:
> Hello SOLR experts,
>
> can I see the same index while responding another query?
> If yes how?
>
> thanks in advance
>
> Paul


Re: Editing long Solr URLs - Chrome Extension

2012-05-14 Thread Erick Erickson
Cool!

On Fri, May 11, 2012 at 10:56 AM, Jan Høydahl  wrote:
> I've been testing 
> https://chrome.google.com/webstore/detail/mbnigpeabbgkmbcbhkkbnlidcobbapff?hl=en
>  but I don't think it's great.
>
> Great work on this one. Simple and straightforward. A few wishes:
> * Sticky mode? This tool would make sense in a sidebar, to do rapid 
> refinements.
> * If you edit a value and press "TAB", it is not updated :(
> * It should not be necessary to URL-encode all non-ASCII chars - why not leave 
> colon, caret (^) etc. as is, for better readability?
> * Some param values in Solr may be large, such as "fl", "qf" or "bf". Would 
> be nice if the edit box were multi-line, or perhaps adjusted to the size of the 
> content.
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.facebook.com/Cominvent
> Solr Training - www.solrtraining.com
>
> On 11. mai 2012, at 07:32, Amit Nithian wrote:
>
>> Hey all,
>>
>> I don't know about you but most of the Solr URLs I issue are fairly
>> lengthy full of parameters on the query string and browser location
>> bars aren't long enough/have multi-line capabilities. I tried to find
>> something that does this but couldn't so I wrote a chrome extension to
>> help.
>>
>> Please check out my blog post on the subject and please let me know if
>> something doesn't work or needs improvement. Of course this can work
>> for any URL with a query string but my motivation was to help edit my
>> long Solr URLs.
>>
>> http://hokiesuns.blogspot.com/2012/05/manipulating-urls-with-long-query.html
>>
>> Thanks!
>> Amit
>


Re: Lucene FieldCache doesn' get cleaned up and OOM occurs

2012-05-14 Thread Mathias Hodler
Hi Erick,

I'm sorting on 10 different fields (string, date and floats) with 90%
- 100% unique values and 50k indexed documents.

You're right - cleaning the cache would slow down the next queries
until the fields are cached again. But it would be nice to run Solr on
systems with limited memory and accept the longer query time.

I can't tell you an exact amount of memory I can use for Solr because
it should be runnable on different systems. That's the reason why I
analyzed the behavior of Solr on systems with little memory.

So reducing the number of available sort fields is the only
possibility if you can't increase memory.

Thanks.

Mathias

2012/5/14 Erick Erickson :
> But consider what would happen if the cache was cleaned up: the next
> query in would require that the terms be re-loaded. I guess it's possible
> that some people would be willing to sacrifice speed in constrained
> situations...
>
> Meanwhile, you have two options
> 1> increase memory
> 2> sort on fewer unique values. Have you examined your index to see
>     how many _unique_ values are in these fields? And is it possible to
>     reduce that number?
>
> Best
> Erick
>
> P.S. How much memory are you working with, and how many unique values
> of what types are you sorting on? And what version of Solr are you using?
>
> On Fri, May 11, 2012 at 5:38 AM, Mathias Hodler
>  wrote:
>> Hi,
>>
>> sorting on a field increases the Lucene FieldCache. If I'm starting 10
>> queries and each query sorting on a different field, 9 queries could
>> be executed but then the Lucene FieldCache exceeds max memory and OOM
>> occurs.
>> In my opinion Lucene Field Cache should be cleaned up if there is not
>> enough memory left. But instead of that, Field Cache will always
>> remains in "Old Generation GC".
>>
>> Could this be fixed or is the only way out to get more memory?
>>
>> Thanks.
>>
>> Mathias


Re: How detect slave replication termination

2012-05-14 Thread Erick Erickson
What is it you're trying to accomplish with this knowledge?

I _think_ that you can use the HTTP request indexversion here:
http://wiki.apache.org/solr/SolrReplication#HTTP_API
and I _think_ this is not updated until after replication... It should
match that of the master.

Best
Erick

On Fri, May 11, 2012 at 9:49 AM, Jamel ESSOUSSI
 wrote:
> Hi,
>
> I have an indexer that indexes Solr documents. At the end of the indexing I
> will initiate replication by activating it on the master and on all slaves.
> My question is: how will I know when the replication between the master and
> slave1 has finished, so that I can start replicating to slave2?
>
> Best Regards
>
> --Jamel ESSOUSSI
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/How-detect-slave-replication-termination-tp3979991.html
> Sent from the Solr - User mailing list archive at Nabble.com.
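
The indexversion check Erick mentions can be sketched as a small polling
helper. The `/replication?command=indexversion` endpoint is from the wiki page
above; the host names and the exact XML response shape used here are
assumptions:

```python
# Sketch: detect when a slave has caught up with the master by comparing the
# index version reported by Solr's ReplicationHandler HTTP API.
import urllib.request
import xml.etree.ElementTree as ET

def parse_indexversion(xml_text):
    """Pull the <long name="indexversion"> value out of a replication response
    (assumed response shape)."""
    root = ET.fromstring(xml_text)
    for node in root.iter("long"):
        if node.get("name") == "indexversion":
            return int(node.text)
    raise ValueError("no indexversion in response")

def fetch_indexversion(core_url):
    # e.g. core_url = "http://slave1:8983/solr/core0" (hypothetical host)
    url = core_url + "/replication?command=indexversion&wt=xml"
    with urllib.request.urlopen(url) as resp:
        return parse_indexversion(resp.read().decode("utf-8"))

def slave_caught_up(master_url, slave_url):
    """True once the slave reports the same index version as the master."""
    return fetch_indexversion(slave_url) == fetch_indexversion(master_url)
```

One could poll `slave_caught_up(master, slave1)` until it returns True before
kicking off replication on slave2.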


Re: Solr Shards multi core slower then single big core

2012-05-14 Thread Michael Kuhlmann

Am 14.05.2012 13:22, schrieb Sami Siren:

Sharding is (nearly) always slower than using one big index with sufficient
hardware resources. Only use sharding when your index is too huge to fit
into one single machine.


If you're not constrained by CPU or IO (in other words, you have plenty of
CPU cores available together with, for example, separate hard discs for
each shard), splitting your index into smaller shards can in some cases
make a huge difference on one box too.


Do you have an example?

This is hard to believe. If you have several shards on the same machine, 
you'll need enough memory that each shard can fill all its 
caches and such. With that much memory, a single Solr core should be 
really fast.


If dividing the index is the reason, then a software RAID 0 (striping) 
should be much better.


The only point I see is the concurrent search for one request. Maybe, 
for large requests, this might outweigh the sharding overhead, but only 
for long-running requests without disk I/O. I only see the case when 
using very complicated query functions. And, this only stays true as 
long as you don't run multiple concurrent requests.


Greetings,
Kuli


Re: slave index not cleaned

2012-05-14 Thread Erick Erickson
Hmmm, replication will require up to twice the space of the
index _temporarily_ - just checking if that's what you're seeing.
But that should go away reasonably soon. Out of curiosity, what
happens if you restart your server - do the extra files go away?

But it sounds like your index is growing over a longer period of time
than just a single replication, is that true?

Best
Erick

On Fri, May 11, 2012 at 6:03 AM, Jasper Floor  wrote:
> Hi,
>
> On Thu, May 10, 2012 at 5:59 PM, Otis Gospodnetic
>  wrote:
>> Hi Jasper,
>
> Sorry, I should've added more technical info without being prompted.
>
>> Solr does handle that for you.  Some more stuff to share:
>>
>> * Solr version?
>
> 1.4
>
>> * JVM version?
> 1.7 update 2
>
>> * OS?
> Debian (2.6.32-5-xen-amd64)
>
>> * Java replication?
> yes
>
>> * Errors in Solr logs?
> no
>
>> * deletion policy section in solrconfig.xml?
> missing I would say, but I don't see this on the replication wiki page.
>
> This is what we have configured for replication:
>
> <requestHandler name="/replication" class="solr.ReplicationHandler">
>     <lst name="slave">
>         <str name="masterUrl">${solr.master.url}/df-stream-store/replication</str>
>         <str name="pollInterval">00:20:00</str>
>         <str name="compression">internal</str>
>         <str name="httpConnTimeout">5000</str>
>         <str name="httpReadTimeout">1</str>
>     </lst>
> </requestHandler>
>
> We will be updating to 3.6 fairly soon however. To be honest, from
> what I've read, the Solr cloud is what we really want in the future
> but we will have to be patient for that.
>
> thanks in advance
>
> mvg,
> Jasper
>
>> You may also want to look at your Index report in SPM 
>> (http://sematext.com/spm) before/during/after replication and share what you 
>> see.
>>
>> Otis
>> 
>> Performance Monitoring for Solr / ElasticSearch / HBase - 
>> http://sematext.com/spm
>>
>>
>>
>> - Original Message -
>>> From: Jasper Floor 
>>> To: solr-user@lucene.apache.org
>>> Cc:
>>> Sent: Thursday, May 10, 2012 9:08 AM
>>> Subject: slave index not cleaned
>>>
>>> Perhaps I am missing the obvious but our slaves tend to run out of
>>> disk space. The index sizes grow to multiple times the size of the
>>> master. So I just toss all the data and trigger a replication.
>>> However, can't solr handle this for me?
>>>
>>> I'm sorry if I've missed a simple setting which does this for me, but
>>> if its there then I have missed it.
>>>
>>> mvg
>>> Jasper
>>>


Re: Multi-words synonyms matching

2012-05-14 Thread elisabeth benoit
Just for the record, I'd like to conclude this thread

First, you were right, there was no behaviour difference between fq and q
parameters.

I realized that:

1) my synonym (hotel de ville) has a stopword in it (de), and since I used
tokenizerFactory="solr.KeywordTokenizerFactory" in my synonyms declaration,
there was no stopword removal in the indexed expression. So when requesting
"hotel de ville", after stopword removal in the query, Solr was comparing
"hotel de ville"
with "hotel ville"

but my queries never even got to that point since

2) I made a mistake using "mairie" alone in the admin interface when
testing my schema. The real field was something like "collectivités
territoriales mairie",
so the synonym "hotel de ville" was not even applied, because of the
tokenizerFactory="solr.KeywordTokenizerFactory" in my synonym definition
not splitting field into words when parsing

So my problem is not solved, and I'm considering solving it outside of Solr
scope, unless someone else has a clue

Thanks again,
Elisabeth
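
For the record, the mismatch described above can be reproduced outside Solr in
a few lines. This is a simplified simulation (plain whitespace tokenization
and a one-word stopword list), not Solr's actual analysis chain:

```python
# Simplified simulation of the query-side vs. index-side analysis mismatch.
stopwords = {"de"}

def query_side_analyze(text):
    # Query side: split on whitespace, then drop stopwords (StopFilter-like)
    return [t for t in text.lower().split() if t not in stopwords]

# Index side: with tokenizerFactory="solr.KeywordTokenizerFactory" the synonym
# is kept as a single token, and no stopword removal happens inside it.
indexed_tokens = ["hotel de ville"]

query_tokens = query_side_analyze("hotel de ville")
print(query_tokens)                    # ['hotel', 'ville']
print(query_tokens == indexed_tokens)  # False: the two sides can never match
```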



2012/4/25 Erick Erickson 

> A little farther down the debug info output you'll find something
> like this (I specified fq=name:features)
>
> 
> name:features
> 
>
>
> so it may well give you some clue. But unless I'm reading things wrong,
> your
> q is going against a field that has much more information than the
> CATEGORY_ANALYZED field, is it possible that the data from your
> test cases simply isn't _in_ CATEGORY_ANALYZED?
>
> Best
> Erick
>
> On Wed, Apr 25, 2012 at 9:39 AM, elisabeth benoit
>  wrote:
> > I'm not at the office until next Wednesday, and I don't have my Solr
> under
> > hand, but isn't debugQuery=on giving informations only about q parameter
> > matching and nothing about fq parameter? Or do you mean
> > "parsed_filter_querie"s gives information about fq?
> >
> > CATEGORY_ANALYZED is being populated by a copyField instruction in
> > schema.xml, and has the same field type as my catchall field, the search
> > field for my searchHandler (the one being used by q parameter).
> >
> > CATEGORY (a string) is copied in CATEGORY_ANALYZED (field type is text)
> >
> > CATEGORY (a string) is copied in catchall field (field type is text),
> and a
> > lot of other fields are copied too in that catchall field.
> >
> > So as far as I can see, the same analysis should be done in both cases,
> but
> > obviously I'm missing something, and the only thing I can think of is a
> > different behavior between q and fq parameter.
> >
> > I'll check that parsed_filter_querie first thing in the morning next
> > Wednesday.
> >
> > Thanks a lot for your help.
> >
> > Elisabeth
> >
> >
> > 2012/4/24 Erick Erickson 
> >
> >> Elisabeth:
> >>
> >> What shows up in the debug section of the response when you add
> >> &debugQuery=on? There should be some bit of that section like:
> >> "parsed_filter_queries"
> >>
> >> My other question is "are you absolutely sure that your
> >> CATEGORY_ANALYZED field has the correct content?". How does it
> >> get populated?
> >>
> >> Nothing jumps out at me here
> >>
> >> Best
> >> Erick
> >>
> >> On Tue, Apr 24, 2012 at 9:55 AM, elisabeth benoit
> >>  wrote:
> >> > yes, thanks, but this is NOT my question.
> >> >
> >> > I was wondering why I have multiple matches with q="hotel de ville"
> and
> >> no
> >> > match with fq=CATEGORY_ANALYZED:"hotel de ville", since in both case
> I'm
> >> > searching in the same solr fieldType.
> >> >
> >> > Why is q parameter behaving differently in that case? Why do the
> quotes
> >> > work in one case and not in the other?
> >> >
> >> > Does anyone know?
> >> >
> >> > Thanks,
> >> > Elisabeth
> >> >
> >> > 2012/4/24 Jeevanandam 
> >> >
> >> >>
> >> >> usage of q and fq
> >> >>
> >> >> q => is typically the main query for the search request
> >> >>
> >> >> fq => is Filter Query; generally used to restrict the super set of
> >> >> documents without influencing score (more info.
> >> >> http://wiki.apache.org/solr/**CommonQueryParameters#q<
> >> http://wiki.apache.org/solr/CommonQueryParameters#q>
> >> >> )
> >> >>
> >> >> For example:
> >> >> 
> >> >> q="hotel de ville" ===> returns 100 documents
> >> >>
> >> >> q="hotel de ville"&fq=price:[100 To *]&fq=roomType:"King size Bed"
> ===>
> >> >> returns 40 documents from super set of 100 documents
> >> >>
> >> >>
> >> >> hope this helps!
> >> >>
> >> >> - Jeevanandam
> >> >>
> >> >>
> >> >>
> >> >> On 24-04-2012 3:08 pm, elisabeth benoit wrote:
> >> >>
> >> >>> Hello,
> >> >>>
> >> >>> I'd like to resume this post.
> >> >>>
> >> >>> The only way I found to do not split synonyms in words in
> synonyms.txt
> >> it
> >> >>> to use the line
> >> >>>
> >> >>>   >> >>> ignoreCase="true" expand="true"
> >> >>> tokenizerFactory="solr.**KeywordTokenizerFactory"/>
> >> >>>
> >> >>> in schema.xml
> >> >>>
> >> >>> where tokenizerFactory="solr.**KeywordTokenizerFactory"
> >> >>>
> >> >>> instructs SynonymFilterFactory not to break synonyms into words on
> >> white
> >> >>> spaces when parsing

Re: Lucene FieldCache doesn' get cleaned up and OOM occurs

2012-05-14 Thread Erick Erickson
But consider what would happen if the cache was cleaned up: the next
query in would require that the terms be re-loaded. I guess it's possible
that some people would be willing to sacrifice speed in constrained
situations...

Meanwhile, you have two options
1> increase memory
2> sort on fewer unique values. Have you examined your index to see
 how many _unique_ values are in these fields? And is it possible to
 reduce that number?

Best
Erick

P.S. How much memory are you working with, and how many unique values
of what types are you sorting on? And what version of Solr are you using?

On Fri, May 11, 2012 at 5:38 AM, Mathias Hodler
 wrote:
> Hi,
>
> sorting on a field increases the Lucene FieldCache. If I'm starting 10
> queries and each query sorting on a different field, 9 queries could
> be executed but then the Lucene FieldCache exceeds max memory and OOM
> occurs.
> In my opinion Lucene Field Cache should be cleaned up if there is not
> enough memory left. But instead of that, Field Cache will always
> remains in "Old Generation GC".
>
> Could this be fixed or is the only way out to get more memory?
>
> Thanks.
>
> Mathias


Re: Solr Import Handler Custom Transformer not working

2012-05-14 Thread Erick Erickson
Nothing jumps out on a quick look. So I'd try a couple of things:

1> you can debug this via "remote debugging" in an IDE, see if
your code is a> reached and b> does what you expect.
2> Look at your logs. Anything coming out that's unexpected?
3> Try some simple logging, maybe dump out the document (row)
 you get after all your substitutions.

Is it possible that you're not executing this code at all?

But, as I said this seems OK, so it may be some kind of pilot error

Sorry I can't be more help
Erick

On Thu, May 10, 2012 at 9:45 PM, dboychuck  wrote:
> Also here is my schema
>
>   <field name="..." stored="false" multiValued="true"/>
>   <field name="..." stored="false" multiValued="true"/>
>   <field name="..." stored="false"/>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Solr-Import-Handler-Custom-Transformer-not-working-tp3978746p3978748.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Multicore file handling

2012-05-14 Thread Aleksander Akerø
Hi

 

I’m having problems with the file handling when using a multicore setup in
Solr 3.6 - the same issue that is described in SOLR-1894. In Jira it says
that it has been fixed in 3.1; doesn’t that mean that it should also work
for 3.6?

 

The problem for me is that the browse GUI is not able to load the CSS files.
But I’ve also noticed that Solr won’t load schema.xml etc. from either core’s
admin page. Not even for the default core.

 

My core setup is quite simple.
 

And each core has its own home directory that is more or less the same as
the example dirs. This is also where the velocity folder resides.

 

/Aleksander

 



Re: Solr Shards multi core slower then single big core

2012-05-14 Thread Sami Siren
> Sharding is (nearly) always slower than using one big index with sufficient
> hardware resources. Only use sharding when your index is too huge to fit
> into one single machine.

If you're not constrained by CPU or IO (in other words, you have plenty of
CPU cores available together with, for example, separate hard discs for
each shard), splitting your index into smaller shards can in some cases
make a huge difference on one box too.

--
 Sami Siren


Re: Solr Shards multi core slower then single big core

2012-05-14 Thread Michael Kuhlmann

Am 14.05.2012 05:56, schrieb arjit:

Thanks Erick for the reply.
I have 6 cores which don't contain duplicated data; every core has some
unique data. What I thought was that a read would query the 6 cores in
parallel and join the results, and that this would be more efficient than
reading one big core.


No, it's not. When you request 10 documents from Solr, it can't know in 
advance which shards contain how many of those documents. It could be that 
each shard only needs to contribute one or two documents to the result, but 
it might be that one single shard contains all ten documents. Therefore, 
Solr needs to request 10 documents from each shard, then take only the 
top 10 documents from those 60 and drop the rest. And it gets worse 
when you set an offset of, say, 100.


Sharding is (nearly) always slower than using one big index with 
sufficient hardware resources. Only use sharding when your index is too 
huge to fit into one single machine.
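
The merge step described above can be sketched in a few lines; the scores,
doc ids and page sizes below are made up purely for illustration:

```python
# Sketch of how a distributed-search coordinator merges shard results:
# each shard must return its own top (offset + rows) documents, and the
# coordinator merges them by score and keeps only the requested page.
import heapq

def merge_shard_results(shard_results, offset=0, rows=10):
    """Merge per-shard top-N lists (each sorted by descending score).
    Every shard must have returned offset + rows docs for this to be correct."""
    merged = heapq.merge(*shard_results, key=lambda doc: -doc[0])
    return list(merged)[offset:offset + rows]

shard0 = [(9.1, "a"), (3.2, "b")]
shard1 = [(8.5, "c"), (7.0, "d"), (1.0, "e")]
# 5 documents shipped from the shards, only 3 survive the merge:
print(merge_shard_results([shard0, shard1], offset=0, rows=3))
# [(9.1, 'a'), (8.5, 'c'), (7.0, 'd')]
```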


Greetings,
Kuli


Hunspell stemmer solr 3.4

2012-05-14 Thread search engn dev
I am currently using Solr 3.4 in my application, and I can't upgrade to 3.5
right now due to some problems. I want to use the Hunspell stemmer in Solr
3.4; what changes do I need to make for this?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Hunspell-stemmer-solr-3-4-tp3983630.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: How add custom field to Nutch1.4?

2012-05-14 Thread Markus Jelsma
Please ask Nutch related questions only on the Nutch users mailing 
list.

Thanks.

On Sun, 13 May 2012 20:18:37 -0700 (PDT), forwardswing 
 wrote:

who can help me ?

--
View this message in context:

http://lucene.472066.n3.nabble.com/How-add-custom-field-to-Nutch1-4-tp3983549p3983597.html
Sent from the Solr - User mailing list archive at Nabble.com.


--



Re: Show a portion of searchable text in Solr

2012-05-14 Thread Ahmet Arslan
> I have indexed very large documents. In some cases these documents have
> 100,000 characters. Is there a way to return a portion of the documents
> (let's say the first 300 characters) when I am querying Solr? Is there any
> attribute to set in schema.xml or solrconfig.xml to achieve this?

I have a set-up with very large documents too. Here are two different solutions 
that I have used in the past:

1) Use highlighting with hl.alternateField and hl.maxAlternateFieldLength
http://wiki.apache.org/solr/HighlightingParameters

2) Create an extra field (indexed="false" and stored="true") using copyField 
just for display purposes. (&fl=shortField)


http://wiki.apache.org/solr/SchemaXml#Copy_Fields

Also, didn't used by myself yet but I *think* this can be accomplished by using 
a custom Transformer too. http://wiki.apache.org/solr/DocTransformers
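
A minimal sketch of option (1) as request parameters. The hl.* parameter names
are from the wiki page above; the host, core path and the field name "content"
are assumptions for illustration:

```python
# Build a Solr query that falls back to the first 300 chars of the stored
# field when no highlight snippet is produced. Field/host names are assumed.
from urllib.parse import urlencode

params = {
    "q": "some query",
    "hl": "true",
    "hl.fl": "content",                  # field to highlight
    "hl.alternateField": "content",      # fall back to the raw stored field...
    "hl.maxAlternateFieldLength": 300,   # ...truncated to 300 characters
}
url = "http://localhost:8983/solr/select?" + urlencode(params)
print(url)
```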


Relicating a large solr index

2012-05-14 Thread Rohit
Hi,

 

I have a large Solr index which needs to be replicated. Solr replication
starts but then keeps breaking and starting again from 0. Is there another
way to achieve this? I was thinking of using scp to copy the index from
master to slave and then enabling replication; will this work?


 

Regards,

Rohit

 



Re: Index Corruption

2012-05-14 Thread Shubham Srivastava
I am using 3.5.
 

- Original Message -
From: Lance Norskog [mailto:goks...@gmail.com]
Sent: Monday, May 14, 2012 11:08 AM
To: solr-user@lucene.apache.org 
Subject: Re: Index Corruption

"Index corruption" usually means data structure problems. There is a
Lucene program 'org.apache.lucene.index.CheckIndex' in the lucene core
jar. If there is a problem with the data structures, this program will
find it:

java -cp lucene-core-XX.jar org.apache.lucene.index.CheckIndex /index/data

Do you use Solr 3.1, 3.2 or 3.3? There was an index flushing bug in
this series of Solr releases. Solr 3.4, 3.5 and 3.6 don't have the
problem, and the trunk never had the problem.

You should not (to my knowledge) ever have duplicated documents if
there is a crash while indexing. If this happens, but there is no
Lucene index corruption, please file a bug.

Lance

On Sun, May 13, 2012 at 1:51 PM, shubham  wrote:
> We have had a problem in the last couple of days when a particular Solr master
> was restarted while an import was running. This led to the corruption of
> some document entities, which ended up with multiple docs of the same unique id, etc.
>
> Is this kind of corruption possible? By now I expected that Solr indexing
> works in a way where either the data is completely imported/updated or nothing
> has changed. But with this there exists a third possibility, which is pretty
> risky. Apart from writing queries to generate alerts when some kind of
> corruption occurs, is there a recommended way to do the same? However, why the
> corruption happened still bothers me.
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Index-Corruption-tp3983579.html
> Sent from the Solr - User mailing list archive at Nabble.com.



-- 
Lance Norskog
goks...@gmail.com


  1   2   >