Unknown field

2012-05-17 Thread Tolga

Hi,

Is there a way to know which fields to add to schema.xml before crawling with 
Nutch, rather than crawling over and over again and fixing the fields 
one by one?


Regards,


Re: Question about sampling

2012-05-17 Thread Lance Norskog
Yes. The trick is to use a hash value on each document. The
SignatureUpdateProcessor provides a tool for this. Store the hash
value in a hex string field.

Now, do wildcard queries on the hash string: hash:a* will randomly
choose 1/16 of the documents. hash:00* will pick 1/256 of the
documents.
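
The prefix arithmetic can be checked offline with a rough Python sketch. It assumes the stored signature is an MD5 hex digest of the document id; the ids and helper names are made up for illustration and are not part of Solr.

```python
import hashlib

def hex_signature(doc_id: str) -> str:
    """Stand-in for the stored hash field: MD5 of the id as a hex string."""
    return hashlib.md5(doc_id.encode("utf-8")).hexdigest()

def sample_fraction(doc_ids, prefix: str) -> float:
    """Fraction of documents a query like hash:<prefix>* would match."""
    hits = sum(1 for d in doc_ids if hex_signature(d).startswith(prefix))
    return hits / len(doc_ids)

# Hypothetical document ids, only used to exercise the idea.
doc_ids = [f"doc-{i}" for i in range(16000)]

print(sample_fraction(doc_ids, "a"))   # roughly 1/16
print(sample_fraction(doc_ids, "00"))  # roughly 1/256
```

Each extra hex character in the prefix shrinks the sample by about another factor of 16, so the prefix length sets the sampling rate.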

On Wed, May 16, 2012 at 6:43 AM, Yuval Dotan  wrote:
> Hi Guys
> We have an environment containing billions of documents.
> Faceting over this large result set could take many seconds, and so we
> thought we might be able to use statistical sampling of a smaller result
> set from the facet, and give an approximate result much quicker.
> Is there any way to facet only a random sample of the results?
> Thanks
> Yuval



-- 
Lance Norskog
goks...@gmail.com


Re: fq syntax question

2012-05-17 Thread Chris Hostetter

: No. fq queries are standard syntax queries. But they can be arbitrarily
: complex, i.e. fq=model:(member OR new_member)

using param references, you can also do some interesting things like...

  fq={!term f=model v=$model}&model=member

...which can come in handy for hardcoding certain rules in your 
defaults, and then overriding them at request time.

but that doesn't help with the "multiple OR clauses" type situation in 
your example (because the variable references only look at the first 
value)...
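
For anyone building these requests from a client, here is a small Python sketch of both forms as URL-encoded query strings. The function names and the q=*:* main query are assumptions for illustration; only the fq strings come from this thread.

```python
from urllib.parse import urlencode, parse_qs

def term_filter_params(model: str) -> str:
    """fq uses a param reference ($model) that is filled in per request."""
    params = [
        ("q", "*:*"),                        # assumed main query
        ("fq", "{!term f=model v=$model}"),  # could live in handler defaults
        ("model", model),                    # overridden at request time
    ]
    return urlencode(params)

def or_filter_params(models) -> str:
    """Several values need a normal boolean clause instead of a reference."""
    clause = "model:(" + " OR ".join(models) + ")"
    return urlencode([("q", "*:*"), ("fq", clause)])

print(term_filter_params("member"))
print(or_filter_params(["member", "new_member"]))
```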

: > Is it possible to pass the fq parameter with alternative syntax like:
: > fq=model=member&model= new_member or in other way?


-Hoss


Re: Posting JSON Data to Solr using XHR?

2012-05-17 Thread Chris Hostetter

: I am trying to post JSON Data to Solr using XHR / JQuery and it doesn't seem

You are not POSTing any JSON data.  In this method...

: var jqxhr = $.post(url, { "id" : "978-0545139700",
: "cat" : "book",
: "name" : "Harry Potter and the Deathly Hallows",
: "author" : "J K Rowling",
: "price" : "13.65",
: "pages_i" : "787"
: },

...what you are passing the post() method is an anonymous map containing keys 
"id", "cat", "name", etc... which jquery then treats as form data sent 
using application/x-www-form-urlencoded

http://api.jquery.com/jQuery.ajax/#sending-data-to-server

If you want to POST json data you need to specify the JSON contentType 
(and i believe: serialize the map to a single string yourself ... there 
are lots of examples if you google "POST json data with jquery")
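
The difference between the two encodings is easy to see outside the browser. The Python sketch below builds both request bodies without sending anything. The URL is an assumption about a local Solr 3.x JSON update handler, and the helper names are made up.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

doc = {
    "id": "978-0545139700",
    "cat": "book",
    "name": "Harry Potter and the Deathly Hallows",
    "author": "J K Rowling",
    "price": "13.65",
    "pages_i": "787",
}

def as_form_body(fields: dict) -> bytes:
    """What a plain key/value map becomes: one request param per key."""
    return urlencode(fields).encode("utf-8")

def json_update_request(url: str, docs: list) -> Request:
    """Serialize the documents to one JSON string and label it as JSON."""
    body = json.dumps(docs).encode("utf-8")
    return Request(url, data=body,
                   headers={"Content-Type": "application/json"},
                   method="POST")

# Hypothetical endpoint; adjust to your own handler path.
req = json_update_request("http://localhost:8983/solr/update/json", [doc])
print(as_form_body(doc))   # id=978-0545139700&cat=book&...
print(req.data)            # [{"id": "978-0545139700", ...}]
```

The same two steps apply in jQuery: set the JSON content type and pass a pre-serialized string rather than a map.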

: Here is the message on the console. 

You can see from the logs that the individual key=val pairs in your map were 
sent to solr as request params...

: *INFO: [collection1] webapp=/solr path=/update
: 
params={id=978-0545139700&author=J+K+Rowling&cat=book&price=13.65&pages_i=787&commit=true&name=Harry+Potter+and+the+Deathly+Hallows}
: {commit=} 0 29


-Hoss


Re: Urgent! Highlighting not working as expected

2012-05-17 Thread Chris Hostetter

copyField is a literal operation that happens at index time -- but it 
really has no bearing whatsoever on highlighting done at query time.  
there is no "memory" of what source fields any values came from, so it 
doesn't affect things in any way.

You haven't provided any details about your schema, but i suspect the 
reason you are only seeing highlights in the "cr_name" field (and 
not any other field containing the value) is because highlighting can only 
work if the field value is stored...

http://wiki.apache.org/solr/FieldOptionsByUseCase

...if your "text" field is stored=false, then solr can't highlight it at 
all, so it is never even considered when you use hl.fl=*

: I queried Solr (3.5) with this: q=text:"G-Money"&hl=true&hl.fl=*, where text
: is a "text" field and all the other fields were copied to it. I got three
: records returned, however, only one field (also "text" field) was
: highlighted: 
: 
: 
: 
: G-MONEY HETZEL
: 
: 
: 
: 
: 
: 
: But the other two also have matched fields (that is why they are returned),
: but they are "string" field, they were not highlighted. Also, in the same
: record "cr_149107", the "string" field "cr_firstname" has exactly matched
: string "G-Money", but it was not highlighted. But if I search on this field:
: q=cr_firstname:"G-Money"&hl=true&hl.fl=*, it will be highlighted. Any idea
: what shall I do to let both "text" and "string" fields highlighted? 


-Hoss


Re: highlighter not respecting sentence boundry

2012-05-17 Thread abhayd
hi 

It did work in many cases, but now I see many cases where it is not working.
Is this something to do with analysis? I'm using the word delimiter factory on
the field which is being used for hl.fl.

Should this field not be tokenized? Should I use one field for search and a copy of it 
for hl.fl? 

--
View this message in context: 
http://lucene.472066.n3.nabble.com/highlighter-not-respecting-sentence-boundry-tp3984327p3984564.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Exception in DataImportHandler (stack overflow)

2012-05-17 Thread Shawn Heisey

On 5/17/2012 3:01 PM, Dyer, James wrote:

> Do you think this behavior is because, while the indexing is paused, you reach 
> some type of timeout so either your db or the jdbc cuts the connection?  Or, are 
> you thinking something in the DIH/JDBCDataSource code is causing the connection 
> to drop under these circumstances?


I'm almost positive that it's the db or jdbc that cuts the connection, 
probably the former.  The last time I ran into it (which was before Solr 
3.5), Solr's indexing was paused for eight minutes while merges 
finished, and I think we have a five minute timeout.  I don't think I 
saw the same exception that Jon is seeing, but I don't have a record, so 
I can't check.


My test server is SATA RAID1, and I have also done some indexing onto a 
USB2/SATA drive, which is SLOW.  I've never run into the timeout problem 
on my production system, but those machines are running six 1TB drives 
in RAID10.  Lots of IOPS.


With an effective mergeFactor of 35, I merge much less often and I never 
see a third-level merge.  I haven't calculated how big my index has to 
get before I will see a third level merge, but with my settings (see 
below, because I modified the config snippet I pasted in earlier) I 
should keep indexing even with three merges happening.


Solr 3.6 API for ConcurrentMergeScheduler:
http://bit.ly/JNmNY4

I did remove one line from the indexDefaults that I pasted in: I also 
set maxThreadCount to 4, even though I am not doing a multithreaded 
DIH.  I removed it because I thought it might be confusing to have it 
there.  Turns out that was a bad idea.  After looking at the 3.6 source 
code for ConcurrentMergeScheduler, I believe that maxThreadCount is 
required, but maxMergeCount defaults to maxThreadCount plus two, so 
maxMergeCount would not actually be required, as long as maxThreadCount 
is set to at least 4.  Without the explicit configuration, maxThreadCount 
would default to three or less, depending on how many CPUs you have.


  private int maxThreadCount =
      Math.max(1, Math.min(3, Runtime.getRuntime().availableProcessors()/2));
  private int maxMergeCount = maxThreadCount+2;
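
To see what those two initializers work out to for different machines, here is a small Python port of the same arithmetic. The function names are made up; only the formula comes from the lines quoted above.

```python
def default_max_thread_count(cpus: int) -> int:
    """Port of max(1, min(3, availableProcessors / 2)) using integer division."""
    return max(1, min(3, cpus // 2))

def default_max_merge_count(cpus: int) -> int:
    """Port of maxThreadCount + 2, evaluated with the default thread count."""
    return default_max_thread_count(cpus) + 2

for cpus in (1, 2, 4, 8, 16):
    print(cpus, default_max_thread_count(cpus), default_max_merge_count(cpus))
```

With these defaults the thread count never exceeds three, so a value of 4 has to be configured explicitly.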



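<indexDefaults>
  <useCompoundFile>false</useCompoundFile>
  <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
    <int name="maxMergeAtOnce">35</int>
    <int name="segmentsPerTier">35</int>
    <int name="maxMergeAtOnceExplicit">105</int>
  </mergePolicy>
  <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
    <int name="maxMergeCount">4</int>
    <int name="maxThreadCount">4</int>
  </mergeScheduler>
  <ramBufferSizeMB>128</ramBufferSizeMB>
  <maxFieldLength>32768</maxFieldLength>
  <writeLockTimeout>1000</writeLockTimeout>
  <commitLockTimeout>1</commitLockTimeout>
  <lockType>native</lockType>
</indexDefaults>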




Re: boost not showing up in Solr 3.6 debugQueries?

2012-05-17 Thread Robert Muir
On Thu, May 17, 2012 at 4:51 PM, Tom Burton-West  wrote:

> But in Solr 3.6 I am not seeing the boost factor called out.
>
>  On the other hand it looks like it may now be incorporated in the
> queryNorm (Please see example below).
>
> Is there a bug in Solr 3.6 debugQueries?  Is there some new behavior
> regarding boosts and queryNorms? or am I missing something obvious?
>

Your queries are different. Your first example is a simple termquery.
The second example is a boolean query.

If you have a booleanquery(green frog) with a boost of 5, it
incorporates its boost into the query norm passed down to its
children.
So when leaf nodes normalize their weight, it includes all the boosts
from the parent hierarchy.
You can see what I mean if you look at BooleanWeight.normalize()

Because of how this is done, 3.x's explain confusingly only shows the
leaf node's explicit boost, since that's all it really knows.
To see what I mean, try something like booleanquery(green^2 frog^3)^5

In 4.x these boosts are split apart from and kept separate from the
query norm, so we could actually improve the explanations here I
think.

-- 
lucidimagination.com


RE: Exception in DataImportHandler (stack overflow)

2012-05-17 Thread Dyer, James
Shawn,

Do you think this behavior is because, while the indexing is paused, you reach 
some type of timeout so either your db or the jdbc cuts the connection?  Or, are 
you thinking something in the DIH/JDBCDataSource code is causing the connection 
to drop under these circumstances?

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311


-Original Message-
From: Shawn Heisey [mailto:s...@elyograg.org] 
Sent: Thursday, May 17, 2012 3:56 PM
To: solr-user@lucene.apache.org
Subject: Re: Exception in DataImportHandler (stack overflow)

 On 5/15/2012 3:42 PM, Jon Drukman wrote:
> I fixed it for now by upping the wait_timeout on the mysql server.
>   Apparently Solr doesn't like having its connection yanked out from under
> it and/or isn't smart enough to reconnect if the server goes away.  I'll
> set it back the way it was and try your readOnly option.

I use DIH with MySQL.  The only time I ran into timeouts while importing 
was related to segment merging.  A first level merge happens when the 
number of segments reaches mergeFactor.  A second level merge happens 
when the number of merged segments reaches mergeFactor.  A third level 
merge happens when you get enough segments created by second level 
merges.  It's probably possible for this to extend to fourth level and 
beyond, though I have not seen that personally.

When there are multiple merges happening at the same time (on 3.4 and 
earlier, 3.5 may have changed this), only one of them actually runs, the 
others are paused.  Eventually, if you have a slow I/O system (SATA 
RAID1 or slower) and a big enough index, your full-import can reach a 
state where you have all three levels happening at the same time.  When 
this happens, indexing stops.  If it stops for long enough, the server 
will close the connection and DIH will fail once it begins indexing again.

Since my DIH config consists of a single SELECT statement that runs for 
the entire three hour duration of the import, adding reconnect 
capability to DIH would not help.  The only way to make it work right is 
to configure things such that Solr never stops indexing.  I did this by 
increasing my mergeFactor, and when I installed Solr 3.5, used 
maxMergeAtOnce, segmentsPerTier, and maxMergeAtOnceExplicit.  I also 
increased maxMergeCount under mergeScheduler.  Here's my current 
indexDefaults section:


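<indexDefaults>
  <useCompoundFile>false</useCompoundFile>
  <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
    <int name="maxMergeAtOnce">35</int>
    <int name="segmentsPerTier">35</int>
    <int name="maxMergeAtOnceExplicit">105</int>
  </mergePolicy>
  <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
    <int name="maxMergeCount">4</int>
  </mergeScheduler>
  <ramBufferSizeMB>128</ramBufferSizeMB>
  <maxFieldLength>32768</maxFieldLength>
  <writeLockTimeout>1000</writeLockTimeout>
  <commitLockTimeout>1</commitLockTimeout>
  <lockType>native</lockType>
</indexDefaults>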


Thanks,
Shawn



Re: Exception in DataImportHandler (stack overflow)

2012-05-17 Thread Shawn Heisey

On 5/15/2012 3:42 PM, Jon Drukman wrote:

> I fixed it for now by upping the wait_timeout on the mysql server.
>   Apparently Solr doesn't like having its connection yanked out from under
> it and/or isn't smart enough to reconnect if the server goes away.  I'll
> set it back the way it was and try your readOnly option.


I use DIH with MySQL.  The only time I ran into timeouts while importing 
was related to segment merging.  A first level merge happens when the 
number of segments reaches mergeFactor.  A second level merge happens 
when the number of merged segments reaches mergeFactor.  A third level 
merge happens when you get enough segments created by second level 
merges.  It's probably possible for this to extend to fourth level and 
beyond, though I have not seen that personally.
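
As a rough way to estimate when each level kicks in, here is a Python sketch of a simplified cascading-merge model, where every mergeFactor segments at one level collapse into one segment at the next. It is an assumption-laden approximation, not what TieredMergePolicy actually does, and the function name is made up.

```python
def merges_by_level(flushed_segments: int, merge_factor: int) -> list:
    """Number of merges at level 1, 2, 3, ... under the simplified model."""
    counts = []
    segments = flushed_segments
    while segments >= merge_factor:
        segments //= merge_factor   # each merge consumes merge_factor segments
        counts.append(segments)
    return counts

print(merges_by_level(1000, 10))    # [100, 10, 1]
print(merges_by_level(42874, 35))   # [1224, 34]
print(merges_by_level(42875, 35))   # [1225, 35, 1]
```

Under this model the first third-level merge needs mergeFactor cubed flushed segments, which is why a larger mergeFactor pushes it so much further out.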


When there are multiple merges happening at the same time (on 3.4 and 
earlier, 3.5 may have changed this), only one of them actually runs, the 
others are paused.  Eventually, if you have a slow I/O system (SATA 
RAID1 or slower) and a big enough index, your full-import can reach a 
state where you have all three levels happening at the same time.  When 
this happens, indexing stops.  If it stops for long enough, the server 
will close the connection and DIH will fail once it begins indexing again.


Since my DIH config consists of a single SELECT statement that runs for 
the entire three hour duration of the import, adding reconnect 
capability to DIH would not help.  The only way to make it work right is 
to configure things such that Solr never stops indexing.  I did this by 
increasing my mergeFactor, and when I installed Solr 3.5, used 
maxMergeAtOnce, segmentsPerTier, and maxMergeAtOnceExplicit.  I also 
increased maxMergeCount under mergeScheduler.  Here's my current 
indexDefaults section:



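<indexDefaults>
  <useCompoundFile>false</useCompoundFile>
  <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
    <int name="maxMergeAtOnce">35</int>
    <int name="segmentsPerTier">35</int>
    <int name="maxMergeAtOnceExplicit">105</int>
  </mergePolicy>
  <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
    <int name="maxMergeCount">4</int>
  </mergeScheduler>
  <ramBufferSizeMB>128</ramBufferSizeMB>
  <maxFieldLength>32768</maxFieldLength>
  <writeLockTimeout>1000</writeLockTimeout>
  <commitLockTimeout>1</commitLockTimeout>
  <lockType>native</lockType>
</indexDefaults>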


Thanks,
Shawn



boost not showing up in Solr 3.6 debugQueries?

2012-05-17 Thread Tom Burton-West
Hello all,

In Solr 3.4, the boost factor is explicitly shown in debugQueries:


0.37087926 = (MATCH) sum of:
  0.3708323 = (MATCH) weight(ocr:dog^1000.0 in 215624), product of:
    0.995 = queryWeight(ocr:dog^1000.0), product of:
      1000.0 = boost
      2.32497 = idf(docFreq=237626, maxDocs=893970)
      4.3011288E-4 = queryNorm
    0.37083247 = (MATCH) fieldWeight(ocr:dog in 215624), product of:
      27.221315 = tf(termFreq(ocr:dog)=741)
...

But in Solr 3.6 I am not seeing the boost factor called out.

On the other hand, it looks like it may now be incorporated in the
queryNorm (please see the example below).

Is there a bug in Solr 3.6 debugQueries?  Is there some new behavior
regarding boosts and queryNorms? or am I missing something obvious?

(Apologies for the Japanese query, but right now the only index I have
in Solr 3.6 is for CJK and this is one of the queries from our log.)

Tom Burton-West




   兵にな^1000 OR hanUnigrams:兵にな
   兵にな^1000 OR hanUnigrams:兵にな
  ((+ocr:兵に +ocr:にな)^1000.0) hanUnigrams:兵
  ((+ocr:兵に +ocr:にな)^1000.0)
hanUnigrams:兵

  

0.15685473 = (MATCH) sum of:
  0.15684697 = (MATCH) sum of:
    0.0067602023 = (MATCH) weight(ocr:兵に in 213594), product of:
      0.81443477 = queryWeight(ocr:兵に), product of:
        3.3998778 = idf(docFreq=70130, maxDocs=772972)
        0.23954825 = queryNorm
      0.008300483 = (MATCH) fieldWeight(ocr:兵に in 213594), product of:
        1.0 = tf(termFreq(ocr:兵に)=1)
        3.3998778 = idf(docFreq=70130, maxDocs=772972)
        0.0024414062 = fieldNorm(field=ocr, doc=213594)
    0.15008678 = (MATCH) weight(ocr:にな in 213594), product of:
      0.5802551 = queryWeight(ocr:にな), product of:
        2.422289 = idf(docFreq=186410, maxDocs=772972)
        0.23954825 = queryNorm
      0.25865653 = (MATCH) fieldWeight(ocr:にな in 213594), product of:
        43.737854 = tf(termFreq(ocr:にな)=1913)
        2.422289 = idf(docFreq=186410, maxDocs=772972)
        0.0024414062 = fieldNorm(field=ocr, doc=213594)
  7.76674E-6 = (MATCH) weight(hanUnigrams:兵 in 213594), product of:
    2.9968342E-4 = queryWeight(hanUnigrams:兵), product of:
      1.2510358 = idf(docFreq=601367, maxDocs=772972)
      2.3954824E-4 = queryNorm
    0.025916481 = (MATCH) fieldWeight(hanUnigrams:兵 in 213594), product of:
      4.2426405 = tf(termFreq(hanUnigrams:兵)=18)
      1.2510358 = idf(docFreq=601367, maxDocs=772972)
      0.0048828125 = fieldNorm(field=hanUnigrams, doc=213594)
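
The numbers in the 3.6 output above are at least consistent with the ^1000 boost being folded into queryNorm for the boosted ocr clauses. A quick Python check, using only values copied from the explain output (the variable and function names are made up):

```python
import math

BOOST = 1000.0
NORM_UNBOOSTED = 2.3954824e-4   # queryNorm on the hanUnigrams clause
NORM_BOOSTED = 0.23954825       # queryNorm on the ocr clauses
IDF_OCR = 3.3998778             # idf for the first ocr bigram
QUERY_WEIGHT_OCR = 0.81443477   # queryWeight for the first ocr bigram

def folded_norm(norm: float, boost: float) -> float:
    """queryNorm as a leaf would see it if the parent boost is multiplied in."""
    return norm * boost

print(folded_norm(NORM_UNBOOSTED, BOOST))                      # about 0.23954824
print(math.isclose(folded_norm(NORM_UNBOOSTED, BOOST),
                   NORM_BOOSTED, rel_tol=1e-6))                # True
print(math.isclose(IDF_OCR * NORM_BOOSTED,
                   QUERY_WEIGHT_OCR, rel_tol=1e-6))            # True
```

So the queryWeight of the boosted clauses is idf times a queryNorm that is already 1000 times larger, with no separate boost line shown.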



Re: org.apache.solr.common.SolrException: org.apache.lucene.queryParser.ParseException: Cannot parse

2012-05-17 Thread Jack Krupansky

SKU should be type "string" and then SKU_text would be your text type.

Or, you can do it the opposite: SKU would be text and SKU_string for the raw 
string value for precise wildcards and faceting.


The Solr example does have "sku" as a text field. You can do it that way or 
the opposite. Whichever feels more natural for your application.


-- Jack Krupansky

-Original Message- 
From: Prachi Phatak

Sent: Thursday, May 17, 2012 4:15 PM
To: solr-user@lucene.apache.org
Subject: RE: org.apache.solr.common.SolrException: 
org.apache.lucene.queryParser.ParseException: Cannot parse


So do you mean I should change class="solr.TextField" to 
class="solr.StrField"?


-Original Message-
From: Jack Krupansky [mailto:j...@basetechnology.com]
Sent: Thursday, May 17, 2012 3:00 PM
To: solr-user@lucene.apache.org
Subject: Re: org.apache.solr.common.SolrException: 
org.apache.lucene.queryParser.ParseException: Cannot parse


Sorry, my suggestion for the escaped left parenthesis is if you change SKU 
to be a string field. And then have SKU_text as a copy of that field (add a 
copyField to your schema.xml for SKU to SKU_text) but with some "text" 
type - then you could simply say SKU_text:soft .

-- Jack Krupansky

-Original Message-
From: Jack Krupansky
Sent: Thursday, May 17, 2012 3:50 PM
To: solr-user@lucene.apache.org
Subject: Re: org.apache.solr.common.SolrException:
org.apache.lucene.queryParser.ParseException: Cannot parse

Code? I'm not sure what you're referring to. These changes are in schema.xml 
and solrconfig.xml.


In your query, you need to change:

SKU:soft(*^1.0

to

SKU:soft\(*^1.0

-- Jack Krupansky

-Original Message-
From: Prachi Phatak
Sent: Thursday, May 17, 2012 3:25 PM
To: solr-user@lucene.apache.org ; Prachi Phatak
Subject: RE: org.apache.solr.common.SolrException:
org.apache.lucene.queryParser.ParseException: Cannot parse

Can I do this in the configuration, or do I have to change my code?

-Original Message-
From: Jack Krupansky [mailto:j...@basetechnology.com]
Sent: Thursday, May 17, 2012 2:23 PM
To: solr-user@lucene.apache.org; Prachi Phatak
Subject: Re: org.apache.solr.common.SolrException:
org.apache.lucene.queryParser.ParseException: Cannot parse

Okay, first, now that we can see your data, it looks to me like you should 
keep it in two fields: 1) a "string" field for exact match, faceting, and 
precise wildcarding, and 2) copy to a "text" field for searching by keyword.

For the latter, use a field type/analyzer comparable to "text_en_splitting".

Any time you have a query term with anything other than letters, digits, 
hyphens, and underscores, you should enclose it in quotes (double quote).


For example: "Soft(drink)"

So the query parsing exception is really just telling you to enclose your 
term in quotes.


But if your terms are in a "text" field, those special characters will be 
deleted and treated as spaces anyway, so that "Soft(drink)" would be 
equivalent to "soft drink" (or even "Soft-Drink".)


But if you also store your product category as a "string" field, you would 
have to use either the full category name in quotes, or use wildcards, such 
as "Women's Wear" or Women*.


You can also escape non-letter characters, such as: Women\'s\ Wear - note the 
escape of the space between words, needed for string fields, but not in text 
fields.
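
If the query string is built in client code, the quoting and escaping rules above can be applied with a couple of small helpers. This Python sketch follows the rule of thumb from this message (leave letters, digits, hyphens and underscores alone, backslash-escape everything else). The function names are made up, and it has not been checked against every query parser corner case, such as a leading hyphen.

```python
def escape_term(term: str) -> str:
    """Backslash-escape every character except letters, digits, '-' and '_'."""
    return "".join(
        ch if (ch.isalnum() or ch in "-_") else "\\" + ch
        for ch in term
    )

def quote_term(term: str) -> str:
    """Wrap a term in double quotes, escaping embedded backslashes and quotes."""
    return '"' + term.replace("\\", "\\\\").replace('"', '\\"') + '"'

print("SKU:" + escape_term("soft(") + "*^1.0")   # SKU:soft\(*^1.0
print(escape_term("Women's Wear"))               # Women\'s\ Wear
print(quote_term("Soft(drink)"))                 # "Soft(drink)"
```

The wildcard is appended after escaping so that the * keeps its special meaning.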


-- Jack Krupansky

-Original Message-
From: Prachi Phatak
Sent: Thursday, May 17, 2012 2:52 PM
To: solr-user@lucene.apache.org
Subject: org.apache.solr.common.SolrException:
org.apache.lucene.queryParser.ParseException: Cannot parse

My configuration


 
   
   

   

 
 
   
   

   



 
   
Data:
Coke ProductSoftdrink
Coke ProductSoftdrink
Coke ProductSoftdrink
Coke() ProductSoftdrink
Pepsi ProductSoftdrink
Pepsi ProductSoftdrink
Other ProductSoft(drink)
Domestic-BeerBeer-34333
Domestic-BeerBeer-34333
Domestic-BeerBeer
Domestic BeerBeer
Import Beer+9Beer
Import BeerBeer
Import BeerBeer
T-ShirtShirt
T-ShirtShirt
T-ShirtShirt
T-ShirtShirt-34333
T-ShirtShirt
BlouseWomen's-Wear
BlouseWomen's-Wear
Skirt%3Women's Wear
SkirtWomen's Wear
DressFormal Wear
Whenever I search for Soft(, it gives me the following error, and if I
try to search for 34333, it gives no results
SEVERE: org.apache.solr.common.SolrException:
org.apache.lucene.queryParser.ParseException: Cannot parse
'+DIMENSION:Product^1.0 +( (SKU:soft(*^1.0 DIM_ATTR_ONE:soft(*^1.0
DIM_ATTR_ Was expecting one of:
...
...
...
   "+" ...
   "-" ...
   "(" ...
   ")" ...
   "*" ...
   "^" ...
...
...
...
...
   "[" ...
   "{" ...
...
   at
org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:105)
   at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:173)
   at
org.a

RE: org.apache.solr.common.SolrException: org.apache.lucene.queryParser.ParseException: Cannot parse

2012-05-17 Thread Prachi Phatak
So do you mean I should change class="solr.TextField" to 
class="solr.StrField"?

-Original Message-
From: Jack Krupansky [mailto:j...@basetechnology.com] 
Sent: Thursday, May 17, 2012 3:00 PM
To: solr-user@lucene.apache.org
Subject: Re: org.apache.solr.common.SolrException: 
org.apache.lucene.queryParser.ParseException: Cannot parse

Sorry, my suggestion for the escaped left parenthesis is if you change SKU to 
be a string field. And then have SKU_text as a copy of that field (add a 
copyField to your schema.xml for SKU to SKU_text) but with some "text" 
type - then you could simply say SKU_text:soft .

-- Jack Krupansky

-Original Message-
From: Jack Krupansky
Sent: Thursday, May 17, 2012 3:50 PM
To: solr-user@lucene.apache.org
Subject: Re: org.apache.solr.common.SolrException: 
org.apache.lucene.queryParser.ParseException: Cannot parse

Code? I'm not sure what you're referring to. These changes are in schema.xml 
and solrconfig.xml.

In your query, you need to change:

SKU:soft(*^1.0

to

SKU:soft\(*^1.0

-- Jack Krupansky

-Original Message-
From: Prachi Phatak
Sent: Thursday, May 17, 2012 3:25 PM
To: solr-user@lucene.apache.org ; Prachi Phatak
Subject: RE: org.apache.solr.common.SolrException:
org.apache.lucene.queryParser.ParseException: Cannot parse

Can I do this in the configuration, or do I have to change my code?

-Original Message-
From: Jack Krupansky [mailto:j...@basetechnology.com]
Sent: Thursday, May 17, 2012 2:23 PM
To: solr-user@lucene.apache.org; Prachi Phatak
Subject: Re: org.apache.solr.common.SolrException:
org.apache.lucene.queryParser.ParseException: Cannot parse

Okay, first, now that we can see your data, it looks to me like you should keep 
it in two fields: 1) a "string" field for exact match, faceting, and precise 
wildcarding, and 2) copy to a "text" field for searching by keyword.
For the latter, use a field type/analyzer comparable to "text_en_splitting".

Any time you have a query term with anything other than letters, digits, 
hyphens, and underscores, you should enclose it in quotes (double quote).

For example: "Soft(drink)"

So the query parsing exception is really just telling you to enclose your term 
in quotes.

But if your terms are in a "text" field, those special characters will be 
deleted and treated as spaces anyway, so that "Soft(drink)" would be equivalent 
to "soft drink" (or even "Soft-Drink".)

But if you also store your product category as a "string" field, you would have 
to use either the full category name in quotes, or use wildcards, such as 
"Women's Wear" or Women*.

You can also escape non-letter characters, such as: Women\'s\ Wear - note the 
escape of the space between words, needed for string fields, but not in text 
fields.

-- Jack Krupansky

-Original Message-
From: Prachi Phatak
Sent: Thursday, May 17, 2012 2:52 PM
To: solr-user@lucene.apache.org
Subject: org.apache.solr.common.SolrException:
org.apache.lucene.queryParser.ParseException: Cannot parse

My configuration
>  positionIncrementGap="100">
>  
>
> words="stopwords.txt" enablePositionIncrements="true" />
>  maxGramSize="15" side="front"/>
>
> 
>  
>  
>
> ignoreCase="true" expand="true"/>
>  maxGramSize="15" side="front"/>
>ignoreCase="true"
>words="stopwords.txt"
>enablePositionIncrements="true"
>/>
>  preserveOriginal="1" generateWordParts="1" generateNumberParts="1"
> catenateWords="1" catenateNumbers="1" catenateAll="0"
> splitOnCaseChange="0"/>
> 
> 
>  
>
> Data:
> Coke ProductSoftdrink
> Coke ProductSoftdrink
> Coke ProductSoftdrink
> Coke() ProductSoftdrink
> Pepsi ProductSoftdrink
> Pepsi ProductSoftdrink
> Other ProductSoft(drink)
> Domestic-BeerBeer-34333
> Domestic-BeerBeer-34333
> Domestic-BeerBeer
> Domestic BeerBeer
> Import Beer+9Beer
> Import BeerBeer
> Import BeerBeer
> T-ShirtShirt
> T-ShirtShirt
> T-ShirtShirt
> T-ShirtShirt-34333
> T-ShirtShirt
> BlouseWomen's-Wear
> BlouseWomen's-Wear
> Skirt%3Women's Wear
> SkirtWomen's Wear
> DressFormal Wear
> Whenever I search for Soft(, it gives me the following error, and if I
> try to search for 34333, it gives no results
> SEVERE: org.apache.solr.common.SolrException:
> org.apache.lucene.queryParser.ParseException: Cannot parse
> '+DIMENSION:Product^1.0 +( (SKU:soft(*^1.0 DIM_ATTR_ONE:soft(*^1.0 
> DIM_ATTR_ Was expecting one of:
> ...
> ...
> ...
>"+" ...
>"-" ...
>"(" ...
>")" ...
>"*" ...
>"^" ...
> ...
> ...
> ...
> ...
>"[" ...
>"{" ...
> ...
>at
> org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:105)
>at
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:173)
> 

Re: org.apache.solr.common.SolrException: org.apache.lucene.queryParser.ParseException: Cannot parse

2012-05-17 Thread Jack Krupansky
Sorry, my suggestion for the escaped left parenthesis is if you change SKU 
to be a string field. And then have SKU_text as a copy of that field (add a 
copyField to your schema.xml for SKU to SKU_text) but with some "text" 
type - then you could simply say SKU_text:soft .


-- Jack Krupansky

-Original Message- 
From: Jack Krupansky

Sent: Thursday, May 17, 2012 3:50 PM
To: solr-user@lucene.apache.org
Subject: Re: org.apache.solr.common.SolrException: 
org.apache.lucene.queryParser.ParseException: Cannot parse


Code? I'm not sure what you're referring to. These changes are in schema.xml
and solrconfig.xml.

In your query, you need to change:

SKU:soft(*^1.0

to

SKU:soft\(*^1.0

-- Jack Krupansky

-Original Message- 
From: Prachi Phatak

Sent: Thursday, May 17, 2012 3:25 PM
To: solr-user@lucene.apache.org ; Prachi Phatak
Subject: RE: org.apache.solr.common.SolrException:
org.apache.lucene.queryParser.ParseException: Cannot parse

Can I do this in the configuration, or do I have to change my code?

-Original Message-
From: Jack Krupansky [mailto:j...@basetechnology.com]
Sent: Thursday, May 17, 2012 2:23 PM
To: solr-user@lucene.apache.org; Prachi Phatak
Subject: Re: org.apache.solr.common.SolrException:
org.apache.lucene.queryParser.ParseException: Cannot parse

Okay, first, now that we can see your data, it looks to me like you should
keep it in two fields: 1) a "string" field for exact match, faceting, and
precise wildcarding, and 2) copy to a "text" field for searching by keyword.
For the latter, use a field type/analyzer comparable to "text_en_splitting".

Any time you have a query term with anything other than letters, digits,
hyphens, and underscores, you should enclose it in quotes (double quote).

For example: "Soft(drink)"

So the query parsing exception is really just telling you to enclose your
term in quotes.

But if your terms are in a "text" field, those special characters will be
deleted and treated as spaces anyway, so that "Soft(drink)" would be
equivalent to "soft drink" (or even "Soft-Drink".)

But if you also store your product category as a "string" field, you would
have to use either the full category name in quotes, or use wildcards, such
as "Women's Wear" or Women*.

You can also escape non-letter characters, such as: Women\'s\ Wear - note the
escape of the space between words, needed for string fields, but not in text
fields.

-- Jack Krupansky

-Original Message-
From: Prachi Phatak
Sent: Thursday, May 17, 2012 2:52 PM
To: solr-user@lucene.apache.org
Subject: org.apache.solr.common.SolrException:
org.apache.lucene.queryParser.ParseException: Cannot parse

My configuration


 
   
   

   

 
 
   
   

   



 
   
Data:
Coke ProductSoftdrink
Coke ProductSoftdrink
Coke ProductSoftdrink
Coke() ProductSoftdrink
Pepsi ProductSoftdrink
Pepsi ProductSoftdrink
Other ProductSoft(drink)
Domestic-BeerBeer-34333
Domestic-BeerBeer-34333
Domestic-BeerBeer
Domestic BeerBeer
Import Beer+9Beer
Import BeerBeer
Import BeerBeer
T-ShirtShirt
T-ShirtShirt
T-ShirtShirt
T-ShirtShirt-34333
T-ShirtShirt
BlouseWomen's-Wear
BlouseWomen's-Wear
Skirt%3Women's Wear
SkirtWomen's Wear
DressFormal Wear
Whenever I search for Soft(, it gives me the following error, and if I
try to search for 34333, it gives no results
SEVERE: org.apache.solr.common.SolrException:
org.apache.lucene.queryParser.ParseException: Cannot parse
'+DIMENSION:Product^1.0 +( (SKU:soft(*^1.0 DIM_ATTR_ONE:soft(*^1.0
DIM_ATTR_ Was expecting one of:
...
...
...
   "+" ...
   "-" ...
   "(" ...
   ")" ...
   "*" ...
   "^" ...
...
...
...
...
   "[" ...
   "{" ...
...
   at
org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:105)
   at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:173)
   at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1368)
   at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
   at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
   at
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
   at
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
   at
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
   at
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
   at
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
   at
org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
   at
org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
   at
org.

Re: using Tika (ExtractingRequestHandler)

2012-05-17 Thread Ahmet Arslan
> i'm looking at using Tika to index a
> bunch of documents. the wiki page seems to be a little bit
> out of date ("// TODO: this is out of date as of Solr 1.4 -
> dist/apache-solr-cell-1.4.jar and all of
> contrib/extraction/lib are needed") and it also looks a
> little incomplete.
> 
> is there an actual list of all the required jar files? i'm
> not sure they are in the same place in the 3.6.0
> distribution as they were in 1.4, and having an actual list
> would be very helpful in figuring out where they are.

Here is a list of 
  

If you want to use DIH :
 




Re: org.apache.solr.common.SolrException: org.apache.lucene.queryParser.ParseException: Cannot parse

2012-05-17 Thread Jack Krupansky
Code? I'm not sure what you're referring to. These changes are in schema.xml 
and solrconfig.xml.


In your query, you need to change:

SKU:soft(*^1.0

to

SKU:soft\(*^1.0

-- Jack Krupansky

-Original Message- 
From: Prachi Phatak

Sent: Thursday, May 17, 2012 3:25 PM
To: solr-user@lucene.apache.org ; Prachi Phatak
Subject: RE: org.apache.solr.common.SolrException: 
org.apache.lucene.queryParser.ParseException: Cannot parse


Can I do this in the configuration, or do I have to change my code?

-Original Message-
From: Jack Krupansky [mailto:j...@basetechnology.com]
Sent: Thursday, May 17, 2012 2:23 PM
To: solr-user@lucene.apache.org; Prachi Phatak
Subject: Re: org.apache.solr.common.SolrException: 
org.apache.lucene.queryParser.ParseException: Cannot parse


Okay, first, now that we can see your data, it looks to me like you should 
keep it in two fields: 1) a "string" field for exact match, faceting, and 
precise wildcarding, and 2) copy to a "text" field for searching by keyword.

For the latter, use a field type/analyzer comparable to "text_en_splitting".

Any time you have a query term with anything other than letters, digits, 
hyphens, and underscores, you should enclose it in quotes (double quote).


For example: "Soft(drink)"

So the query parsing exception is really just telling you to enclose your 
term in quotes.


But if your terms are in a "text" field, those special characters will be 
deleted and treated as spaces anyway, so that "Soft(drink)" would be 
equivalent to "soft drink" (or even "Soft-Drink".)


But if you also store your product category as a "string" field, you would 
have to use either the full category name in quotes, or use wildcards, such 
as "Women's Wear" or Women*.


You can also escape non-letter characters, such as: Women\'s\ Wear - note the 
escape of the space between words, needed for string fields, but not in text 
fields.


-- Jack Krupansky

-Original Message-
From: Prachi Phatak
Sent: Thursday, May 17, 2012 2:52 PM
To: solr-user@lucene.apache.org
Subject: org.apache.solr.common.SolrException:
org.apache.lucene.queryParser.ParseException: Cannot parse

My configuration


 
   
   

   

 
 
   
   

   



 
   
Data:
Coke Product    Softdrink
Coke Product    Softdrink
Coke Product    Softdrink
Coke() Product    Softdrink
Pepsi Product    Softdrink
Pepsi Product    Softdrink
Other Product    Soft(drink)
Domestic-Beer    Beer-34333
Domestic-Beer    Beer-34333
Domestic-Beer    Beer
Domestic Beer    Beer
Import Beer+9    Beer
Import Beer    Beer
Import Beer    Beer
T-Shirt    Shirt
T-Shirt    Shirt
T-Shirt    Shirt
T-Shirt    Shirt-34333
T-Shirt    Shirt
Blouse    Women's-Wear
Blouse    Women's-Wear
Skirt%3    Women's Wear
Skirt    Women's Wear
Dress    Formal Wear
Whenever I search for Soft(, it gives me the following error, and if I
try to search for 34333, it gives no results
SEVERE: org.apache.solr.common.SolrException:
org.apache.lucene.queryParser.ParseException: Cannot parse
'+DIMENSION:Product^1.0 +( (SKU:soft(*^1.0 DIM_ATTR_ONE:soft(*^1.0
DIM_ATTR_ Was expecting one of:
...
...
...
   "+" ...
   "-" ...
   "(" ...
   ")" ...
   "*" ...
   "^" ...
...
...
...
...
   "[" ...
   "{" ...
...
   at
org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:105)
   at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:173)
   at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1368)
   at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
   at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
   at
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
   at
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
   at
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
   at
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
   at
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
   at
org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
   at
org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
   at
org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
   at
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
   at org.mortbay.jetty.Server.handle(Server.java:326)
   at
org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
   at
org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
   at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
   at org.mortbay.jetty.HttpParser

RE: org.apache.solr.common.SolrException: org.apache.lucene.queryParser.ParseException: Cannot parse

2012-05-17 Thread Prachi Phatak
Can I do this in the configuration, or do I have to change my code?

-Original Message-
From: Jack Krupansky [mailto:j...@basetechnology.com] 
Sent: Thursday, May 17, 2012 2:23 PM
To: solr-user@lucene.apache.org; Prachi Phatak
Subject: Re: org.apache.solr.common.SolrException: 
org.apache.lucene.queryParser.ParseException: Cannot parse

Okay, first, now that we can see your data, it looks to me like you should keep 
it in two fields: 1) a "string" field for exact match, faceting, and precise 
wildcarding, and 2) copy to a "text" field for searching by keyword. 
For the latter, use a field type/analyzer comparable to "text_en_splitting".

Any time you have a query term with anything other than letters, digits, 
hyphens, and underscores, you should enclose it in quotes (double quote).

For example: "Soft(drink)"

So the query parsing exception is really just telling you to enclose your term 
in quotes.

But if your terms are in a "text" field, those special characters will be 
deleted and treated as spaces anyway, so that "Soft(drink)" would be equivalent 
to "soft drink" (or even "Soft-Drink".)

But if you also store your product category as a "string" field, you would have 
to use either the full category name in quotes, or use wildcards, such as 
"Women's Wear" or Women*.

You can also escape non-letter characters, such as: Women\'s\ Wear - note the 
escape of the space between words, needed for string fields, but not in text 
fields.

-- Jack Krupansky

-Original Message-
From: Prachi Phatak
Sent: Thursday, May 17, 2012 2:52 PM
To: solr-user@lucene.apache.org
Subject: org.apache.solr.common.SolrException: 
org.apache.lucene.queryParser.ParseException: Cannot parse

My configuration
>  positionIncrementGap="100">
>  
>
> words="stopwords.txt" enablePositionIncrements="true" />
>  maxGramSize="15" side="front"/>
>
> 
>  
>  
>
> ignoreCase="true" expand="true"/>
>  maxGramSize="15" side="front"/>
>ignoreCase="true"
>words="stopwords.txt"
>enablePositionIncrements="true"
>/>
>  preserveOriginal="1" generateWordParts="1" generateNumberParts="1" 
> catenateWords="1" catenateNumbers="1" catenateAll="0" 
> splitOnCaseChange="0"/>
> 
> 
>  
>
> Data:
> Coke Product    Softdrink
> Coke Product    Softdrink
> Coke Product    Softdrink
> Coke() Product    Softdrink
> Pepsi Product    Softdrink
> Pepsi Product    Softdrink
> Other Product    Soft(drink)
> Domestic-Beer    Beer-34333
> Domestic-Beer    Beer-34333
> Domestic-Beer    Beer
> Domestic Beer    Beer
> Import Beer+9    Beer
> Import Beer    Beer
> Import Beer    Beer
> T-Shirt    Shirt
> T-Shirt    Shirt
> T-Shirt    Shirt
> T-Shirt    Shirt-34333
> T-Shirt    Shirt
> Blouse    Women's-Wear
> Blouse    Women's-Wear
> Skirt%3    Women's Wear
> Skirt    Women's Wear
> Dress    Formal Wear
> Whenever I search for Soft(, it gives me the following error, and if I 
> try to search for 34333, it gives no results
> SEVERE: org.apache.solr.common.SolrException: 
> org.apache.lucene.queryParser.ParseException: Cannot parse
> '+DIMENSION:Product^1.0 +( (SKU:soft(*^1.0 DIM_ATTR_ONE:soft(*^1.0 
> DIM_ATTR_ Was expecting one of:
> ...
> ...
> ...
>"+" ...
>"-" ...
>"(" ...
>")" ...
>"*" ...
>"^" ...
> ...
> ...
> ...
> ...
>"[" ...
>"{" ...
> ...
>at
> org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:105)
>at
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:173)
>at
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
>at org.apache.solr.core.SolrCore.execute(SolrCore.java:1368)
>at
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
>at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
>at
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>at
> org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
>at
> org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>at
> org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
>at
> org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
>at
> org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
>at
> org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
>at
> org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
>at
> org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
>at org.mortbay.jetty.Server.handle(Server.java:326)
>at
> org.mortbay.jetty.HttpConnection.handleRequest(Htt

Re: org.apache.solr.common.SolrException: org.apache.lucene.queryParser.ParseException: Cannot parse

2012-05-17 Thread Jack Krupansky
Okay, first, now that we can see your data, it looks to me like you should 
keep it in two fields: 1) a "string" field for exact match, faceting, and 
precise wildcarding, and 2) copy to a "text" field for searching by keyword. 
For the latter, use a field type/analyzer comparable to "text_en_splitting".


Any time you have a query term with anything other than letters, digits, 
hyphens, and underscores, you should enclose it in quotes (double quote).


For example: "Soft(drink)"

So the query parsing exception is really just telling you to enclose your 
term in quotes.


But if your terms are in a "text" field, those special characters will be 
deleted and treated as spaces anyway, so that "Soft(drink)" would be 
equivalent to "soft drink" (or even "Soft-Drink".)


But if you also store your product category as a "string" field, you would 
have to use either the full category name in quotes, or use wildcards, such 
as "Women's Wear" or Women*.


You can also escape non-letter characters, such as: Women\'s\ Wear - note the 
escape of the space between words, needed for string fields, but not in text 
fields.
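
The quoting rule of thumb above could be applied automatically on the client side. The following is only a sketch under that rule, with a hypothetical helper name (quote_if_needed); it does not handle reserved words such as AND/OR/NOT or a leading hyphen, which the query parser also treats specially.

```python
import re

# Terms made only of letters, digits, hyphens, and underscores are passed
# through unquoted, following the rule of thumb stated above.
SAFE_TERM = re.compile(r"[A-Za-z0-9_-]+")


def quote_if_needed(term):
    """Return the term unchanged if it is 'safe', else as a quoted phrase."""
    if SAFE_TERM.fullmatch(term):
        return term
    # Inside a quoted phrase only backslash and double quote need escaping.
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return '"%s"' % escaped


for t in ["Softdrink", "Beer-34333", "Soft(drink)", "Women's Wear"]:
    print(quote_if_needed(t))
```

For a "string" field the quoted form has to match the full stored value, while for a "text" field the analyzer will still split the phrase into words.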


-- Jack Krupansky

-Original Message- 
From: Prachi Phatak

Sent: Thursday, May 17, 2012 2:52 PM
To: solr-user@lucene.apache.org
Subject: org.apache.solr.common.SolrException: 
org.apache.lucene.queryParser.ParseException: Cannot parse


My configuration
positionIncrementGap="100">

 
   
   words="stopwords.txt" enablePositionIncrements="true" />
maxGramSize="15" side="front"/>

   

 
 
   
   ignoreCase="true" expand="true"/>
maxGramSize="15" side="front"/>

   
preserveOriginal="1" generateWordParts="1" generateNumberParts="1" 
catenateWords="1" catenateNumbers="1" catenateAll="0" 
splitOnCaseChange="0"/>



 
   
Data:
Coke Product    Softdrink
Coke Product    Softdrink
Coke Product    Softdrink
Coke() Product    Softdrink
Pepsi Product    Softdrink
Pepsi Product    Softdrink
Other Product    Soft(drink)
Domestic-Beer    Beer-34333
Domestic-Beer    Beer-34333
Domestic-Beer    Beer
Domestic Beer    Beer
Import Beer+9    Beer
Import Beer    Beer
Import Beer    Beer
T-Shirt    Shirt
T-Shirt    Shirt
T-Shirt    Shirt
T-Shirt    Shirt-34333
T-Shirt    Shirt
Blouse    Women's-Wear
Blouse    Women's-Wear
Skirt%3    Women's Wear
Skirt    Women's Wear
Dress    Formal Wear
Whenever I search for Soft(, it gives me the following error, and if I try to 
search for 34333, it gives no results
SEVERE: org.apache.solr.common.SolrException: 
org.apache.lucene.queryParser.ParseException: Cannot parse 
'+DIMENSION:Product^1.0 +( (SKU:soft(*^1.0 DIM_ATTR_ONE:soft(*^1.0 
DIM_ATTR_

Was expecting one of:
...
...
...
   "+" ...
   "-" ...
   "(" ...
   ")" ...
   "*" ...
   "^" ...
...
...
...
...
   "[" ...
   "{" ...
...
   at 
org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:105)
   at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:173)
   at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)

   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1368)
   at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
   at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
   at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
   at 
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
   at 
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
   at 
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
   at 
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
   at 
org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
   at 
org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
   at 
org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
   at 
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)

   at org.mortbay.jetty.Server.handle(Server.java:326)
   at 
org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
   at 
org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)

   at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
   at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
   at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
   at 
org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
   at 
org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: org.apache.lucene.queryParser.ParseException: Cannot parse 
'+DIMENSION:Product^1.0 +( (SKU:soft(*^1.0 DIM_ATTR_ONE:soft(*^1

using Tika (ExtractingRequestHandler)

2012-05-17 Thread Welty, Richard
i'm looking at using Tika to index a bunch of documents. the wiki page seems to 
be a little bit out of date ("// TODO: this is out of date as of Solr 1.4 - 
dist/apache-solr-cell-1.4.jar and all of contrib/extraction/lib are needed") 
and it also looks a little incomplete.

is there an actual list of all the required jar files? i'm not sure they are in 
the same place in the 3.6.0 distribution as they were in 1.4, and having an 
actual list would be very helpful in figuring out where they are.

as for "Sending Documents to Solr", is there any plan to address this todo: "// 
TODO: describe the different ways to send the documents to solr (POST body, 
form encoded, remoteStreaming)". this is really just a nice to have, i can see 
how to accomplish my goals using a method that is currently documented.

thanks,
   richard


org.apache.solr.common.SolrException: org.apache.lucene.queryParser.ParseException: Cannot parse

2012-05-17 Thread Prachi Phatak
 My configuration
> 
>      
>        
>        words="stopwords.txt" enablePositionIncrements="true" />
>          maxGramSize="15" side="front"/>
>        
>         
>      
>      
>        
>        ignoreCase="true" expand="true"/>
>          maxGramSize="15" side="front"/>
>                        ignoreCase="true"
>                words="stopwords.txt"
>                enablePositionIncrements="true"
>                />
>          generateWordParts="1" generateNumberParts="1" catenateWords="1" 
> catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/>
>         
>         
>      
>    
> Data:
> Coke Product    Softdrink
> Coke Product    Softdrink
> Coke Product    Softdrink
> Coke() Product    Softdrink
> Pepsi Product    Softdrink
> Pepsi Product    Softdrink
> Other Product    Soft(drink)
> Domestic-Beer    Beer-34333
> Domestic-Beer    Beer-34333
> Domestic-Beer    Beer
> Domestic Beer    Beer
> Import Beer+9    Beer
> Import Beer    Beer
> Import Beer    Beer
> T-Shirt    Shirt
> T-Shirt    Shirt
> T-Shirt    Shirt
> T-Shirt    Shirt-34333
> T-Shirt    Shirt
> Blouse    Women's-Wear
> Blouse    Women's-Wear
> Skirt%3    Women's Wear
> Skirt    Women's Wear
> Dress    Formal Wear
> Whenever I search for Soft(, it gives me the following error, and if I try to 
> search for 34333, it gives no results
> SEVERE: org.apache.solr.common.SolrException: 
> org.apache.lucene.queryParser.ParseException: Cannot parse 
> '+DIMENSION:Product^1.0 +( (SKU:soft(*^1.0 DIM_ATTR_ONE:soft(*^1.0 DIM_ATTR_
> Was expecting one of:
>     ...
>     ...
>     ...
>    "+" ...
>    "-" ...
>    "(" ...
>    ")" ...
>    "*" ...
>    "^" ...
>     ...
>     ...
>     ...
>     ...
>    "[" ...
>    "{" ...
>     ...
>        at 
>org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:105)
>        at 
>org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:173)
>        at 
>org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
>        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1368)
>        at 
>org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
>        at 
>org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
>        at 
>org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>        at 
>org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
>        at 
>org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>        at 
>org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
>        at 
>org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
>        at 
>org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
>        at 
>org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
>        at 
>org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
>        at 
>org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
>        at org.mortbay.jetty.Server.handle(Server.java:326)
>        at 
>org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
>        at 
>org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
>        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
>        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
>        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
>        at 
>org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
>        at 
>org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
> Caused by: org.apache.lucene.queryParser.ParseException: Cannot parse 
> '+DIMENSION:Product^1.0 +( (SKU:soft(*^1.0 DIM_ATTR_ONE:soft(*^1.0 
> DIM_ATTR_TWO:soft(*^1.0))': Encountered "

RE: Use DIH with more than one entity at the same time

2012-05-17 Thread Dyer, James
The wiki here indicates that you can specify "entity" more than once on the 
request and it will run multiple entities at the same time, in the same 
handler:  http://wiki.apache.org/solr/DataImportHandler#Commands

But I can't say for sure that this actually works!  Having been in the DIH 
code, I would think such a feature is buggy at best, if it works at all.  But 
if you try it let us know how it works for you.  Also, if anyone else out there 
is using multiple "entity" parameters to get entities running in parallel, I'd 
be interested in hearing about it.

But the approach taken in the link Jack cites below does work.  It's a pain to 
set it up though.

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311

From: Jack Krupansky [mailto:j...@basetechnology.com]
Sent: Thursday, May 17, 2012 10:21 AM
To: solr-user@lucene.apache.org
Subject: Re: Use DIH with more than one entity at the same time

Okay, the answer is “Yes, sort of, but...”

“One annoyance is because of how DIH is designed, you need a separate handler 
set up in solrconfig.xml for each DIH you plan to run.  So you have to plan in 
advance how many DIH instances you want to run, which config files they'll use, 
etc.”

See:
http://lucene.472066.n3.nabble.com/Multiple-dataimport-processes-to-same-core-td3645525.html

-- Jack Krupansky

From: Sergio Martín Cantero
Sent: Thursday, May 17, 2012 11:07 AM
To: solr-user@lucene.apache.org
Cc: Jack Krupansky
Subject: Re: Use DIH with more than one entity at the same time

Thanks Jack, but that´s not what I want.

I don´t want multiple entities in one invocation, but two simultaneous 
invocations of the DIH with different entities.

Thanks.
Sergio Martín Cantero
Office (ES) +34 91 733 73 97
playence Spain SL
sergio.mar...@playence.com
Calle Vicente Gaceo 19
28029 Madrid - España




El 17/05/12 17:04, Jack Krupansky escribió:
Yes. From the doc:

"Multiple 'entity' parameters can be passed on to run multiple entities at 
once. If nothing is passed, all entities are executed."

See:
http://wiki.apache.org/solr/DataImportHandler

But that is one invocation of DIH, not two separate updates as you tried.

-- Jack Krupansky

-Original Message- From: Sergio Martín Cantero
Sent: Thursday, May 17, 2012 10:46 AM
To: solr-user@lucene.apache.org
Subject: Use DIH with more than one entity at the same time

I´m new to this list, so... Hello everybody.

I´m trying to run the DIH with more than one entity at the same time,
but only the first entity I call is being indexed. The other doesn´t get
any response.
For example:
First call:
http://localhost:8080/solr/dataimport?command=full-import&clean=false&entity=users
Before the indexing has finished, I call:
http://localhost:8080/solr/dataimport?command=full-import&clean=false&entity=products

The second call doesn´t have any effect, and the products are not
indexed at all.

Isn´t it possible to run more than one full import for different
entities at the same time?

Thanks a lot for your help
Sergio


Re: Use DIH with more than one entity at the same time

2012-05-17 Thread Jack Krupansky
Okay, the answer is “Yes, sort of, but...”

“One annoyance is because of how DIH is designed, you need a separate handler 
set up in solrconfig.xml for each DIH you plan to run.  So you have to plan in 
advance how many DIH instances you want to run, which config files they'll use, 
etc.”

See:
http://lucene.472066.n3.nabble.com/Multiple-dataimport-processes-to-same-core-td3645525.html
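
To make the "separate handler per DIH" idea concrete, here is a small Python sketch that builds one full-import URL per handler. The handler paths /dataimport-users and /dataimport-products are hypothetical; each would have to be registered as its own DataImportHandler in solrconfig.xml, as the quoted post describes. The sketch only constructs the URLs and does not send any request.

```python
from urllib.parse import urlencode

# Hypothetical handler paths, one per DIH instance configured in
# solrconfig.xml. These names are assumptions, not from the thread.
HANDLERS = {
    "users": "/dataimport-users",
    "products": "/dataimport-products",
}


def import_urls(solr_base, handlers=HANDLERS):
    """Return a full-import URL for each separately configured handler."""
    query = urlencode([("command", "full-import"), ("clean", "false")])
    return {
        name: solr_base + path + "?" + query
        for name, path in handlers.items()
    }


for name, url in import_urls("http://localhost:8080/solr").items():
    print(name, url)
```

Because each URL targets a different handler, the two imports can be requested at the same time without the second call being ignored by a handler that is already busy.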
 

-- Jack Krupansky

From: Sergio Martín Cantero 
Sent: Thursday, May 17, 2012 11:07 AM
To: solr-user@lucene.apache.org 
Cc: Jack Krupansky 
Subject: Re: Use DIH with more than one entity at the same time

Thanks Jack, but that´s not what I want.

I don´t want multiple entities in one invocation, but two simultaneous 
invocations of the DIH with different entities.

Thanks.

  
  Sergio Martín Cantero
  Office (ES) +34 91 733 73 97
  playence Spain SL
  sergio.mar...@playence.com
  Calle Vicente Gaceo 19
  28029 Madrid - España


El 17/05/12 17:04, Jack Krupansky escribió: 
  Yes. From the doc: 

  "Multiple 'entity' parameters can be passed on to run multiple entities at 
once. If nothing is passed, all entities are executed." 

  See: 
  http://wiki.apache.org/solr/DataImportHandler 

  But that is one invocation of DIH, not two separate updates as you tried. 

  -- Jack Krupansky 

  -Original Message- From: Sergio Martín Cantero 
  Sent: Thursday, May 17, 2012 10:46 AM 
  To: solr-user@lucene.apache.org 
  Subject: Use DIH with more than one entity at the same time 

  I´m new to this list, so... Hello everybody. 

  I´m trying to run the DIH with more than one entity at the same time, 
  but only the first entity I call is being indexed. The other doesn´t get 
  any response. 
  For example: 
  First call: 
  
http://localhost:8080/solr/dataimport?command=full-import&clean=false&entity=users
 
  Before the indexing has finished, I call: 
  
http://localhost:8080/solr/dataimport?command=full-import&clean=false&entity=products
 

  The second call doesn´t have any effect, and the products are not 
  indexed at all. 

  Isn´t it possible to run more than one full import for different 
  entities at the same time? 

  Thanks a lot for your help 
  Sergio 


Issue with DIH when database is down

2012-05-17 Thread Rahul Warawdekar
Hi,

I am using Solr 3.4 on Tomcat 6 and using DIH to index data from a MS SQL
Server 2008 database.

In case my database is down, or is refusing connections due to any reason,
DIH throws an exception as mentioned below

"org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to
execute query: ...

Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: Connection reset
at
com.microsoft.sqlserver.jdbc.SQLServerConnection.terminate(SQLServerConnection.java:1368)"

But when the database is up and running and the next indexing job runs, it
gives me the same error.
I need to restart Tomcat in order to successfully connect again to the
database.

My dataSource settings in data-config.xml are as follows


Has anyone come across this issue before ?
If yes, what is the resolution ?
Am I missing anything in the dataSource attributes (autoCommit=true)?
-- 
Thanks and Regards
Rahul A. Warawdekar


Re: Use DIH with more than one entity at the same time

2012-05-17 Thread Sergio Martín Cantero

  
  
Thanks Jack, but that´s not what I want.

I don´t want multiple entities in one invocation, but two
simultaneous invocations of the DIH with different entities.

Thanks.

Sergio Martín Cantero
Office (ES) +34 91 733 73 97
playence Spain SL
sergio.mar...@playence.com
Calle Vicente Gaceo 19
28029 Madrid - España

El 17/05/12 17:04, Jack Krupansky escribió:
Yes. From the doc:

"Multiple 'entity' parameters can be passed on to run multiple
entities at once. If nothing is passed, all entities are
executed."

See:
http://wiki.apache.org/solr/DataImportHandler

But that is one invocation of DIH, not two separate updates as you
tried.

-- Jack Krupansky

-Original Message-
From: Sergio Martín Cantero
Sent: Thursday, May 17, 2012 10:46 AM
To: solr-user@lucene.apache.org
Subject: Use DIH with more than one entity at the same time

I´m new to this list, so... Hello everybody.

I´m trying to run the DIH with more than one entity at the same time,
but only the first entity I call is being indexed. The other doesn´t get
any response.
For example:
First call:
http://localhost:8080/solr/dataimport?command=full-import&clean=false&entity=users
Before the indexing has finished, I call:
http://localhost:8080/solr/dataimport?command=full-import&clean=false&entity=products

The second call doesn´t have any effect, and the products are not
indexed at all.

Isn´t it possible to run more than one full import for different
entities at the same time?

Thanks a lot for your help
Sergio

  



Re: Use DIH with more than one entity at the same time

2012-05-17 Thread Jack Krupansky

Yes. From the doc:

"Multiple 'entity' parameters can be passed on to run multiple entities at 
once. If nothing is passed, all entities are executed."


See:
http://wiki.apache.org/solr/DataImportHandler

But that is one invocation of DIH, not two separate updates as you tried.
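
As an illustration of the single-invocation form, the Python sketch below builds one request URL that repeats the 'entity' parameter. The base URL and entity names are taken from the original question; the helper name dih_full_import_url is hypothetical, and whether DIH actually processes the entities in parallel is not confirmed here.

```python
from urllib.parse import urlencode


def dih_full_import_url(base_url, entities, clean=False):
    """Build a full-import URL that passes one 'entity' parameter per name."""
    params = [("command", "full-import"), ("clean", str(clean).lower())]
    params += [("entity", name) for name in entities]
    return base_url + "?" + urlencode(params)


print(dih_full_import_url(
    "http://localhost:8080/solr/dataimport", ["users", "products"]))
# prints http://localhost:8080/solr/dataimport?command=full-import&clean=false&entity=users&entity=products
```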

-- Jack Krupansky

-Original Message- 
From: Sergio Martín Cantero

Sent: Thursday, May 17, 2012 10:46 AM
To: solr-user@lucene.apache.org
Subject: Use DIH with more than one entity at the same time

I´m new to this list, so... Hello everybody.

I´m trying to run the DIH with more than one entity at the same time,
but only the first entity I call is being indexed. The other doesn´t get
any response.
For example:
First call:
http://localhost:8080/solr/dataimport?command=full-import&clean=false&entity=users
Before the indexing has finished, I call:
http://localhost:8080/solr/dataimport?command=full-import&clean=false&entity=products

The second call doesn´t have any effect, and the products are not
indexed at all.

Isn´t it possible to run more than one full import for different
entities at the same time?

Thanks a lot for your help
Sergio 



Re: highlighter not respecting sentence boundry

2012-05-17 Thread abhayd
hi
Added hl.useFastVectorHighlighter=true to query. I was already doing term
vectors.
This worked like a charm.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/highlighter-not-respecting-sentence-boundry-tp3984327p3984416.html
Sent from the Solr - User mailing list archive at Nabble.com.


Use DIH with more than one entity at the same time

2012-05-17 Thread Sergio Martín Cantero

I´m new to this list, so... Hello everybody.

I´m trying to run the DIH with more than one entity at the same time, 
but only the first entity I call is being indexed. The other doesn´t get 
any response.

For example:
First call:
http://localhost:8080/solr/dataimport?command=full-import&clean=false&entity=users
Before the indexing has finished, I call:
http://localhost:8080/solr/dataimport?command=full-import&clean=false&entity=products

The second call doesn´t have any effect, and the products are not 
indexed at all.


Isn´t it possible to run more than one full import for different 
entities at the same time?


Thanks a lot for your help
Sergio


RE: Issue in Applying patch file

2012-05-17 Thread Dyer, James
Recently Lucene/Solr went to a new build process using Ivy.  Simply put, 
dependent .jar files are no longer checked in with Lucene/Solr sources.  
Instead, while building, Ivy now downloads them from 'repo1.maven.org'.  From the 
error you sent, it seems like you do not have access to the Maven repository.  
Possibly your internet connection was down or you're behind a proxy that 
doesn't allow it? 

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311


-Original Message-
From: mechravi25 [mailto:mechrav...@yahoo.co.in] 
Sent: Thursday, May 17, 2012 7:23 AM
To: solr-user@lucene.apache.org
Subject: RE: Issue in Applying patch file

Hi James,

Thank you for your reply.
That issue got resolved; but now, when I'm trying to build Solr using the "ant
dist" command, it's resulting in the following error.


[ivy:retrieve] :: resolving dependencies ::
org.apache.lucene#analyzers-phonetic;working@XXXYYN
[ivy:retrieve]  confs: [default]
[ivy:retrieve]  validate = true
[ivy:retrieve]  refresh = false
[ivy:retrieve] resolving dependencies for configuration 'default'
[ivy:retrieve] == resolving dependencies for
org.apache.lucene#analyzers-phonetic;working@XXXYYN [default]
[ivy:retrieve] == resolving dependencies
org.apache.lucene#analyzers-phonetic;working@XXXYYN->commons-codec#commons-codec;1.6
[default->*]
[ivy:retrieve] default: Checking cache for: dependency:
commons-codec#commons-codec;1.6 {*=[*]}
[ivy:retrieve] don't use cache for commons-codec#commons-codec;1.6:
checkModified=true
[ivy:retrieve]  tried C:\Documents and
Settings\.ivy2\local\commons-codec\commons-codec\1.6\ivys\ivy.xml
[ivy:retrieve]  tried C:\Documents and
Settings\.ivy2\local\commons-codec\commons-codec\1.6\jars\commons-codec.jar
[ivy:retrieve]  local: no ivy file nor artifact found for
commons-codec#commons-codec;1.6
[ivy:retrieve] main: Checking cache for: dependency:
commons-codec#commons-codec;1.6 {*=[*]}
[ivy:retrieve]  tried C:\Documents and
Settings\.ivy2\shared\commons-codec\commons-codec\1.6\ivys\ivy.xml
[ivy:retrieve]  tried C:\Documents and
Settings\.ivy2\shared\commons-codec\commons-codec\1.6\jars\commons-codec.jar
[ivy:retrieve]  shared: no ivy file nor artifact found for
commons-codec#commons-codec;1.6
[ivy:retrieve]  tried
http://repo1.maven.org/maven2/commons-codec/commons-codec/1.6/commons-codec-1.6.pom
[ivy:retrieve] WARN: Host repo1.maven.org not found.
url=http://repo1.maven.org/maven2/commons-codec/commons-codec/1.6/commons-codec-1.6.pom
[ivy:retrieve] You probably access the destination server through a proxy
server that is not well configured.
[ivy:retrieve]  tried
http://repo1.maven.org/maven2/commons-codec/commons-codec/1.6/commons-codec-1.6.jar
[ivy:retrieve] WARN: Host repo1.maven.org not found.
url=http://repo1.maven.org/maven2/commons-codec/commons-codec/1.6/commons-codec-1.6.jar
[ivy:retrieve] You probably access the destination server through a proxy
server that is not well configured.
[ivy:retrieve]  public: no ivy file nor artifact found for
commons-codec#commons-codec;1.6
[ivy:retrieve] WARN:module not found: commons-codec#commons-codec;1.6
[ivy:retrieve] WARN:  local: tried
[ivy:retrieve] WARN:   C:\Documents and
Settings\.ivy2\local\commons-codec\commons-codec\1.6\ivys\ivy.xml
[ivy:retrieve] WARN:   -- artifact
commons-codec#commons-codec;1.6!commons-codec.jar:
[ivy:retrieve] WARN:   C:\Documents and
Settings\.ivy2\local\commons-codec\commons-codec\1.6\jars\commons-codec.jar
[ivy:retrieve] WARN:  shared: tried
[ivy:retrieve] WARN:   C:\Documents and
Settings\.ivy2\shared\commons-codec\commons-codec\1.6\ivys\ivy.xml
[ivy:retrieve] WARN:   -- artifact
commons-codec#commons-codec;1.6!commons-codec.jar:
[ivy:retrieve] WARN:   C:\Documents and
Settings\.ivy2\shared\commons-codec\commons-codec\1.6\jars\commons-codec.jar
[ivy:retrieve] WARN:  public: tried
[ivy:retrieve] WARN:  
http://repo1.maven.org/maven2/commons-codec/commons-codec/1.6/commons-codec-1.6.pom
[ivy:retrieve] WARN:   -- artifact
commons-codec#commons-codec;1.6!commons-codec.jar:
[ivy:retrieve] WARN:  
http://repo1.maven.org/maven2/commons-codec/commons-codec/1.6/commons-codec-1.6.jar
[ivy:retrieve]  resolved ivy file produced in C:\Documents and
Settings\.ivy2\cache\resolved-org.apache.lucene-analyzers-phonetic-work...@xxxyyn.xml
[ivy:retrieve] :: downloading artifacts ::
[ivy:retrieve] :: resolution report :: resolve 22250ms :: artifacts dl 0ms
[ivy:retrieve] WARN:::
[ivy:retrieve] WARN:::  UNRESOLVED DEPENDENCIES ::
[ivy:retrieve] WARN:::
[ivy:retrieve] WARN::: commons-codec#commons-codec;1.6: not found
[ivy:retrieve] WARN:::
[ivy:retrieve]  report for
org.apache.lucene#analyzers-phonetic;working@XXXYYN default produced in
C:\Documents and
Settings\.ivy2\cache\org.apache.lucene-analyzers-phonetic-default.

Re: Solr string field stripping new lines & line breaks

2012-05-17 Thread jacousteau
Thank you, but it turns out I had simply forgotten to reload core0 after I
changed the field type. Oops.

On Thu, May 17, 2012 at 3:52 PM, iorixxx [via Lucene] <
ml-node+s472066n3984405...@n3.nabble.com> wrote:

> > Hi, is there any way to preserve
> > newlines or line breaks when submitting
> > content to a Solr string field?
>
> String is indexed verbatim. Are you using wt=xml in a browser? Try using
> wt=php
>


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-string-field-stripping-new-lines-line-breaks-tp3984384p3984407.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Indexing & Searching MySQL table with Hindi and English data

2012-05-17 Thread Ahmet Arslan
> A search with a keyword in Hindi retrieves an empty result
> set. Also, a
> retrieved Hindi record displays junk characters.

Could it be the URIEncoding setting of your servlet container?
http://wiki.apache.org/solr/SolrTomcat#URI_Charset_Config


Re: Solr string field stripping new lines & line breaks

2012-05-17 Thread Ahmet Arslan
> Hi, is there any way to preserve
> newlines or line breaks when submitting
> content to a Solr string field?

A string field is indexed verbatim. Are you viewing wt=xml output in a browser? The browser may be collapsing the whitespace when it renders the XML; try wt=php instead.


Solr string field stripping new lines & line breaks

2012-05-17 Thread jacousteau
Hi, is there any way to preserve newlines or line breaks when submitting
content to a Solr string field?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-string-field-stripping-new-lines-line-breaks-tp3984384.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: indexing Dublin core xml files

2012-05-17 Thread Jack Krupansky
You will first have to map your XML files into the Solr XML update format, 
and you will have to do that yourself, outside of Solr. At the same time, you 
should map any DCMI metadata element names to the corresponding Solr field 
names, such as "dc:title" to "title". A number of the DC field names are 
already in the Solr example schema (example/solr/conf/schema.xml). If you 
need any that aren't, you will have to add them yourself. The only tricky 
part is deciding which fields are "text" vs. "string" or numeric, and which 
need to be multi-valued. In some cases you may want to store some metadata as 
both string and text, so that the string field can be used for faceting while 
the text field can be searched by keyword.


For the Solr xml format, see:
http://wiki.apache.org/solr/UpdateXmlMessages
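
As a rough illustration of that mapping step, here is a minimal Python sketch 
that turns one Dublin Core record into a Solr <add> message. The FIELD_MAP 
table, the sample record and the function name are assumptions made up for 
this example; adjust the mapping to whatever field names your own schema.xml 
actually defines.

```python
import xml.etree.ElementTree as ET

# Namespace of the simple Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical mapping from DC element names to Solr field names.
# These targets are assumptions; they must exist in your schema.xml.
FIELD_MAP = {
    "identifier": "id",
    "title": "title",
    "creator": "author",
    "subject": "subject",
    "description": "description",
}

# Made-up sample record, only for demonstration.
SAMPLE_RECORD = (
    '<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<dc:identifier>doc-1</dc:identifier>"
    "<dc:title>Solr Basics</dc:title>"
    "<dc:creator>Smith, John</dc:creator>"
    "<dc:creator>Jones, Joe</dc:creator>"
    "<dc:rights>unmapped, so it is skipped</dc:rights>"
    "</metadata>"
)


def dc_to_solr_add(dc_xml):
    """Convert one Dublin Core record into a Solr <add><doc> XML string."""
    root = ET.fromstring(dc_xml)
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    prefix = "{" + DC_NS + "}"
    for elem in root.iter():
        if not elem.tag.startswith(prefix):
            continue  # ignore wrapper elements and other namespaces
        solr_name = FIELD_MAP.get(elem.tag[len(prefix):])
        text = (elem.text or "").strip()
        if solr_name is None or not text:
            continue  # unmapped or empty elements are dropped
        # Repeated DC elements become repeated fields (multi-valued in Solr).
        field = ET.SubElement(doc, "field", name=solr_name)
        field.text = text
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    print(dc_to_solr_add(SAMPLE_RECORD))
```

The resulting string can then be posted to the update handler like any other 
Solr XML message. Repeated elements such as dc:creator only work if the target 
field is declared multi-valued.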

-- Jack Krupansky

-Original Message- 
From: Guys

Sent: Wednesday, May 16, 2012 6:31 AM
To: solr-user@lucene.apache.org
Subject: indexing Dublin core xml files

Hello, I'd like to index XML files in the Dublin Core format in Solr. I'd
like to know which files I should modify and how. Thank you :)

--
View this message in context: 
http://lucene.472066.n3.nabble.com/indexing-Dublin-core-xml-files-tp3984060.html
Sent from the Solr - User mailing list archive at Nabble.com. 



RE: Issue in Applying patch file

2012-05-17 Thread mechravi25
Hi James,

Thank you for your reply.
That issue is resolved, but now, when I try to build Solr using the "ant
dist" command, it results in the following error.


[ivy:retrieve] :: resolving dependencies ::
org.apache.lucene#analyzers-phonetic;working@XXXYYN
[ivy:retrieve]  confs: [default]
[ivy:retrieve]  validate = true
[ivy:retrieve]  refresh = false
[ivy:retrieve] resolving dependencies for configuration 'default'
[ivy:retrieve] == resolving dependencies for
org.apache.lucene#analyzers-phonetic;working@XXXYYN [default]
[ivy:retrieve] == resolving dependencies
org.apache.lucene#analyzers-phonetic;working@XXXYYN->commons-codec#commons-codec;1.6
[default->*]
[ivy:retrieve] default: Checking cache for: dependency:
commons-codec#commons-codec;1.6 {*=[*]}
[ivy:retrieve] don't use cache for commons-codec#commons-codec;1.6:
checkModified=true
[ivy:retrieve]  tried C:\Documents and
Settings\.ivy2\local\commons-codec\commons-codec\1.6\ivys\ivy.xml
[ivy:retrieve]  tried C:\Documents and
Settings\.ivy2\local\commons-codec\commons-codec\1.6\jars\commons-codec.jar
[ivy:retrieve]  local: no ivy file nor artifact found for
commons-codec#commons-codec;1.6
[ivy:retrieve] main: Checking cache for: dependency:
commons-codec#commons-codec;1.6 {*=[*]}
[ivy:retrieve]  tried C:\Documents and
Settings\.ivy2\shared\commons-codec\commons-codec\1.6\ivys\ivy.xml
[ivy:retrieve]  tried C:\Documents and
Settings\.ivy2\shared\commons-codec\commons-codec\1.6\jars\commons-codec.jar
[ivy:retrieve]  shared: no ivy file nor artifact found for
commons-codec#commons-codec;1.6
[ivy:retrieve]  tried
http://repo1.maven.org/maven2/commons-codec/commons-codec/1.6/commons-codec-1.6.pom
[ivy:retrieve] WARN: Host repo1.maven.org not found.
url=http://repo1.maven.org/maven2/commons-codec/commons-codec/1.6/commons-codec-1.6.pom
[ivy:retrieve] You probably access the destination server through a proxy
server that is not well configured.
[ivy:retrieve]  tried
http://repo1.maven.org/maven2/commons-codec/commons-codec/1.6/commons-codec-1.6.jar
[ivy:retrieve] WARN: Host repo1.maven.org not found.
url=http://repo1.maven.org/maven2/commons-codec/commons-codec/1.6/commons-codec-1.6.jar
[ivy:retrieve] You probably access the destination server through a proxy
server that is not well configured.
[ivy:retrieve]  public: no ivy file nor artifact found for
commons-codec#commons-codec;1.6
[ivy:retrieve] WARN:module not found: commons-codec#commons-codec;1.6
[ivy:retrieve] WARN:  local: tried
[ivy:retrieve] WARN:   C:\Documents and
Settings\.ivy2\local\commons-codec\commons-codec\1.6\ivys\ivy.xml
[ivy:retrieve] WARN:   -- artifact
commons-codec#commons-codec;1.6!commons-codec.jar:
[ivy:retrieve] WARN:   C:\Documents and
Settings\.ivy2\local\commons-codec\commons-codec\1.6\jars\commons-codec.jar
[ivy:retrieve] WARN:  shared: tried
[ivy:retrieve] WARN:   C:\Documents and
Settings\.ivy2\shared\commons-codec\commons-codec\1.6\ivys\ivy.xml
[ivy:retrieve] WARN:   -- artifact
commons-codec#commons-codec;1.6!commons-codec.jar:
[ivy:retrieve] WARN:   C:\Documents and
Settings\.ivy2\shared\commons-codec\commons-codec\1.6\jars\commons-codec.jar
[ivy:retrieve] WARN:  public: tried
[ivy:retrieve] WARN:  
http://repo1.maven.org/maven2/commons-codec/commons-codec/1.6/commons-codec-1.6.pom
[ivy:retrieve] WARN:   -- artifact
commons-codec#commons-codec;1.6!commons-codec.jar:
[ivy:retrieve] WARN:  
http://repo1.maven.org/maven2/commons-codec/commons-codec/1.6/commons-codec-1.6.jar
[ivy:retrieve]  resolved ivy file produced in C:\Documents and
Settings\.ivy2\cache\resolved-org.apache.lucene-analyzers-phonetic-work...@xxxyyn.xml
[ivy:retrieve] :: downloading artifacts ::
[ivy:retrieve] :: resolution report :: resolve 22250ms :: artifacts dl 0ms
[ivy:retrieve] WARN:::
[ivy:retrieve] WARN:::  UNRESOLVED DEPENDENCIES ::
[ivy:retrieve] WARN:::
[ivy:retrieve] WARN::: commons-codec#commons-codec;1.6: not found
[ivy:retrieve] WARN:::
[ivy:retrieve]  report for
org.apache.lucene#analyzers-phonetic;working@XXXYYN default produced in
C:\Documents and
Settings\.ivy2\cache\org.apache.lucene-analyzers-phonetic-default.xml
[ivy:retrieve]  resolve done (22250ms resolve - 0ms download)
[ivy:retrieve]
[ivy:retrieve] :: problems summary ::
[ivy:retrieve]  WARNINGS
[ivy:retrieve]  Host repo1.maven.org not found.
url=http://repo1.maven.org/maven2/commons-codec/commons-codec/1.6/commons-codec-1.6.pom
[ivy:retrieve]  Host repo1.maven.org not found.
url=http://repo1.maven.org/maven2/commons-codec/commons-codec/1.6/commons-codec-1.6.jar
[ivy:retrieve]  module not found: commons-codec#commons-codec;1.6
[ivy:retrieve]   local: tried
[ivy:retrieve]C:\Documents and
Settings\.ivy2\local\commons-codec\commons-codec\1.6\ivys\ivy.xml
[ivy:retrieve]-- artifact

RE: Issue in Applying patch file

2012-05-17 Thread mechravi25
Hi,

Thank you for your reply.
That error was resolved, but now I'm not able to build the Solr project using
"ant dist" to generate the war file. It results in the following error.

   
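---------------------------------------------------------------------
|                  |            modules            ||   artifacts   |
|       conf       | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
|      default     |   2   |   0   |   0   |   0   ||   0   |   0   |
---------------------------------------------------------------------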
[ivy:retrieve] :: resolving dependencies ::
org.apache.lucene#analyzers-phonetic;working@XXXYYN
[ivy:retrieve]  confs: [default]
[ivy:retrieve]  validate = true
[ivy:retrieve]  refresh = false
[ivy:retrieve] resolving dependencies for configuration 'default'
[ivy:retrieve] == resolving dependencies for
org.apache.lucene#analyzers-phonetic;working@XXXYYN [default]
[ivy:retrieve] == resolving dependencies
org.apache.lucene#analyzers-phonetic;working@XXXYYN->commons-codec#commons-codec;1.6
[default->*]
[ivy:retrieve] default: Checking cache for: dependency:
commons-codec#commons-codec;1.6 {*=[*]}
[ivy:retrieve] don't use cache for commons-codec#commons-codec;1.6:
checkModified=true
[ivy:retrieve]  tried C:\Documents and
Settings\.ivy2\local\commons-codec\commons-codec\1.6\ivys\ivy.xml
[ivy:retrieve]  tried C:\Documents and
Settings\.ivy2\local\commons-codec\commons-codec\1.6\jars\commons-codec.jar
[ivy:retrieve]  local: no ivy file nor artifact found for
commons-codec#commons-codec;1.6
[ivy:retrieve] main: Checking cache for: dependency:
commons-codec#commons-codec;1.6 {*=[*]}
[ivy:retrieve]  tried C:\Documents and
Settings\.ivy2\shared\commons-codec\commons-codec\1.6\ivys\ivy.xml
[ivy:retrieve]  tried C:\Documents and
Settings\.ivy2\shared\commons-codec\commons-codec\1.6\jars\commons-codec.jar
[ivy:retrieve]  shared: no ivy file nor artifact found for
commons-codec#commons-codec;1.6
[ivy:retrieve]  tried
http://repo1.maven.org/maven2/commons-codec/commons-codec/1.6/commons-codec-1.6.pom
[ivy:retrieve] WARN: Host repo1.maven.org not found.
url=http://repo1.maven.org/maven2/commons-codec/commons-codec/1.6/commons-codec-1.6.pom
[ivy:retrieve] You probably access the destination server through a proxy
server that is not well configured.
[ivy:retrieve]  tried
http://repo1.maven.org/maven2/commons-codec/commons-codec/1.6/commons-codec-1.6.jar
[ivy:retrieve] WARN: Host repo1.maven.org not found.
url=http://repo1.maven.org/maven2/commons-codec/commons-codec/1.6/commons-codec-1.6.jar
[ivy:retrieve] You probably access the destination server through a proxy
server that is not well configured.
[ivy:retrieve]  public: no ivy file nor artifact found for
commons-codec#commons-codec;1.6
[ivy:retrieve] WARN:module not found: commons-codec#commons-codec;1.6
[ivy:retrieve] WARN:  local: tried
[ivy:retrieve] WARN:   C:\Documents and
Settings\.ivy2\local\commons-codec\commons-codec\1.6\ivys\ivy.xml
[ivy:retrieve] WARN:   -- artifact
commons-codec#commons-codec;1.6!commons-codec.jar:
[ivy:retrieve] WARN:   C:\Documents and
Settings\.ivy2\local\commons-codec\commons-codec\1.6\jars\commons-codec.jar
[ivy:retrieve] WARN:  shared: tried
[ivy:retrieve] WARN:   C:\Documents and
Settings\.ivy2\shared\commons-codec\commons-codec\1.6\ivys\ivy.xml
[ivy:retrieve] WARN:   -- artifact
commons-codec#commons-codec;1.6!commons-codec.jar:
[ivy:retrieve] WARN:   C:\Documents and
Settings\.ivy2\shared\commons-codec\commons-codec\1.6\jars\commons-codec.jar
[ivy:retrieve] WARN:  public: tried
[ivy:retrieve] WARN:  
http://repo1.maven.org/maven2/commons-codec/commons-codec/1.6/commons-codec-1.6.pom
[ivy:retrieve] WARN:   -- artifact
commons-codec#commons-codec;1.6!commons-codec.jar:
[ivy:retrieve] WARN:  
http://repo1.maven.org/maven2/commons-codec/commons-codec/1.6/commons-codec-1.6.jar
[ivy:retrieve]  resolved ivy file produced in C:\Documents and
Settings\.ivy2\cache\resolved-org.apache.lucene-analyzers-phonetic-work...@xxxyyn.xml
[ivy:retrieve] :: downloading artifacts ::
[ivy:retrieve] :: resolution report :: resolve 22250ms :: artifacts dl 0ms
[ivy:retrieve] WARN:::
[ivy:retrieve] WARN:::  UNRESOLVED DEPENDENCIES ::
[ivy:retrieve] WARN:::
[ivy:retrieve] WARN::: commons-codec#commons-codec;1.6: not found
[ivy:retrieve] WARN:::
[ivy:retrieve]  report for
org.apache.lucene#analyzers-phonetic;working@XXXYYN default produced in
C:\Documents and
Settings\.ivy2\cache\org.apache.lucene-analyzers-phonetic-default.xml
[ivy:retrieve]  resolve done (22250ms resolve - 0ms download)
[ivy:retrieve]
[ivy:retrieve] :: problems summary ::
[ivy:retrieve]  WARNINGS
[ivy:retrieve]  Host repo

Spellcheck suggestions unavailable during rebuild

2012-05-17 Thread Andrei Amariei

Hello,
I am using Solr 3.5.0 with an IndexBasedSpellChecker configured, and I 
noticed that suggestions are not available during a rebuild.
After looking at the source code, I saw that 
IndexBasedSpellChecker.build(...) calls spellchecker.clearIndex() before 
spellchecker.indexDirectory(...), and I think this is the reason for the 
unavailability.
If that's the case, is it possible to configure or patch it somehow, so 
that the old suggestions stay available until the rebuild is over?


RE: Workaround needed to sort on Multivalued fields indexed in SOLR

2012-05-17 Thread Bob Sandiford
How are you hoping that Sort will work on a multivalued field?  Normally, 
trying to do this makes no sense.

For example, if you have two authors for a document:
Smith, John
Jones, Joe

Then would you expect the document to sort under 'S' for Smith, or 'J' for 
Jones?  There's probably not a specific rule to choose one or the other, at 
least not in a generic sense.

If you wanted (for example) to be able to sort by the first author, then you 
could index just the first author in a separate, non-multivalued field, purely 
for the sort (while still keeping all the authors in your multivalued field).
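
A minimal Python sketch of that idea, done on the client side before the 
documents are sent to Solr. The field name author_sort and the helper name 
are assumptions for illustration; the sort field would have to be declared 
as single-valued in your schema.

```python
def add_sort_field(doc, source="author", target="author_sort"):
    """Return a copy of doc with a single-valued sort field.

    The sort field holds only the first value of the multivalued source
    field. Both field names are hypothetical and must match your schema.
    """
    values = doc.get(source) or []
    prepared = dict(doc)
    if values:
        prepared[target] = values[0]
    return prepared


# Made-up documents, only for demonstration.
docs = [
    {"id": "1", "author": ["Smith, John", "Jones, Joe"]},
    {"id": "2", "author": ["Adams, Ann"]},
]

prepared_docs = [add_sort_field(d) for d in docs]

# Local stand-in for what &sort=author_sort asc would do on the server.
ordered_ids = [d["id"] for d in sorted(prepared_docs, key=lambda d: d["author_sort"])]

if __name__ == "__main__":
    print(ordered_ids)
```

Document 1 sorts under "Smith" here, because only the first author ends up 
in the sort field; the full author list stays in the multivalued field for 
searching and display.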

Bob Sandiford | Lead Software Engineer | SirsiDynix
P: 800.288.8020 X6943 | bob.sandif...@sirsidynix.com
www.sirsidynix.com
 


-Original Message-
From: Anupam Bhattacharya [mailto:anupam...@gmail.com] 
Sent: Thursday, May 17, 2012 1:13 AM
To: solr-user@lucene.apache.org
Subject: Workaround needed to sort on Multivalued fields indexed in SOLR

I have indexed many documents that have a multivalued field for authors.



How can I sort (order by) on this kind of multivalued field? Please suggest a 
workaround.

Thanks
Anupam



Re: Sorting fields of text_general fieldType

2012-05-17 Thread Ahmet Arslan
> The title sort behaves strangely because the Solr server
> orders title
> strings by case. If we sort in
> ascending order, titles starting with a digit show up first, then
> titles starting with an
> uppercase letter in alphabetical order, and after
> that the titles
> starting with a lowercase letter.
> 
> The title field is indexed as text_general fieldtype.
> 
>  stored="true"/>

Please see Otis' response http://search-lucene.com/m/uDxTF1scW0d2

Simply create an additional field named title_sortable with the following type 

 

  



  


Populate it via a copyField directive:

  

Then sort with &sort=title_sortable asc
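
The ordering described in the quoted question (digits, then uppercase, then 
lowercase) is what a plain character-by-character comparison produces. The 
Python sketch below is not Solr code; it only illustrates, on made-up titles, 
why sorting on a lowercased copy of the whole title gives the expected 
case-insensitive order.

```python
# Made-up titles, only for demonstration.
titles = [
    "banana split",
    "Zebra Crossing",
    "apple pie",
    "10 Downing Street",
    "Apple Tart",
]

# Plain comparison: digits sort before uppercase, uppercase before lowercase.
case_sensitive_order = sorted(titles)

# Comparing lowercased keys mimics sorting on a lowercased, untokenized
# copy of the title (the role title_sortable plays above).
case_insensitive_order = sorted(titles, key=str.lower)

if __name__ == "__main__":
    print(case_sensitive_order)
    print(case_insensitive_order)
```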




Re: highlighter not respecting sentence boundry

2012-05-17 Thread Ahmet Arslan
> I also tried boundary scanner 
> &q=iphone&hl.boundaryScanner=simple&hl.fragsize=200&hl.fragmenter=regex&hl.fl=body

The hl.boundaryScanner parameter applies to FastVectorHighlighter only.

To activate it, add &hl.useFastVectorHighlighter=true to the request.

"FastVectorHighlighter requires the field is termVectors=on, termPositions=on 
and termOffsets=on."  

http://wiki.apache.org/solr/HighlightingParameters#hl.useFastVectorHighlighter
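
For reference, a small Python sketch that assembles such a request. The host, 
port and handler path are assumptions for a local example install, and the 
boundary scanner value is simply carried over from the quoted query; it has 
to match a boundary scanner that your solrconfig.xml actually provides.

```python
from urllib.parse import urlencode

# Hypothetical local Solr endpoint; replace with your own.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"

params = {
    "q": "iphone",
    "hl": "true",
    "hl.fl": "body",
    "hl.fragsize": "200",
    # Switches highlighting to FastVectorHighlighter, which is what makes
    # hl.boundaryScanner take effect. The body field must be indexed with
    # termVectors, termPositions and termOffsets enabled.
    "hl.useFastVectorHighlighter": "true",
    # Value taken from the quoted query (assumption: a scanner with this
    # name is available in your configuration).
    "hl.boundaryScanner": "simple",
}

request_url = SOLR_SELECT_URL + "?" + urlencode(params)

if __name__ == "__main__":
    print(request_url)
```

Changing termVectors, termPositions or termOffsets on a field only affects 
documents indexed after the change, so existing documents need to be 
reindexed.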