SolrCloud and Optimize

2012-09-12 Thread Nikhil Chhaochharia
Hi,

I am using a recent nightly of Solr 4 and have setup a simple SolrCloud cluster 
of 2 shards without any replicas.  If I send the 'optimize' command, then it is 
executed on the shards one-by-one instead of in parallel.

Is this by design? How can I run optimize in parallel on all the shards?

Thanks,
Nikhil



Re: [Solr4 beta] error 503 on commit

2012-09-12 Thread Radim Kolar
 After investigating more, here is the tomcat log herebelow. It is 
indeed the same problem: exceeded limit of maxWarmingSearchers=2,.


Couldn't Solr close the oldest warming searcher and replace it with a new 
one?


Re: [Solr4 beta] error 503 on commit

2012-09-12 Thread Yonik Seeley
On Tue, Sep 11, 2012 at 10:52 AM, Radim Kolar h...@filez.com wrote:
 After investigating more, here is the tomcat log herebelow. It is indeed
 the same problem: exceeded limit of maxWarmingSearchers=2,.

 could not be solr able to close oldest warming searcher and replace it by
 new one?

That approach can easily lead to starvation (i.e. you never get a new
searcher usable for queries).

-Yonik
http://lucidworks.com


Re: Semantic document format... standards?

2012-09-12 Thread Alexandre Rafalovitch
Otis,

If you are doing Named Entity Recognition, you may want to look at the
research area concerned with Named Entity Recognition. :-) In general,
there is inline markup and standoff markup. You seem to be going for
standoff/stand-alone markup. I am not clear though whether it is just
'discovery' format or actual annotation format (with reference to
where in the sentence it is with offsets or token ids).

UIMA (which Solr integrates with already, right?) does NER, so it must
be using some sort of format.

Also, TREC is one of the competitions and they provide marked-up
datasets you might be able to learn something from:
http://ilps.science.uva.nl/trec-entity/

If you are not sure where to start with NER, you can look at my
collection of papers, though most of them are probably too specific:
http://www.citeulike.org/user/arafalov

Finally, if you have to deal with overlapping entities, there was an
article about a month ago about some sort of general format. I can't seem
to find the article right now, but I could try digging if you are
still stuck.

Regards,
Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)


On Tue, Sep 11, 2012 at 11:51 AM, Otis Gospodnetic
otis_gospodne...@yahoo.com wrote:
 Hello,

 If I'm extracting named entities, topics, key phrases/tags, etc. from 
 documents and I want to have a representation of this document, what format 
 should I use? Are there any standard or at least common formats or approaches 
 people use in such situations?

 For example, the most straight forward format might be something like this:


 <document>
   <title>doc title</title>
   <keywords>meta keywords coming from the web page</keywords>
   <content>page meat</content>
   <entities>name entities recognized in the document</entities>
   <topics>topics extracted by the annotator</topics>
   <tags>tags extracted by the annotator</tags>
   <relations>relations extracted by the annotator</relations>
 </document>

 But this is a made up format - the XML tags above are just what somebody 
 happened to pick.

 Are there any standard or at least common formats for this?


 Thanks,
 Otis
 
 Performance Monitoring - Solr - ElasticSearch - HBase - 
 http://sematext.com/spm

 Search Analytics - http://sematext.com/search-analytics/index.html


Re: suggester issues

2012-09-12 Thread aniljayanti
Hi,

 I'm also facing the same issue while using the suggester (working in C#.NET). 
Below is my configuration.

suggest/?q=michael ja
---
<fieldType name="edgytext" class="solr.TextField" positionIncrementGap="100"
    omitNorms="true">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
        maxGramSize="15" side="front" />
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
</fieldType>

<field name="empname" type="edgytext" indexed="true" stored="true"
    omitNorms="true" omitTermFreqAndPositions="true" />

<field name="autocomplete_text" type="edgytext" indexed="true"
    stored="false" multiValued="true" omitNorms="true"
    omitTermFreqAndPositions="false" />

<copyField source="empname" dest="autocomplete_text"/>

Response :

<?xml version="1.0" encoding="UTF-8" ?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1</int>
  </lst>
  <result name="response" numFound="0" start="0" />
  <lst name="spellcheck">
    <lst name="suggestions">
      <lst name="michael">
        <int name="numFound">10</int>
        <int name="startOffset">1</int>
        <int name="endOffset">8</int>
        <arr name="suggestion">
          <str>michael bully herbig</str>
          <str>michael bolton</str>
          <str>michael bolton: arias</str>
          <str>michael falch</str>
          <str>michael holm</str>
          <str>michael jackson</str>
          <str>michael neale</str>
          <str>michael penn</str>
          <str>michael salgado</str>
          <str>michael w. smith</str>
        </arr>
      </lst>
      <lst name="ja">
        <int name="numFound">10</int>
        <int name="startOffset">9</int>
        <int name="endOffset">11</int>
        <arr name="suggestion">
          <str>ja me tanssimme</str>
          <str>jacob andersen</str>
          <str>jacob haugaard</str>
          <str>jagged edge</str>
          <str>jaguares</str>
          <str>jamiroquai</str>
          <str>jamppa tuominen</str>
          <str>jane olivor</str>
          <str>janis joplin</str>
          <str>janne tulkki</str>
        </arr>
      </lst>
      <str name="collation">michael bully herbig ja me tanssimme</str>
    </lst>
  </lst>
</response>

Please Help,

AnilHayanti 



--
View this message in context: 
http://lucene.472066.n3.nabble.com/suggester-issues-tp3262718p4007205.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Partial search

2012-09-12 Thread Jack Krupansky
Add debugQuery=true to your query request and look at the explain 
section. The scores will indicate why a document ranks as it does.
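
For example (a hypothetical request against the stock example core; the host 
name and field list are placeholders):

http://localhost:8983/solr/select?q=Energy+Field&debugQuery=true&fl=id,score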


When you say that your query was Energy Field, was that a quoted phrase or 
just two keywords? I assume the latter. I also assume that you were using 
the OR operator as default (not AND). Is that the case? Are you 
filtering out stop words at index time?


I tried your three test docs on the Solr 4.0-BETA example schema (putting 
the doc text in the features_en dynamic field) and your query actually 
reorders the three docs as expected, doc3, doc2, doc1.


What release of Solr are you using?

There is probably additional info you are not telling us. See if you can 
reproduce the scenario using only the stock Solr example schema. And if you 
have to make changes, tell us what they are.


-- Jack Krupansky

-Original Message- 
From: Mani

Sent: Tuesday, September 11, 2012 8:29 PM
To: solr-user@lucene.apache.org
Subject: Partial search

I have three documents with the following search field (text_en type) 
values.


When I search for Energy Field, I am getting the documents in the order
presented below. However, if you look at the matches, I would expect Doc3 to
come first and Doc1 to be last.


Doc1 : Automic Energy and Peace
Doc2 : Energy One Energy Two Energy Three Energy Four
Doc3 : Mathematic Field Energy Field

What is the best way to configure my search to accommodate as many term
matches as possible?







--
View this message in context: 
http://lucene.472066.n3.nabble.com/Partial-search-tp4007097.html
Sent from the Solr - User mailing list archive at Nabble.com. 



Re: SolrCloud fail over

2012-09-12 Thread Mark Miller
Either setup a load balancer, or use the SolrCloud solrj client
CloudSolrServer - it takes a comma separated list of zk servers rather
than a solr url.
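
A minimal SolrJ sketch (assuming Solr 4.x SolrJ on the classpath; the ZooKeeper 
hosts and collection name are placeholders):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class CloudQuery {
    public static void main(String[] args) throws Exception {
        // Comma-separated ZooKeeper hosts, not a Solr node URL.
        CloudSolrServer server = new CloudSolrServer("zk1:2181,zk2:2181,zk3:2181");
        server.setDefaultCollection("collection1");
        // The client routes requests to live nodes from the cluster state,
        // so queries keep working if one Solr node goes down.
        QueryResponse rsp = server.query(new SolrQuery("*:*"));
        System.out.println("numFound: " + rsp.getResults().getNumFound());
        server.shutdown();
    }
}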

On Tue, Sep 11, 2012 at 10:17 PM, andy yhl...@sohu.com wrote:
 I know failover is available in Solr 4.0 right now: if one server crashes,
 other servers still support queries. I set up a SolrCloud like this
 http://lucene.472066.n3.nabble.com/file/n4007117/Selection_028.png

 I use http://localhost:8983/solr/collection1/select?q=*%3A*&wt=xml for query
 at first; if the node on 8983 crashes, I have to access other nodes for query,
 like http://localhost:8900/solr/collection1/select?q=*%3A*&wt=xml

 But I use the node's URL in SolrJ, so how do I change the request URL
 dynamically? Does SolrCloud support something like a virtual IP address? For
 example, could I use the URL http://collections1 in SolrJ and have the request
 forwarded to an available URL automatically?




 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/SolrCloud-fail-over-tp4007117.html
 Sent from the Solr - User mailing list archive at Nabble.com.



-- 
- Mark


Doubts in PathHierarchyTokenizer

2012-09-12 Thread mechravi25
Hi,

I'm using Solr version 3.6.1 and I have a field which has values like

A|B|C
B|C|D|EE
A|C|B 
A|B|D
..etc..

So, When I search for A|B, I should get documents starting with 
A and A|B

To implement this, I've used PathHierarchyTokenizer for the above field as


<fieldType name="filep" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.PathHierarchyTokenizerFactory" pattern="|"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory" />
  </analyzer>
</fieldType>

But when I use the Solr analysis page to check whether it's being split on the
pipe symbol (|) at index time, I see that it's being taken as the entire
token and is not getting split on the delimiter (i.e. the search is done
only for A|B in the above case).

I also tried using \| as the delimiter, but that is not working either.

Am I missing anything here? Or will the PathHierarchyTokenizer not accept the
pipe symbol (|) as a delimiter?
Can anyone guide me on this?

Thanks a lot



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Doubts-in-PathHierarchyTokenizer-tp4007216.html
Sent from the Solr - User mailing list archive at Nabble.com.


Retrieval of large number of documents

2012-09-12 Thread Rohit Harchandani
Hi all,
I have a solr index with 5,000,000 documents and my index size is 38GB. But
when I query for about 400,000 documents based on certain criteria, solr
searches it really quickly but does not return data for close to 2 minutes.
The unique key field is the only field i am requesting for. Also, I apply
an xslt transformation to the response to get a comma separated list of
unique keys. Is there a way to improve this speed?? Would sharding help in
this case?
I am currently using solr 4.0 beta in my application.
Thanks,
Rohit


Re: Retrieval of large number of documents

2012-09-12 Thread Paul Libbrecht
Isn't XSLT the bottleneck here?
I have not yet met an incremental XSLT processor, although I heard XSLT 1 
claimed it could be done in principle.

If you start to do this kind of processing, I think you have no other choice 
than write your own output method.

Paul


Le 12 sept. 2012 à 15:47, Rohit Harchandani a écrit :

 Hi all,
 I have a solr index with 5,000,000 documents and my index size is 38GB. But
 when I query for about 400,000 documents based on certain criteria, solr
 searches it really quickly but does not return data for close to 2 minutes.
 The unique key field is the only field i am requesting for. Also, I apply
 an xslt transformation to the response to get a comma separated list of
 unique keys. Is there a way to improve this speed?? Would sharding help in
 this case?
 I am currently using solr 4.0 beta in my application.
 Thanks,
 Rohit



Solr 4.0 Beta Release

2012-09-12 Thread samarth s
Hi All,

Would just like to verify if Solr 4.0 Beta has been released. Does the
following url give the official beta release:
http://www.apache.org/dyn/closer.cgi/lucene/solr/4.0.0-BETA

-- 
Regards,
Samarth


Re: Retrieval of large number of documents

2012-09-12 Thread Alexandre Rafalovitch
Have you tried asking for CSV as an output format? Then, you don't
have any XML wrappers and you will get your IDs one per line. I tried
it with returning about 40 rows and it was just fine.
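
For example, something like this (a hypothetical request; adjust host, handler 
and field name to your setup):

http://localhost:8983/solr/select?q=<your criteria>&fl=id&rows=400000&wt=csv&csv.header=false

That should stream back just the unique keys, one per line, with no XML or 
XSLT step in between.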

Regards,
   Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)


On Wed, Sep 12, 2012 at 9:52 AM, Paul Libbrecht p...@hoplahup.net wrote:
 Isn't XSLT the bottleneck here?
 I have not yet met an incremental XSLT processor, although I heard XSLT 1 
 claimed it could be done in principle.

 If you start to do this kind of processing, I think you have no other choice 
 than write your own output method.

 Paul


 Le 12 sept. 2012 à 15:47, Rohit Harchandani a écrit :

 Hi all,
 I have a solr index with 5,000,000 documents and my index size is 38GB. But
 when I query for about 400,000 documents based on certain criteria, solr
 searches it really quickly but does not return data for close to 2 minutes.
 The unique key field is the only field i am requesting for. Also, I apply
 an xslt transformation to the response to get a comma separated list of
 unique keys. Is there a way to improve this speed?? Would sharding help in
 this case?
 I am currently using solr 4.0 beta in my application.
 Thanks,
 Rohit



Re: Solr unique key can't be blank

2012-09-12 Thread Ahmet Arslan


--- On Wed, 9/12/12, Dotan Cohen dotanco...@gmail.com wrote:

 From: Dotan Cohen dotanco...@gmail.com
 Subject: Solr unique key can't be blank
 To: solr-user@lucene.apache.org
 Date: Wednesday, September 12, 2012, 5:06 PM
 Consider this simple schema:
 
  <?xml version="1.0" encoding="UTF-8"?>
  <schema name="uuidTest" version="0.1">
      <types>
          <fieldType name="uuid" class="solr.UUIDField" indexed="true" />
      </types>
      <fields>
          <field name="id" type="uuid" indexed="true" stored="true"
              required="true"/>
      </fields>
  </schema>
 
 When trying to upload it to Websolr I am getting this
 error:
 Solr unique key can't be blank
 
  I also tried adding this element to the XML, after </fields>:
  <uniqueKey>id</uniqueKey>
 
  However this did not help. What could be the issue? The code is
  taken verbatim from this page:
  http://wiki.apache.org/solr/UniqueKey
 
 Note that this is on a Solr 4 Alpha index. Thanks.

Hi Dotan,

Did you define the following update processor chain in solrconfig.xml ?
And did you reference it in an update handler?
 
<updateRequestProcessorChain name="uuid">
  <processor class="solr.UUIDUpdateProcessorFactory">
    <str name="fieldName">id</str>
  </processor>
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>



Re: Solr 4.0 Beta Release

2012-09-12 Thread Jack Krupansky
Yes, it has been released. Read the details here (including download 
instructions/links):

http://lucene.apache.org/solr/solrnews.html

-- Jack Krupansky

-Original Message- 
From: samarth s

Sent: Wednesday, September 12, 2012 9:54 AM
To: solr-user@lucene.apache.org
Subject: Solr 4.0 Beta Release

Hi All,

Would just like to verify if Solr 4.0 Beta has been released. Does the
following url give the official beta release:
http://www.apache.org/dyn/closer.cgi/lucene/solr/4.0.0-BETA

--
Regards,
Samarth 



Re: Semantic document format... standards?

2012-09-12 Thread Michael Della Bitta
Actually at my company, we do a lot of NLP work and we've ended up
using bespoke formats, formerly a FeatureStructure serialized to JSON,
but most recently in protobufs. Possibly not the answer you were
looking for, Otis, but at least it's a datapoint.

Michael Della Bitta


Appinions | 18 East 41st St., Suite 1806 | New York, NY 10017
www.appinions.com
Where Influence Isn’t a Game


On Wed, Sep 12, 2012 at 7:36 AM, Alexandre Rafalovitch
arafa...@gmail.com wrote:
 Otis,

 If you are doing Named Entity Recognition, you may want to look at the
 research area concerned with Named Entity Recognition. :-) In general,
 there is inline markup and standoff markup. You seem to be going for
 standoff/stand-alone markup. I am not clear though whether it is just
 'discovery' format or actual annotation format (with reference to
 where in the sentence it is with offsets or token ids).

 UIMA (which Solr integrate with already, right?), does NER so it must
 be using some sort of format.

 Also, TREC is one of the competitions and they provide marked-up
 datasets you might be able to learn something from:
 http://ilps.science.uva.nl/trec-entity/

 If you are not sure where to start with NER, you can look at my
 collection of papers, though most of them are probably too specific:
 http://www.citeulike.org/user/arafalov

 Finally,  if you have to deal with overlapping entities, there was an
 article about a month about some sort of general format. I can't seem
 to find the article right now, but I could try digging if you are
 still stuck.

 Regards,
 Alex.
 Personal blog: http://blog.outerthoughts.com/
 LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
 - Time is the quality of nature that keeps events from happening all
 at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
 book)


 On Tue, Sep 11, 2012 at 11:51 AM, Otis Gospodnetic
 otis_gospodne...@yahoo.com wrote:
 Hello,

 If I'm extracting named entities, topics, key phrases/tags, etc. from 
 documents and I want to have a representation of this document, what format 
 should I use? Are there any standard or at least common formats or 
 approaches people use in such situations?

 For example, the most straight forward format might be something like this:


  <document>
    <title>doc title</title>
    <keywords>meta keywords coming from the web page</keywords>
    <content>page meat</content>
    <entities>name entities recognized in the document</entities>
    <topics>topics extracted by the annotator</topics>
    <tags>tags extracted by the annotator</tags>
    <relations>relations extracted by the annotator</relations>
  </document>

 But this is a made up format - the XML tags above are just what somebody 
 happened to pick.

 Are there any standard or at least common formats for this?


 Thanks,
 Otis
 
 Performance Monitoring - Solr - ElasticSearch - HBase - 
 http://sematext.com/spm

 Search Analytics - http://sematext.com/search-analytics/index.html


Re: Solr unique key can't be blank

2012-09-12 Thread Dotan Cohen
On Wed, Sep 12, 2012 at 5:27 PM, Ahmet Arslan iori...@yahoo.com wrote:
 Hi Dotan,

 Did you define the following update processor chain in solrconfig.xml ?
 And did you reference it in an update handler?

 <updateRequestProcessorChain name="uuid">
   <processor class="solr.UUIDUpdateProcessorFactory">
     <str name="fieldName">id</str>
   </processor>
   <processor class="solr.RunUpdateProcessorFactory" />
 </updateRequestProcessorChain>


Thank you Ahmet! In fact, I did not know that the
updateRequestProcessorChain needed to be defined in solrconfig.xml and
I had tried to define it in schema.xml. I don't have access to
solrconfig.xml (I am using Websolr) but I will contact them about
adding it.

Thank you.

-- 
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com


Re: SolrCloud and Optimize

2012-09-12 Thread Walter Underwood
Do not run optimize. It is not necessary. Solr continually optimizes in the 
background. 

wunder

On Sep 11, 2012, at 11:15 PM, Nikhil Chhaochharia wrote:

 Hi,
 
 I am using a recent nightly of Solr 4 and have setup a simple SolrCloud 
 cluster of 2 shards without any replicas.  If I send the 'optimize' command, 
 then it is executed on the shards one-by-one instead of in parallel.
 
 Is this by design?How can I run optimize in parallel on all the shards?
 
 Thanks,
 Nikhil
 







Re: Solr unique key can't be blank

2012-09-12 Thread Jack Krupansky
The UniqueKey wiki was recently updated to indicate this new Solr 4.0 
requirement:


http://wiki.apache.org/solr/UniqueKey

in Solr 4, this field must be populated via 
solr.UUIDUpdateProcessorFactory


The changes you were given are contained on that updated wiki page.

-- Jack Krupansky

-Original Message- 
From: Dotan Cohen

Sent: Wednesday, September 12, 2012 10:43 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr unique key can't be blank

On Wed, Sep 12, 2012 at 5:27 PM, Ahmet Arslan iori...@yahoo.com wrote:

Hi Dotan,

Did you define the following update processor chain in solrconfig.xml ?
And did you reference it in an update handler?

<updateRequestProcessorChain name="uuid">
  <processor class="solr.UUIDUpdateProcessorFactory">
    <str name="fieldName">id</str>
  </processor>
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>



Thank you Ahmet! In fact, I did not know that the
updateRequestProcessorChain needed to be defined in solrconfig.xml and
I had tried to define it in schema.xml. I don't have access to
solrconfig.xml (I am using Websolr) but I will contact them about
adding it.

Thank you.

--
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com 



Re: Solr unique key can't be blank

2012-09-12 Thread Ahmet Arslan
 Thank you Ahmet! In fact, I did not know that the
 updateRequestProcessorChain needed to be defined in solrconfig.xml and
 I had tried to define it in schema.xml. I don't have access to
 solrconfig.xml (I am using Websolr) but I will contact them about
 adding it.

Please note that you need to reference it in the UpdateRequestHandler that you 
are using (this can be extracting, dataimport, etc.).

  <requestHandler name="/update" class="solr.UpdateRequestHandler">
    <!-- See below for information on defining
         updateRequestProcessorChains that can be used by name
         on each Update Request
      -->
    <lst name="defaults">
      <str name="update.chain">uuid</str>
    </lst>
  </requestHandler>



Re: [Solr4 beta] error 503 on commit

2012-09-12 Thread Radim Kolar



could not be solr able to close oldest warming searcher and replace it by
new one?

That approach can easily lead to starvation (i.e. you never get a new
searcher usable for queries).

It will not, if there is more than one warming searcher. Look at this scheme:

1. current in use searcher
2. 1st warming searcher
3. 2nd warming searcher

If a new warming searcher is needed, close (3) and create a new one in its
place. (2) will finish its work uninterrupted and will replace (1).


Authentication Not working in solrnet getting 401 error

2012-09-12 Thread Suneel Pandey
Hi,

I am trying to connect to an authenticated Solr instance. I have added the
latest SolrNet .dll but am getting an authentication issue. Please suggest
where I went wrong.

ISolrOperations<SolrProductCorecl> oSolrOperations = null;
const string core0url = "http://localhost:8080/solr/products";
const string core1url = "http://localhost:8080/solr/products";
var solrFacility = new SolrNetFacility(core0url);
var container = new WindsorContainer();
container.AddFacility("solr", solrFacility);
BasicAuthHttpWebRequestFactory OAuth = new
    BasicAuthHttpWebRequestFactory("djsrNPvHsUnBSETg", "x");
// override core1 components
const string core1Connection = "core1.connection";

container.Register(Component.For<ISolrConnection>().ImplementedBy<SolrConnection>().Named(core1Connection).Parameters(Castle.MicroKernel.Registration.Parameter.ForKey("serverURL").Eq(core1url)));

container.Register(Component.For(typeof(ISolrBasicOperations<SolrProductCorecl>),
        typeof(ISolrBasicReadOnlyOperations<SolrProductCorecl>))
    .ImplementedBy<SolrBasicServer<SolrProductCorecl>>()
    .ServiceOverrides(ServiceOverride.ForKey("connection").Eq(core1Connection)));

container.Register(Component.For(typeof(ISolrOperations<SolrProductCorecl>),
        typeof(ISolrReadOnlyOperations<SolrProductCorecl>))
    .ImplementedBy<SolrServer<SolrProductCorecl>>()
    .ServiceOverrides(ServiceOverride.ForKey("connection").Eq(core1Connection)));

container.Register(Component.For<ISolrQueryExecuter<SolrProductCorecl>>()
    .ImplementedBy<SolrQueryExecuter<SolrProductCorecl>>()
    .ServiceOverrides(ServiceOverride.ForKey("connection").Eq(core1Connection)));

// Authentication
container.Register(Component.For<IHttpWebRequestFactory>()
    .ImplementedBy<BasicAuthHttpWebRequestFactory>()
    .ServiceOverrides(ServiceOverride.ForKey("connection").Eq("")));

oSolrOperations =
    container.Resolve<ISolrOperations<SolrProductCorecl>>();
oSolrOperations.Ping();



-
Regards,

Suneel Pandey
Sr. Software Developer
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Authentication-Not-working-in-solrnet-getting-401-error-tp4007254.html
Sent from the Solr - User mailing list archive at Nabble.com.


Cannot parse :, using HTTP-URL as id

2012-09-12 Thread sysrq
Hi,

I defined a field "id" in my schema.xml and use it as a uniqueKey:
  <field name="id" type="string" indexed="true" stored="true" required="true" />
  <uniqueKey>id</uniqueKey>

I want to store URLs with a prefix in this field to be sure that every id is 
unique among websites. For example:
  domain_http://www.domain.com/?p=12345
  foo_http://foo.com
  bar_http://bar.com/?doc=452
I wrote a Java app, which uses Solrj to communicate with a running Solr 
instance. Solr (or Solrj, not sure about this) complains that it can't parse ":":
  Exception in thread "main" org.apache.solr.common.SolrException:
  org.apache.lucene.queryparser.classic.ParseException:
  Cannot parse 'id:domain_http://www.domain.com/?p=12345': Encountered ":" 
 at line 1, column 14.

How should I handle characters like ":" to solve this problem?

I already tried to escape the ":" like this:
  String id = "domain_http://www.domain.com/?p=12345".replaceAll(":", "\\\\:");
  ...
  document.addField("id", id);
  ...
But then Solr (or Solrj) complains again:
  Exception in thread "main" org.apache.solr.common.SolrException:
  org.apache.lucene.queryparser.classic.ParseException:
  Cannot parse 'id:domain_http\://www.domain.com/?p=12345': Lexical error at 
line 1, column 42.  Encountered: <EOF> after : "/?p=12345"
I use 4 backslashes ("\\\\") for the double-escape. The first escape is for Java 
itself, the second is for Solr to handle it (I guess).

So what is the correct or usual way to deal with special characters like : in 
Solr (or Solrj)? I don't know if Solr or Solrj is the problem, but I guess it 
is Solrj?


Re: Doubts in PathHierarchyTokenizer

2012-09-12 Thread Koji Sekiguchi

Use delimiter option instead of pattern for PathHierarchyTokenizerFactory:

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PathHierarchyTokenizerFactory
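
For example (an untested sketch; keep your own field type around it):

  <tokenizer class="solr.PathHierarchyTokenizerFactory" delimiter="|"/>

The pattern attribute is not something this factory reads; it takes delimiter 
(and optionally replace).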

koji
--
http://soleami.com/blog/starting-lab-work.html

(12/09/12 22:22), mechravi25 wrote:

Hi,

Im Using Solr 3.6.1 version and I have a field which is having values like

A|B|C
B|C|D|EE
A|C|B
A|B|D
..etc..

So, When I search for A|B, I should get documents starting with
A and A|B

To implement this, I've used PathHierarchyTokenizer for the above field as


<fieldType name="filep" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.PathHierarchyTokenizerFactory" pattern="|"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory" />
  </analyzer>
</fieldType>

But, When I use the solr analysis page to check if its being split on the
pipe symbol (|) on indexing, I see that its being taken as the entire
token and its not getting split on the delimiter (i.e. the searching is done
only for A|B in the above case)

I also tried using \| as the delimiter but also its not working.

Am I missing anything here? Or Will the Path Hierarchy not accept pipe
symbol (|) as delimiter?
Can anyone guide me on this?

Thanks a lot



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Doubts-in-PathHierarchyTokenizer-tp4007216.html
Sent from the Solr - User mailing list archive at Nabble.com.







Count disctint groups in grouping distributed

2012-09-12 Thread yriveiro
Hi, 

Is it possible to do a distinct group count in a grouping query done using
a sharded schema?

This issue, https://issues.apache.org/jira/browse/SOLR-3436, fixes the way
all groups returned in a distributed grouping operation are summed, but we
do not always want the sum; in some cases it is interesting to have the
distinct groups across shards.



-
Best regards
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Count-disctint-groups-in-grouping-distributed-tp4007257.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Cannot parse :, using HTTP-URL as id

2012-09-12 Thread Ahmet Arslan

Hello,

term query parser is your friend in this case. With this you don't need to 
escape anything.

SolrQuery query = new SolrQuery();

query.setQuery("{!term f=id}bar_http://bar.com/?doc=452");

--- On Wed, 9/12/12, sy...@web.de sy...@web.de wrote:

 From: sy...@web.de sy...@web.de
 Subject: Cannot parse :, using HTTP-URL as id
 To: solr-user@lucene.apache.org
 Date: Wednesday, September 12, 2012, 7:40 PM
 Hi,
 
 I defined a field id in my schema.xml and use it as an
 uniqueKey:
   field name=id type=string indexed=true
 stored=true required=true /
   uniqueKeyid/uniqueKey
 
 I want to store URLs with a prefix in this field to be sure
 that every id is unique among websites. For example:
   domain_http://www.domain.com/?p=12345
   foo_http://foo.com
   bar_http://bar.com/?doc=452
 I wrote a Java app, which uses Solrj to communicate with a
 running Solr instance. Solr (or Solrj, not sure about this)
 complains that it can't parse ::
   Exception in thread main
 org.apache.solr.common.SolrException:
  
 org.apache.lucene.queryparser.classic.ParseException:
   Cannot parse 'id:domain_http://www.domain.com/?p=12345': Encountered  : 
 :
  at line 1, column 14.
 
 How should I handle characters like : to solve this
 problem?
 
 I already tried to escape the : like this:
   String id = domain_http://www.domain.com/?p=12345.replaceAll(:,
 :));
   ...
   document.addField(id, id);
   ...
 But then Solr (or Solrj) complains again:
   Exception in thread main
 org.apache.solr.common.SolrException:
  
 org.apache.lucene.queryparser.classic.ParseException:
   Cannot parse
 'id:domain_http\://www.domain.com/?p=12345': Lexical error
 at line 1, column 42.  Encountered: EOF after :
 /?p=12345
 I use 4 backslashes () for double-escape. The first
 escape is for Java itself, the second is for Solr to handle
 it (I guess).
 
 So what is the correct or usual way to deal with special
 characters like : in Solr (or Solrj)? I don't know if Solr
 or Solrj is the problem, but I guess it is Solrj?



Unable to implement SolrNet Authentication.

2012-09-12 Thread Suneel Pandey
Hello,

I am working on Solr authentication with the help of the SolrNet dll and the
Windsor container and am getting some issues. Please advise me and provide
some links; this would be very helpful for me.



-
Regards,

Suneel Pandey
Sr. Software Developer
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Unable-to-implememnt-SolrNet-Authentication-tp4007259.html
Sent from the Solr - User mailing list archive at Nabble.com.


Beginner questions

2012-09-12 Thread Ken Clarke
Hi Folks,

I'm going to setup a SOLR search server for the first time.  Hope you don't 
mind a few beginner questions.  Perhaps a quick summary of how I intend to use 
it will help.

The SOLR server will be installed on a single VPS host and bound to an 
internal IP (192.168.?.?).  Search parameters will be received by a mod_perl 
script which will handle input validation, SOLR query language generation, 
submission to SOLR, SOLR response parsing and search request response.

Should I go with Beta 4 or stable 3?

Which servlet container would you suggest is the most efficient for my 
implementation?

I'm unclear if the JDK is required or I can just install a JRE.  I was 
guessing that Oracle's Java SE 7u7 would probably be the best implementation, 
yes/no?

How relevant is the Apache Solr 3 Enterprise Search Server book to 
working with version 4?  I couldn't find a list of differences anywhere.

Apreesh!
  
 Ken Clarke
 Contract Web Programmer / E-commerce Technologist


failure notice from zju.edu.cn

2012-09-12 Thread Ahmet Arslan
Hello All,


Sometimes (in a random manner) I get the following when I reply a post :

Hi. This is the deliver program at zju.edu.cn.
I'm afraid I wasn't able to deliver your message to the following addresses.
This is a permanent error; I've given up. Sorry it didn't work out.

new...@zju.edu.cn
reject mail 

David asked this question before : http://search-lucene.com/m/mlfOKh7WXn/
But I always use plain-text e-mails. Can anybody explain what this 
mailer-dae...@zju.edu.cn or new...@zju.edu.cn thing is? Are they subscribers of 
the solr-user mailing list? How can I prevent this? It seems to delay my mails 
appearing on the ML.

Thanks,
Ahmet


Re: PrecedenceQueryParser usage

2012-09-12 Thread Maciej Pestka
Thank you!

It seems that I managed to get it to work.
Just for future reference, I attach the short source code below. The jar should 
be placed under the core/lib folder.
Please let me know if you have any comments or if I got something incorrect...

public class PrecedenceQParserPlugin extends QParserPlugin {
    private static final Logger LOG =
        LoggerFactory.getLogger(PrecedenceQParserPlugin.class);

    @Override
    public void init(NamedList list) {
    }

    @Override
    public QParser createParser(String qstr, SolrParams localParams,
            SolrParams params, SolrQueryRequest req) {
        LOG.debug("creating new PrecedenceQParser:", new Object[]
            {qstr, localParams, params, req});
        return new PrecedenceQParser(qstr, localParams, params, req);
    }
}

class PrecedenceQParser extends QParser {
    private static final Logger LOG =
        LoggerFactory.getLogger(PrecedenceQParser.class);

    private final PrecedenceQueryParser parser;

    public PrecedenceQParser(String qstr, SolrParams localParams,
            SolrParams params, SolrQueryRequest req) {
        super(qstr, localParams, params, req);
        this.parser = new PrecedenceQueryParser();
    }

    @Override
    public Query parse() throws ParseException {
        LOG.debug("parse(): ", qstr);
        if (null == qstr) {
            return null;
        }
        final String defaultField =
            QueryParsing.getDefaultField(getReq().getSchema(), getParam(CommonParams.DF));
        try {
            return parser.parse(qstr, defaultField);
        } catch (QueryNodeException e) {
            throw new ParseException(e.getMessage(), e);
        }
    }
}


Best Regards
Maciej Pestka


On 10-09-2012 at 17:46, Ahmet Arslan wrote:
  In order for Solr to use this parser,
  you'll need to wrap it with a QParser and QParserPlugin
  implementations, then wire your implementation into
  solrconfig.xml. 
 
 SurroundQParserPlugin.java (api-4_0_0-BETA) can be an example of such 
 implementation.
 
 http://lucene.apache.org/solr/api-4_0_0-BETA/org/apache/solr/search/SurroundQParserPlugin.html





Re: Count disctint groups in grouping distributed

2012-09-12 Thread Jason Rutherglen
Distinct counting in a distributed environment would require de-duplication
en masse; use Hive or MapReduce instead.

On Wed, Sep 12, 2012 at 11:53 AM, yriveiro yago.rive...@gmail.com wrote:
 Hi,

 Exists the possibility of do a distinct group count in a grouping done using
 a sharding schema?

 This issue https://issues.apache.org/jira/browse/SOLR-3436 make a fixe in
 the way to sum all groups returned in a distributed grouping operation, but
 not always we want the sum, in some cases is interesting have the distinct
 groups between shards.



 -
 Best regards
 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Count-disctint-groups-in-grouping-distributed-tp4007257.html
 Sent from the Solr - User mailing list archive at Nabble.com.


RE: failure notice from zju.edu.cn

2012-09-12 Thread Steven A Rowe
I get the same thing, after nearly every email I send directly to the 
lucene/solr lists (as opposed to auto-sent JIRA posts).

I don't think it delays my messages though.

Steve

-Original Message-
From: Ahmet Arslan [mailto:iori...@yahoo.com] 
Sent: Wednesday, September 12, 2012 1:24 PM
To: solr-user@lucene.apache.org
Subject: failure notice from zju.edu.cn

Hello All,


Sometimes (in a random manner) I get the following when I reply a post :

Hi. This is the deliver program at zju.edu.cn.
I'm afraid I wasn't able to deliver your message to the following addresses.
This is a permanent error; I've given up. Sorry it didn't work out.

new...@zju.edu.cn
reject mail 

David asked this question before : http://search-lucene.com/m/mlfOKh7WXn/
But I always use plain text e-mails. Can anybody explain what this 
mailer-dae...@zju.edu.cn or new...@zju.edu.cn thing is? Are they subscribers of 
solr-user Mailing List? How can I prevent this? This seems delaying my mails 
appearing on ML.

Thanks,
Ahmet


Aw: Re: Cannot parse :, using HTTP-URL as id

2012-09-12 Thread sysrq
 term query parser is your friend in this case. With this you don't need to 
 escape anything.
   SolrQuery query = new SolrQuery();
   query.setQuery("{!term f=id}bar_http://bar.com/?doc=452");

But how can I *store* a document with a URL as a field value? E.g. 
"domain_http://www.domain.com/?p=12345"
The term query parser may be able to *retrieve* field values with a ":", but 
my current problem is that I can't store a value with ":" with *Solrj*, the 
Java library to communicate with Solr.

 --- On Wed, 9/12/12, sy...@web.de sy...@web.de wrote:
 
  From: sy...@web.de sy...@web.de
  Subject: Cannot parse :, using HTTP-URL as id
  To: solr-user@lucene.apache.org
  Date: Wednesday, September 12, 2012, 7:40 PM
  Hi,
  
  I defined a field id in my schema.xml and use it as an
  uniqueKey:
    field name=id type=string indexed=true
  stored=true required=true /
    uniqueKeyid/uniqueKey
  
  I want to store URLs with a prefix in this field to be sure
  that every id is unique among websites. For example:
    domain_http://www.domain.com/?p=12345
    foo_http://foo.com
    bar_http://bar.com/?doc=452
  I wrote a Java app, which uses Solrj to communicate with a
  running Solr instance. Solr (or Solrj, not sure about this)
  complains that it can't parse ::
    Exception in thread main
  org.apache.solr.common.SolrException:
   
  org.apache.lucene.queryparser.classic.ParseException:
    Cannot parse 'id:domain_http://www.domain.com/?p=12345': Encountered  
  : :
   at line 1, column 14.
  
  How should I handle characters like : to solve this
  problem?
  
  I already tried to escape the : like this:
    String id = domain_http://www.domain.com/?p=12345.replaceAll(:,
  :));
    ...
    document.addField(id, id);
    ...
  But then Solr (or Solrj) complains again:
    Exception in thread main
  org.apache.solr.common.SolrException:
   
  org.apache.lucene.queryparser.classic.ParseException:
    Cannot parse
  'id:domain_http\://www.domain.com/?p=12345': Lexical error
  at line 1, column 42.  Encountered: EOF after :
  /?p=12345
  I use 4 backslashes () for double-escape. The first
  escape is for Java itself, the second is for Solr to handle
  it (I guess).
  
  So what is the correct or usual way to deal with special
  characters like : in Solr (or Solrj)? I don't know if Solr
  or Solrj is the problem, but I guess it is Solrj?
 
 


Aw: Re: Cannot parse :, using HTTP-URL as id

2012-09-12 Thread sysrq
my bad, using term query parser works, thanks ahmet.


 Sent: Wednesday, 12 September 2012 at 19:40
 From: sy...@web.de
 To: solr-user@lucene.apache.org
 Subject: Aw: Re: Cannot parse :, using HTTP-URL as id

  term query parser is your friend in this case. With this you don't need to 
  escape anything.
SolrQuery query = new SolrQuery();
query.setQuery({!term f=id}bar_http://bar.com/?doc=452;);
 
 But how can I *store* a document with an URL as a field value ? E.g. 
 domain_http://www.domain.com/?p=12345;
 The term query parser may be able to *retrieve* field values with an :, 
 but my current problem is that I can't store value with : with *Solrj*, the 
 Java library to communicate with Solr.
 
  --- On Wed, 9/12/12, sy...@web.de sy...@web.de wrote:
  
   From: sy...@web.de sy...@web.de
   Subject: Cannot parse :, using HTTP-URL as id
   To: solr-user@lucene.apache.org
   Date: Wednesday, September 12, 2012, 7:40 PM
   Hi,
   
   I defined a field id in my schema.xml and use it as an
   uniqueKey:
     field name=id type=string indexed=true
   stored=true required=true /
     uniqueKeyid/uniqueKey
   
   I want to store URLs with a prefix in this field to be sure
   that every id is unique among websites. For example:
     domain_http://www.domain.com/?p=12345
     foo_http://foo.com
     bar_http://bar.com/?doc=452
   I wrote a Java app, which uses Solrj to communicate with a
   running Solr instance. Solr (or Solrj, not sure about this)
   complains that it can't parse ::
     Exception in thread main
   org.apache.solr.common.SolrException:
    
   org.apache.lucene.queryparser.classic.ParseException:
     Cannot parse 'id:domain_http://www.domain.com/?p=12345': Encountered  
   : :
at line 1, column 14.
   
   How should I handle characters like : to solve this
   problem?
   
   I already tried to escape the : like this:
     String id = domain_http://www.domain.com/?p=12345.replaceAll(:,
   :));
     ...
     document.addField(id, id);
     ...
   But then Solr (or Solrj) complains again:
     Exception in thread main
   org.apache.solr.common.SolrException:
    
   org.apache.lucene.queryparser.classic.ParseException:
     Cannot parse
   'id:domain_http\://www.domain.com/?p=12345': Lexical error
   at line 1, column 42.  Encountered: EOF after :
   /?p=12345
   I use 4 backslashes () for double-escape. The first
   escape is for Java itself, the second is for Solr to handle
   it (I guess).
   
   So what is the correct or usual way to deal with special
   characters like : in Solr (or Solrj)? I don't know if Solr
   or Solrj is the problem, but I guess it is Solrj?
  
  
 


3.6.1 - Suggester and spellcheker Implementation

2012-09-12 Thread Sujatha Arun
Hi ,

If I am looking to implement the Suggester with 3.6.1, I believe this creates
its own index. Now, if I also want to use the spellchecker, would it be using
the same index as the suggester?

Regards
Sujatha


Re: Beginner questions

2012-09-12 Thread Ahmet Arslan
     Should I go with Beta 4 or stable 3?

I would use Solr 4, since this is a first-time installation.

     Which servlet container would you suggest is
 the most efficient for my implementation?

Folks use both jetty and tomcat.

     I'm unclear if the JDK is required or I can
 just install a JRE.  I was guessing that Oracle's Java
 SE 7u7 would probably be the best implementation, yes/no?

README.txt says 
Download the Java SE 6 JDK (Java Development Kit) 
 You will need the JDK installed, and the $JAVA_HOME/bin (Windows: 
%JAVA_HOME%\bin) folder included on your command path. To test this, issue a 
java -version command from your shell (command prompt) and verify that the 
Java version is 1.6 or later.
 
     How relevant is the Apache Solr 3 Enterprise
 Search Server book to working with version 4?  I
 couldn't find a list of differences anywhere.

I suggest you read this book (without worrying about Solr 4). It is easy to 
understand and it covers lots of things.


Re: Beginner questions

2012-09-12 Thread Alexandre Rafalovitch
I would start with version 4, hands down.

I started with Solr 4 alpha and have moved to beta. Final can't be too
far behind. So far, it has been extremely stable for me.

And unless you are going into production in a next week, it will
probably be final while you are learning.

Regards,
   Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)


On Wed, Sep 12, 2012 at 1:20 PM, Ken Clarke k_cla...@perlprogrammer.net wrote:
 Hi Folks,

 I'm going to setup a SOLR search server for the first time.  Hope you 
 don't mind a few beginner questions.  Perhaps a quick summary of how I intend 
 to use it will help.

 The SOLR server will be installed on a single VPS host and bound to a 
 internal IP (192.168.?.?).  Search parameters will be received by a mod_perl 
 script which will handle input validation, SOLR query language generation, 
 submition to SOLR, SOLR response parsing and search request response.

 Should I go with Beta 4 or stable 3?

 Which servlet container would you suggest is the most efficient for my 
 implementation?

 I'm unclear if the JDK is required or I can just install a JRE.  I was 
 guessing that Oracle's Java SE 7u7 would probably be the best implementation, 
 yes/no?

 How relevant is the Apache Solr 3 Enterprise Search Server book to 
 working with version 4?  I couldn't find a list of differences anywhere.

 Apreesh!

 Ken Clarke
 Contract Web Programmer / E-commerce Technologist


Re: 3.6.1 - Suggester and spellcheker Implementation

2012-09-12 Thread Otis Gospodnetic
Hi Sujatha,

No, suggester and spellchecker are separate beasts.

Otis
-- 
Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring - http://sematext.com/spm/index.html


On Wed, Sep 12, 2012 at 3:18 PM, Sujatha Arun suja.a...@gmail.com wrote:
 Hi ,

 If I am looking to implement Suggester Implementation with 3.6.1 ,I beleive
 this creates it own index , now If I want to also use the spellcheck  also
 ,would it be using the same index as suggester?

 Regards
 Sujatha


Hey solr-user MODERATOR (was: Re: failure notice from zju.edu.cn)

2012-09-12 Thread Otis Gospodnetic
Same here.  Changed subject to attract more attention.

Otis

On Wed, Sep 12, 2012 at 1:34 PM, Steven A Rowe sar...@syr.edu wrote:
 I get the same thing, after nearly every email I send directly to the 
 lucene/solr lists (as opposed to auto-sent JIRA posts).

 I don't think it delays my messages though.

 Steve

 -Original Message-
 From: Ahmet Arslan [mailto:iori...@yahoo.com]
 Sent: Wednesday, September 12, 2012 1:24 PM
 To: solr-user@lucene.apache.org
 Subject: failure notice from zju.edu.cn

 Hello All,


 Sometimes (in a random manner) I get the following when I reply a post :

 Hi. This is the deliver program at zju.edu.cn.
 I'm afraid I wasn't able to deliver your message to the following addresses.
 This is a permanent error; I've given up. Sorry it didn't work out.

 new...@zju.edu.cn
 reject mail 

 David asked this question before : http://search-lucene.com/m/mlfOKh7WXn/
 But I always use plain text e-mails. Can anybody explain what this 
 mailer-dae...@zju.edu.cn or new...@zju.edu.cn thing is? Are they subscribers 
 of solr-user Mailing List? How can I prevent this? This seems delaying my 
 mails appearing on ML.

 Thanks,
 Ahmet


Re: TikaException: Unsupported AutoCAD drawing version

2012-09-12 Thread Ahmet Arslan
 I am indexing data with Solr Cell,
 using mainly the code from here: 
 http://wiki.apache.org/solr/ContentStreamUpdateRequestExample
 
 But in my Solr server i got the TikaException followed by a
 solrexception 
 in my solrj programm.
 
 Is there a way to suppress this and similar exceptions
 directly in the 
 Server?

Taken from : 
http://search-lucene.com/m/ZOs8xGNL6j2/TikaException+ignoresubj=ignoreTikaException+value

  <requestHandler name="/update/extract"
      startup="lazy"
      class="solr.extraction.ExtractingRequestHandler">
    <lst name="defaults">
    :
      <bool name="ignoreTikaException">true</bool>
    :
    </lst>
  </requestHandler>


Is it possible to do an if statement in a Solr query?

2012-09-12 Thread Gustav
Hello everyone, I'm working on an e-commerce website and using Solr as my
search engine; I'm really enjoying its functionality and the search
options/performance. 
But I am stuck in a kind of tricky scenario... This is what happens:

I have a medicine web-store, where I indexed all necessary products in my
Solr index. 
But when I search for some medicine, following my business rules, I have to
verify whether the result of my search contains any original medicine. If there
is any, then I wouldn't show the generics of this respective medicine; on
the other hand, if there wasn't any original product in the result I would
have to return its generics.
I'm currently returning the original and generics. Is there a way to do this
kind of checking in Solr?

Thanks! :)



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Is-it-possible-to-do-an-if-statement-in-a-Solr-query-tp4007311.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Is it possible to do an if statement in a Solr query?

2012-09-12 Thread Walter Underwood
You may be able to do this with grouping. Group on the medicine family, and 
only show the Original if there are multiple items in the family.
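
A hypothetical sketch of the request side (the field names medicine_family and 
is_original are made up; they assume each document carries them):

...&group=true&group.field=medicine_family&group.limit=10&group.sort=is_original desc

Then, per group, show only the original when one is present, otherwise show the 
generics.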

wunder

On Sep 12, 2012, at 2:09 PM, Gustav wrote:

 Hello everyone, I'm working on an e-commerce website and using Solr as my
 Search Engine, im really enjoying its funcionality and the search
 options/performance. 
 But i am stucky in a kinda tricky cenario... That what happens:
 
 I Have  a medicine web-store, where i indexed all necessary products in my
 Index Solr. 
 But when i search for some medicine, following my business rules, i have to
 verify if the result of my search contains any Original medicine, if there
 is any, then i wouldn't show the generics of this respective medicine, on
 the other hand, if there wasnt any original product in the result i would
 have to return its generics.
 Im currently returning the original and generics, is there a way to do this
 kind of checking in solr?
 
 Thanks! :)
 






Re: Is it possible to do an if statement in a Solr query?

2012-09-12 Thread Jack Krupansky
You could implement a custom search component with that logic, if you 
don't mind the complexity of writing Java code that runs inside the Solr 
environment. Otherwise, just implement that logic in your app. Or, or 
implement an app server which sits between Solr and your app.


http://wiki.apache.org/solr/SearchComponent

-- Jack Krupansky

-Original Message- 
From: Gustav

Sent: Wednesday, September 12, 2012 5:09 PM
To: solr-user@lucene.apache.org
Subject: Is it possible to do an if statement in a Solr query?

Hello everyone, I'm working on an e-commerce website and using Solr as my
Search Engine, im really enjoying its funcionality and the search
options/performance.
But i am stucky in a kinda tricky cenario... That what happens:

I Have  a medicine web-store, where i indexed all necessary products in my
Index Solr.
But when i search for some medicine, following my business rules, i have to
verify if the result of my search contains any Original medicine, if there
is any, then i wouldn't show the generics of this respective medicine, on
the other hand, if there wasnt any original product in the result i would
have to return its generics.
Im currently returning the original and generics, is there a way to do this
kind of checking in solr?

Thanks! :)



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Is-it-possible-to-do-an-if-statement-in-a-Solr-query-tp4007311.html
Sent from the Solr - User mailing list archive at Nabble.com. 



Can solr return matched fields?

2012-09-12 Thread Dan Foley
Is there a way for Solr to tell me what fields the query matched,
other than turning debug on?

I'd like my application to take different actions based on what fields
were matched.

-- 
Dan Foley
Owner - PHP Web Developer
___
Micamedia.com - PHP Web Development


Re: Can solr return matched fields?

2012-09-12 Thread Casey Callendrello
What about using the FastVectorHighlighter? It should get you what
you're looking for (fields with matches) without much of a query-time
performance impact.
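
A rough sketch of the request side (assumes the fields you care about are 
stored and indexed with termVectors, termPositions and termOffsets enabled, 
which the FastVectorHighlighter needs):

...&hl=true&hl.fl=field_a,field_b&hl.useFastVectorHighlighter=true

Any field that comes back with a highlighting snippet for a given document is 
a field the query matched on.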

--Casey


On 9/12/12 3:01 PM, Dan Foley wrote:
 is there a way for solr to tell me what fields the query matched,
 other then turning debug on?

 I'd like my application to take different actions based on what fields
 were matched.







How to post atomic updates using xml

2012-09-12 Thread jimtronic
There's a good intro to atomic updates here:
http://yonik.com/solr/atomic-updates/ but it does not describe how to
structure the updates using xml.

Anyone have any idea on how these would look?

Thanks! Jim



--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-post-atomic-updates-using-xml-tp4007323.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: How to post atomic updates using xml

2012-09-12 Thread jimtronic
Figured it out.

in JSON: 

 {"id"     : "book1",
  "author" : {"set" : "Neal Stephenson"}
 }

in XML:

<add><doc><field name="id">book1</field><field name="author" update="set">Neal
Stephenson</field></doc></add>

This seems to work.
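
For reference, a rough SolrJ equivalent (an untested sketch; assumes a Solr 4.0 
server at the URL below and that book1 already exists in the index):

import java.util.Collections;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class AtomicUpdate {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr");
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "book1");
        // The map key names the atomic operation ("set", "add", "inc").
        doc.addField("author", Collections.singletonMap("set", "Neal Stephenson"));
        server.add(doc);
        server.commit();
        server.shutdown();
    }
}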

Jim



--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-post-atomic-updates-using-xml-tp4007323p4007325.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Can solr return matched fields?

2012-09-12 Thread Jack Krupansky
But presumably matched fields relates to indexed fields, which might not 
have stored values.


-- Jack Krupansky

-Original Message- 
From: Casey Callendrello

Sent: Wednesday, September 12, 2012 6:15 PM
To: solr-user@lucene.apache.org
Subject: Re: Can solr return matched fields?



Want a multi-datacenter environment with Solr?

2012-09-12 Thread Stephanie Huynh
Does anyone want me to send them a white paper on having a
multi-datacenter environment with Solr?

Best,
Stephanie


Re: Want a multi-datacenter environment with Solr?

2012-09-12 Thread Otis Gospodnetic
Is that with plain Apache Solr or Datastax?

Otis
--
Performance Monitoring - http://sematext.com/spm
On Sep 12, 2012 7:55 PM, Stephanie Huynh st...@datastax.com wrote:

 Does anyone want me to send them a white paper on having a
 multi-datacenter environment with Solr?

 Best,
 Stephanie



How does Solr handle overloads so well?

2012-09-12 Thread Mike Gagnon
Hi,

I have been studying how server software responds to requests that cause
CPU overloads (such as infinite loops).

In my experiments I have observed that Solr performs unusually well when
subjected to such loads. Every other piece of web software I've
experimented with drops to zero service under such loads. Do you know how
Solr achieves such good performance? I am guessing that when Solr is
overloaded it sheds load to make room for incoming requests, but I could not
find any documentation that describes Solr's overload strategy.

Experimental setup: I ran Solr 3.1 on a 12-core machine with 12 GB ram,
using it index and search about 10,000 pages on MediaWiki. I test both
Solr+Jetty and Solr+Tomcat. I submitted a variety of Solr queries at a rate
of 300 requests per second. At the same time, I submitted overload
requests at a rate of 60 requests per second. Each overload request caused
an infinite loop in Solr via https://issues.apache.org/jira/browse/SOLR-2631.

With Jetty about 70% of non-overload requests completed --- 95% of requests
completing within 0.6 seconds.
With Tomcat about 34% of non-overload requests completed --- 95% of
requests completing within 0.6 seconds.

I also ran Solr+Jetty with non-overload requests coming in 65 requests per
second (overload requests remain at 60 requests per second). In this
workload, the completion rate drops to 15% and the 95th percentile latency
increases to 25.

Cheers,
Mike Gagnon


Re: How does Solr handle overloads so well?

2012-09-12 Thread Otis Gospodnetic
Hm, I'm not sure how to approach this. Solr is not alone here - there's a
container like Jetty, Solr inside it, and Lucene inside Solr.
Next, that index is really small, so there is no disk IO. The request
rate is also not super high and if you did this over a fast connection then
there are also no issues with slow response writing or with having lots of
concurrent connections or running out of threads ...

...so it's not really that surprising solr keeps working :)

But...tell us more.

Otis
--
Performance Monitoring - http://sematext.com/spm
On Sep 12, 2012 8:51 PM, Mike Gagnon mikegag...@gmail.com wrote:

 Hi,

 I have been studying how server software responds to requests that cause
 CPU overloads (such as infinite loops).

 In my experiments I have observed that Solr performs unusually well when
 subjected to such loads. Every other piece of web software I've
 experimented with drops to zero service under such loads. Do you know how
 Solr achieves such good performance? I am guessing that when Solr is
 overload sheds load to make room for incoming requests, but I could not
 find any documentation that describes Solr's overload strategy.

 Experimental setup: I ran Solr 3.1 on a 12-core machine with 12 GB ram,
 using it index and search about 10,000 pages on MediaWiki. I test both
 Solr+Jetty and Solr+Tomcat. I submitted a variety of Solr queries at a rate
 of 300 requests per second. At the same time, I submitted overload
 requests at a rate of 60 requests per second. Each overload request caused
 an infinite loop in Solr via
 https://issues.apache.org/jira/browse/SOLR-2631.

 With Jetty about 70% of non-overload requests completed --- 95% of requests
 completing within 0.6 seconds.
 With Tomcat about 34% of non-overload requests completed --- 95% of
 requests completing within 0.6 seconds.

 I also ran Solr+Jetty with non-overload requests coming in 65 requests per
 second (overload requests remain at 60 requests per second). In this
 workload, the completion rate drops to 15% and the 95th percentile latency
 increases to 25.

 Cheers,
 Mike Gagnon



Re: Is it possible to do an if statement in a Solr query?

2012-09-12 Thread Amit Nithian
If the fact that it's original vs. generic is a field (is_original,
0/1), can you sort by is_original? Similarly, could you put a huge boost
on is_original in the dismax so that documents matching on is_original
score higher than those that aren't original? Or is your goal to not
show generics *at all*?


On Wed, Sep 12, 2012 at 2:47 PM, Walter Underwood wun...@wunderwood.org wrote:
 You may be able to do this with grouping. Group on the medicine family, and 
 only show the Original if there are multiple items in the family.

 wunder

 On Sep 12, 2012, at 2:09 PM, Gustav wrote:

 Hello everyone, I'm working on an e-commerce website and using Solr as my
 Search Engine, im really enjoying its funcionality and the search
 options/performance.
 But i am stucky in a kinda tricky cenario... That what happens:

 I Have  a medicine web-store, where i indexed all necessary products in my
 Index Solr.
 But when i search for some medicine, following my business rules, i have to
 verify if the result of my search contains any Original medicine, if there
 is any, then i wouldn't show the generics of this respective medicine, on
 the other hand, if there wasnt any original product in the result i would
 have to return its generics.
 Im currently returning the original and generics, is there a way to do this
 kind of checking in solr?

 Thanks! :)







Re: SolrCloud fail over

2012-09-12 Thread andy
Cool, thanks Mark!

Mark Miller-3 wrote
 
 Either setup a load balancer, or use the SolrCloud solrj client
 CloudSolrServer - it takes a comma separated list of zk servers rather
 than a solr url.
 
 On Tue, Sep 11, 2012 at 10:17 PM, andy <yhlweb@...> wrote:
 I know fail over is available in solr4.0 right now, if one server
 crashes,other servers also support query,I set up a solr cloud like this
 http://lucene.472066.n3.nabble.com/file/n4007117/Selection_028.png

 I use http://localhost:8983/solr/collection1/select?q=*%3A*wt=xml for
 query
 at first, if the node  8983 crashes, I have to access other nodes for
 query
 like http://localhost:8900/solr/collection1/select?q=*%3A*wt=xml

 but I use the nodes url in the solrj, how to change the request url
 dynamically?
 does SolrCloud support something like virtual ip address? for example I
 use
 url http://collections1 in the solrj, and forward the request to
 available
 url automatically.




 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/SolrCloud-fail-over-tp4007117.html
 Sent from the Solr - User mailing list archive at Nabble.com.
 
 
 
 -- 
 - Mark
 




--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrCloud-fail-over-tp4007117p4007360.html
Sent from the Solr - User mailing list archive at Nabble.com.