solr replication
Hi all, I am trying to set up Solr replication by following the http://wiki.apache.org/solr/SolrReplication tutorial. Everything is working OK. My question is: should I define two SolrServer instances in SolrJ (one for the master and one for the slave) in order to direct indexing to the master and queries to the slave? If I want to add a new slave in the future, will I have to change code? What is the recommended way? Thanks in advance. Regards
Re: solr replication
Here is the way I see it (and implemented it). While using the SolrJ API you have to send:
- indexing commands to your indexing Solr instance (the master), for example http://myMaster:80/myCore/
- query commands to your search Solr instance (a slave). You may have several slaves, and you can also put a broker in front of them to load-balance between http://mySlave1:80/myCore/ http://mySlave2:80/myCore/ ...
Normally you do not need any changes in code: replication happens automatically and is defined in your solrconfig.xml configuration file. -- View this message in context: http://lucene.472066.n3.nabble.com/solr-replication-tp3687106p3687168.html Sent from the Solr - User mailing list archive at Nabble.com.
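A minimal way to express that split in code is to keep one handle per role. This is only a sketch: the class and method names are invented for illustration, and in SolrJ each URL would typically back its own CommonsHttpSolrServer instance.

```java
// Sketch only: hold one URL for the indexing (master) instance and one
// for the search (slave) instance, and pick by operation type.
public class SolrRouter {
    private final String masterUrl; // receives add/delete/commit
    private final String slaveUrl;  // receives queries

    public SolrRouter(String masterUrl, String slaveUrl) {
        this.masterUrl = masterUrl;
        this.slaveUrl = slaveUrl;
    }

    /** Index/update requests go to the master. */
    public String urlForIndexing() { return masterUrl; }

    /** Query requests go to the slave. */
    public String urlForQuery() { return slaveUrl; }
}
```

Each URL then backs its own SolrServer object, so indexing code and query code never point at the wrong node.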
Re: index-time over boosted
Hi, it worked (I'm using Solr-3.4.0, not that it matters)!! I'll try to figure out what went wrong ...with my limited skills. The solution omitNorms=true works for now but it's not a long term solution in my opinion. I also need to figure out how to make all that work. Thanks again Jan!! Remi On Tue, Jan 24, 2012 at 5:58 PM, Jan Høydahl jan@cominvent.com wrote: Hi, Well, I think you do it right, but get tricked by either editing the wrong file, a typo or browser caching. Why not try to start with a fresh Solr3.5.0, start the example app, index all exampledocs, search for Podcasts, you get one hit, in fields text and features. Then change solr/example/solr/conf/schema.xml and add omitNorms=true to these two fields. Then stop Solr, delete your index, start Solr, re-index the docs and try again. fieldNorm is now 1.0. Once you get that working you can start debugging where you got it wrong in your own setup. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Solr Training - www.solrtraining.com On 24. jan. 2012, at 14:55, remi tassing wrote: Hello, thanks for helping out Jan, I really appreciate that! 
These are full explains of two results: Result#1.-- 3.0412199E-5 = (MATCH) max of: 3.0412199E-5 = (MATCH) weight(content:mobil broadband^0.5 in 19081), product of: 0.13921623 = queryWeight(content:mobil broadband^0.5), product of: 0.5 = boost 6.3531075 = idf(content: mobil=5270 broadband=2392) 0.043826185 = queryNorm 2.1845297E-4 = fieldWeight(content:mobil broadband in 19081), product of: 3.6055512 = tf(phraseFreq=13.0) 6.3531075 = idf(content: mobil=5270 broadband=2392) 9.536743E-6 = fieldNorm(field=content, doc=19081) Result#2.- 2.6991445E-5 = (MATCH) max of: 2.6991445E-5 = (MATCH) weight(content:mobil broadband^0.5 in 15306), product of: 0.13921623 = queryWeight(content:mobil broadband^0.5), product of: 0.5 = boost 6.3531075 = idf(content: mobil=5270 broadband=2392) 0.043826185 = queryNorm 1.9388145E-4 = fieldWeight(content:mobil broadband in 15306), product of: 1.0 = tf(phraseFreq=1.0) 6.3531075 = idf(content: mobil=5270 broadband=2392) 3.0517578E-5 = fieldNorm(field=content, doc=15306) Remi On Tue, Jan 24, 2012 at 3:38 PM, Jan Høydahl jan@cominvent.com wrote: That looks right. Can you restart your Solr, do a new search with debugQuery=true and copy/paste the full EXPLAIN output for your query? -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Solr Training - www.solrtraining.com On 24. jan. 2012, at 13:22, remi tassing wrote: Any idea? This is a snippet of my schema.xml now: ?xml version=1.0 encoding=UTF-8 ? !-- Licensed to the Apache Software Foundation (ASF) under one or more ... 
!-- fields for index-basic plugin -- field name=host type=url stored=false indexed=true/ field name=site type=string stored=false indexed=true/ field name=url type=url stored=true indexed=true required=true/ field name=content type=text stored=true indexed=true omitNorms=true/ field name=cache type=string stored=true indexed=false/ field name=tstamp type=long stored=true indexed=false/ !-- fields for index-anchor plugin -- field name=anchor type=string stored=true indexed=true multiValued=true/ ... !-- uncomment the following to ignore any fields that don't already match an existing field name or dynamic field, rather than reporting them as an error. alternately, change the type=ignored to some other type e.g. text if you want unknown fields indexed and/or stored by default -- !--dynamicField name=* type=ignored multiValued=true /-- /fields !-- Field to use to determine and enforce document uniqueness. Unless this field is marked with required=false, it will be a required field -- uniqueKeyid/uniqueKey !-- field for the QueryParser to use when an explicit fieldname is absent ... /schema Remi On Sun, Jan 22, 2012 at 6:31 PM, remi tassing tassingr...@gmail.com wrote: Hi, I got wrong in beginning but putting omitNorms in the query url. Now following your advice, I merged the schema.xml from Nutch and Solr and made sure omitNorms was set to true for the content, just as you said. Unfortunately the problem remains :-( On Thursday, January 19, 2012, Jan Høydahl jan@cominvent.com wrote: Hi, The schema you pasted in your mail is NOT Solr3.5's default example schema. Did you get it from the Nutch project? And the omitNorms parameter is supposed to go in the field tag in schema.xml, and the content field in the example schema does not have omitNorms=true. Try to change field name=content type=text stored=false
Re: solr replication
Then, as you say, shouldn't I define three SolrServer instances using SolrJ? One solrMasterServer for indexing, and solrSlaveServer1 or solrSlaveServer2 for querying? On Wed, Jan 25, 2012 at 11:09 AM, darul daru...@gmail.com wrote: Here is the way I see it (and implemented it). While using the SolrJ API you have to send: - indexing commands to your indexing Solr instance (the master), for example http://myMaster:80/myCore/ - query commands to your search Solr instance (a slave). You may have several slaves, and you can also put a broker in front of them to load-balance between http://mySlave1:80/myCore/ http://mySlave2:80/myCore/ ... Normally you do not need any changes in code: replication happens automatically and is defined in your solrconfig.xml configuration file.
Re: solr replication
You may define your specific configuration as a grid of all your Solr instances and then, using SolrJ and CommonsHttpSolrServer, choose the right URL depending on whether it is an indexing or a search task. -- View this message in context: http://lucene.472066.n3.nabble.com/solr-replication-tp3687106p3687208.html Sent from the Solr - User mailing list archive at Nabble.com.
Difference between #indexed documents and #results in *:* query
Hello, I have seen that I am getting 913 documents indexed:

<str name="Total Requests made to DataSource">1</str>
<str name="Total Rows Fetched">913</str>
<str name="Total Documents Skipped">0</str>
<str name="Full Dump Started">2012-01-25 10:22:39</str>
<str name="">Indexing completed. Added/Updated: 913 documents. Deleted 0 documents.</str>
<str name="Committed">2012-01-25 10:22:44</str>
<str name="Optimized">2012-01-25 10:22:44</str>
<str name="Total Documents Processed">913</str>
<str name="Time taken ">0:0:5.10</str>

... and, when I do a search for *:* (all documents) I get 383 results:

<result name="response" numFound="383" start="0" maxScore="1.0">

Is this normal? If it is not, do you know why it could be this way and what could I do to fix it? Thanks in advance! -- View this message in context: http://lucene.472066.n3.nabble.com/Difference-between-indexed-documents-and-results-in-query-tp3687217p3687217.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: solr replication
Thank you for your response. What do you mean by grid? Can you please send me an example or a link? On Wed, Jan 25, 2012 at 11:30 AM, darul daru...@gmail.com wrote: You may define your specific configuration as a grid of all your Solr instances and then, using SolrJ and CommonsHttpSolrServer, choose the right URL depending on whether it is an indexing or a search task.
solr FieldCollapsing, label and locale parameter
Hi, I'm using FieldCollapsing to group the results. Example: I search for *:* and group by names, like:

http://localhost:port/solr/select/?q=*:* group=true group.limit=200 group.query=Jim group.query=Jon group.query=Frank Sinatra

It looks like Solr runs (internally) a separate query for every name. (Whatever.) The point is that I have to change the local parameters of the search in order to set a different search operator (from OR to AND). To get valid results I need a query like this:

http://localhost:port/solr/select/?q=*:* group=true group.limit=200 group.query={!q.op=AND defType=edismax}Jim group.query={!q.op=AND defType=edismax}Jon group.query={!q.op=AND defType=edismax}Frank Sinatra

This works very well. The problem is that Solr returns the label of the group including the local parameters:

<lst name="grouped"> <lst name="{!q.op=AND defType=edismax}Frank Sinatra"> (wrong label) <int name="matches">785</int> <result name="doclist" numFound="10" start="0"> <doc [...]

A valid result would be:

<lst name="grouped"> <lst name="Frank Sinatra"> <int name="matches">785</int> <result name="doclist" numFound="10" start="0"> <doc [...]

Is there a way to change the label to the real term Solr is searching for? Thanks and best regards Ralf
Re: solr replication
By grid I mean the list of your instances:

String masterUrl = "http://masterUrl/core/...";
String[] slaveUrls = {"http://slaveUrl/core/...", "http://slaveUrl/core/..."};

Then use your business logic to pick the correct one with the HTTP SolrJ facade. -- View this message in context: http://lucene.472066.n3.nabble.com/solr-replication-tp3687106p3687314.html Sent from the Solr - User mailing list archive at Nabble.com.
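One hedged sketch of that grid idea, extended with a simple round-robin over the slaves so that adding a slave means adding a URL and nothing else. Class and method names are made up here; wiring each chosen URL into a CommonsHttpSolrServer is left out.

```java
// Sketch only: the master handles indexing, the slaves take turns
// serving queries.
public class SolrGrid {
    private final String masterUrl;
    private final String[] slaveUrls;
    private int next = 0;

    public SolrGrid(String masterUrl, String[] slaveUrls) {
        this.masterUrl = masterUrl;
        this.slaveUrls = slaveUrls;
    }

    public String indexingUrl() {
        return masterUrl;
    }

    // Round-robin across the slave list; adding a new slave requires no
    // code change beyond appending its URL to slaveUrls.
    public synchronized String searchUrl() {
        String url = slaveUrls[next];
        next = (next + 1) % slaveUrls.length;
        return url;
    }
}
```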
Re: solr replication
Ok, thank you for your response. On Wed, Jan 25, 2012 at 12:24 PM, darul daru...@gmail.com wrote: By grid I mean the list of your instances: String masterUrl = "http://masterUrl/core/..."; String[] slaveUrls = {"http://slaveUrl/core/...", "http://slaveUrl/core/..."}; Then use your business logic to pick the correct one with the HTTP SolrJ facade.
Re: phrase auto-complete with suggester component
Tommy Chheng-2 wrote: Thanks, I'll try out the custom class file. Any possibility this class can be merged into Solr? It seems like expected behavior. On Tue, Jan 24, 2012 at 11:29 AM, O. Klein klein@... wrote: You might wanna read http://lucene.472066.n3.nabble.com/suggester-issues-td3262718.html#a3264740 which contains the solution to your problem. -- View this message in context: http://lucene.472066.n3.nabble.com/phrase-auto-complete-with-suggester-component-tp3685572p3685730.html Sent from the Solr - User mailing list archive at Nabble.com. -- Tommy Chheng I agree. Suggester could use some attention. Looking at the wiki there were some features planned, but not much has happened lately. -- View this message in context: http://lucene.472066.n3.nabble.com/phrase-auto-complete-with-suggester-component-tp3685572p3687495.html Sent from the Solr - User mailing list archive at Nabble.com.
Need help
I want to create one search page. The implementation is like this: I have 4 tables in the database (say profiles, clients, requirement and case) and 4 corresponding entities in the Java code. I created one JSP which has a drop-down (containing all 4 entity names), one search box and one button. How do I need to configure Solr for this, so that I can search any table by changing the drop-down? Regards Shambhu
Re: solr replication
Hi Parvin, I did something that may help you. I set up Apache (with mod_proxy and mod_proxy_balancer) as a front end and use it to distribute the requests of my application. Requests for /update or /optimize I redirect to the master (or masters) server, and search requests (/select) I redirect to the slaves. Example:

<Proxy balancer://solrclusterindex>
  BalancerMember http://127.0.0.1:8080/apache-solr-1.4.1/ disablereuse=On route=jvm1
</Proxy>
<Proxy balancer://solrclustersearch>
  BalancerMember http://127.0.0.1:8080/apache-solr-1.4.1/ disablereuse=On route=jvm1
  BalancerMember http://10.16.129.61:8080/apache-solr-1.4.1/ disablereuse=On route=jvm2
</Proxy>
ProxyPassMatch /solrcluster(.*)/update(.*)$ balancer://solrclusterindex$1/update$2
ProxyPassMatch /solrcluster(.*)/select(.*)$ balancer://solrclustersearch$1/select$2

I hope it helps you
Re: Difference between #indexed documents and #results in *:* query
Hi, No, it's not normal :) Have you tried to hit SHIFT-F5 to make sure you're not getting tricked by browser caching? Or try a slightly different query like id:* You can also visit the Schema browser page of Solr admin and check the stats on how many docs are in the index. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Solr Training - www.solrtraining.com On 25. jan. 2012, at 10:35, m0rt0n wrote: Hello, I have seen that I am getting 913 documents indexed: str name=Total Requests made to DataSource1/str str name=Total Rows Fetched913/str str name=Total Documents Skipped0/str str name=Full Dump Started2012-01-25 10:22:39/str str name=Indexing completed. Added/Updated: 913 documents. Deleted 0 documents./str str name=Committed2012-01-25 10:22:44/str str name=Optimized2012-01-25 10:22:44/str str name=Total Documents Processed913/str str name=Time taken 0:0:5.10/str ... and, when I do a search for *:* (all documents) I get 383 results result name=response numFound=383 start=0 maxScore=1.0 Is this normal? if it is not, do you know why it could be this way and what could I do to fix it? Thanks in advance! -- View this message in context: http://lucene.472066.n3.nabble.com/Difference-between-indexed-documents-and-results-in-query-tp3687217p3687217.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: solr replication
Hi Anderson, Thank you for your effort.I will try this. Hope it will solve my problem. Regards On Wed, Jan 25, 2012 at 2:27 PM, Anderson vasconcelos anderson.v...@gmail.com wrote: Hi Parvin I did something that may help you. I set up apache (with mod_proxy and mode balance) like a front-end and use this to distruted the request of my aplication. Request for /update or /optmize, i'm redirect to master (or masters) server and requests /search i redirect to slaves. Example: Proxy balancer://solrclusterindex BalancerMember http://127.0.0.1:8080/apache-solr-1.4.1/ disablereuse=On route=jvm1 /Proxy Proxy balancer://solrclustersearch BalancerMember http://127.0.0.1:8080/apache-solr-1.4.1/ disablereuse=On route=jvm1 BalancerMember http://10.16.129.61:8080/apache-solr-1.4.1/ disablereuse=On route=jvm2 /Proxy ProxyPassMatch /solrcluster(.*)/update(.*)$ balancer://solrclusterindex$1/update$2 ProxyPassMatch /solrcluster(.*)/select(.*)$ balancer://solrclustersearch$1/select$2 I hope it helps you
Re: Need help
Treat one Solr schema as a database table; one Solr core contains only one schema. So in your case you should define 4 Solr cores, each containing a schema that matches one of your database tables. /shen On Wed, Jan 25, 2012 at 1:08 PM, Shambhu Kumar ss2k...@gmail.com wrote: I want to create one search page. The implementation is like this: I have 4 tables in the database (say profiles, clients, requirement and case) and 4 corresponding entities in the Java code. I created one JSP which has a drop-down (containing all 4 entity names), one search box and one button. How do I need to configure Solr for this, so that I can search any table by changing the drop-down? Regards Shambhu
Re: Difference between #indexed documents and #results in *:* query
Thanks a lot for your answer; really appreciated. Unfortunately, I am still getting the same number of results: - I tried by refreshing the browser cache. - I tried another search by the ID:* - And went to the http://localhost:8983/solr/browse?q= ... and got the same number of results. (383 results found in 13 ms Page 1 of 1) I don't understand why it says that it is indexing 913 (see below) and it just finds 383, that makes no sense to me and I am starting to go crazy :-) Any further help appreciated. Thanks! -- View this message in context: http://lucene.472066.n3.nabble.com/Difference-between-indexed-documents-and-results-in-query-tp3687217p3687646.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Difference between #indexed documents and #results in *:* query
Do all your 913 documents contain a unique key? The uniqueKey field is "id" by default. -- Sami Siren On Wed, Jan 25, 2012 at 3:16 PM, m0rt0n rau...@gmail.com wrote: Thanks a lot for your answer; really appreciated. Unfortunately, I am still getting the same number of results: - I tried refreshing the browser cache. - I tried another search by ID:* - And went to http://localhost:8983/solr/browse?q= ... and got the same number of results. (383 results found in 13 ms, Page 1 of 1.) I don't understand why it says that it is indexing 913 (see below) and it just finds 383; that makes no sense to me and I am starting to go crazy :-) Any further help appreciated. Thanks!
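The effect behind Sami's question can be sketched outside Solr: adding a document whose uniqueKey already exists replaces the old document instead of creating a new one, so the number of rows fetched can exceed the number of documents found. This is a plain-Java illustration, not Solr code; the class and numbers are invented.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch: an index keyed on a non-unique id keeps only one document per
// key value, which is how 913 fetched rows can shrink to 383 documents.
public class UniqueKeyDemo {
    public static int docsAfterIndexing(String[] ids) {
        Set<String> index = new HashSet<>();
        for (String id : ids) {
            index.add(id); // duplicate id -> old doc replaced, count unchanged
        }
        return index.size();
    }
}
```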
Re: highlighter not supporting surround parser
I want to perform span queries using the surround parser and I want to show the results with the highlighter, but the problem is that the highlighter is not working properly with the surround query parser. Are there any plugins or updates available to do this? Hi Manyu, You can use https://issues.apache.org/jira/browse/SOLR-3060 for this.
Re: Do Hignlighting + proximity using surround query parser
I got this working the way you describe it (in the getHighlightQuery() method). The span queries were tripping it up, so I extracted the query terms and created a DisMax query from them. There'll be a loss of accuracy in the highlighting, but in my case that's better than no highlighting. Should I just go ahead and submit a patch to SOLR-2703? I think a separate jira ticket would be more appropriate. Scott, I created SOLR-3060 for this.
Query for exact part of sentence
Hi I'm using the pecl PHP class to query SOLR and was wondering how to query for a part of a sentence exactly. There are 2 data items index in SOLR 1327497476: 123 456 789 1327497521. 1234 5678 9011 However when running the query, both data items are returned as you can see below. Any idea why? Thanks! SolrObject Object ( [responseHeader] = SolrObject Object ( [status] = 0 [QTime] = 5016 [params] = SolrObject Object ( [debugQuery] = true [shards] = solr01:8983/solr,solr02:8983/solr,solr03:8983/solr [fl] = id,smsc_module,smsc_ssid,smsc_description,smsc_content,smsc_courseid,smsc_date_created,smsc_date_edited,score,metadata_stream_size,metadata_stream_source_info,metadata_stream_name,metadata_stream_content_type,last_modified,author,title,subject [sort] = smsc_date_created asc [indent] = on [start] = 0 [q] = (smsc_content:\123 456\ || smsc_description:\123 456\) (smsc_module:Intradesk) (smsc_date_created:[2011-12-25T10:29:51Z TO NOW]) (smsc_ssid:38) [distrib] = true [wt] = xml [version] = 2.2 [rows] = 55 ) ) [response] = SolrObject Object ( [numFound] = 2 [start] = 0 [docs] = Array ( [0] = SolrObject Object ( [smsc_module] = Intradesk [smsc_ssid] = 38 [id] = 1327497476 [smsc_courseid] = 0 [smsc_date_created] = 2011-12-25T10:29:51Z [smsc_date_edited] = 2011-12-25T10:29:51Z [score] = 10.028017 ) [1] = SolrObject Object ( [smsc_module] = Intradesk [smsc_ssid] = 38 [id] = 1327497521 [smsc_courseid] = 0 [smsc_date_created] = 2011-12-25T10:29:51Z [smsc_date_edited] = 2011-12-25T10:29:51Z [score] = 5.541335 ) ) ) [debug] = SolrObject Object ( [rawquerystring] = (smsc_content:\123 456\ || smsc_description:\123 456\) (smsc_module:Intradesk) (smsc_date_created:[2011-12-25T10:29:51Z TO NOW]) (smsc_ssid:38) [querystring] = (smsc_content:\123 456\ || smsc_description:\123 456\) (smsc_module:Intradesk) (smsc_date_created:[2011-12-25T10:29:51Z TO NOW]) (smsc_ssid:38) [parsedquery] = +(smsc_content:123 smsc_content:456 smsc_description:123 smsc_content:456) +smsc_module:intradesk 
+smsc_date_created:[2011-12-25T10:29:51Z TO 2012-01-25T13:33:21.098Z] +smsc_ssid:38 [parsedquery_toString] = +(smsc_content:123 smsc_content:456 smsc_description:123 smsc_content:456) +smsc_module:intradesk +smsc_date_created:[2011-12-25T10:29:51 TO 2012-01-25T13:33:21.098] +smsc_ssid:`#8;#0;#0;#0; [QParser] = LuceneQParser [timing] = SolrObject Object
Re: SpellCheck Help
You have to give us a lot more detail about exactly what you've done and what your results are. Please review: http://wiki.apache.org/solr/UsingMailingLists Best Erick On Tue, Jan 24, 2012 at 7:42 PM, vishal_asc vishal.por...@ascendum.com wrote: I have installed Solr 3.5 with Jetty and am integrating it with Magento 1.11, but it seems not to be working: my search results do not show a "Did you mean ...?" suggestion when I misspell a word. I followed all the steps necessary for Magento/Solr integration. Please help ASAP. Thanks Vishal -- View this message in context: http://lucene.472066.n3.nabble.com/SpellCheck-Help-tp3648589p3686756.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Difference between #indexed documents and #results in *:* query
BINGO!! Yep, I actually was assuming that the ID field was unique; after your response I went to my DBA and he told me that it wasn't. Then I made up a unique key by concatenating three fields, and that works. Thanks a lot for your very helpful answer! -- View this message in context: http://lucene.472066.n3.nabble.com/Difference-between-indexed-documents-and-results-in-query-tp3687217p3687970.html Sent from the Solr - User mailing list archive at Nabble.com.
Indexing Using XML Message
I have a local data store containing a host of different document types. This data store is separate from a remote Solr install, making streaming not an option. Instead I'd like to generate an XML file that contains all of the documents, including content and metadata. What would be the most appropriate way to accomplish this? I could use the Tika CLI to generate XML, but I'm not sure it would work or that it's the most efficient way to handle things. Can anyone offer some suggestions? Thanks - Tod
What is the most basic schema.xml you can have for indexing a simple database?
Is it do-able/sensible to build a schema.xml from the ground up? Say that you are feeding the results of a database query into solr containing the fields id(int), title(varchar), description(varchar), pub_date(date) and tags(varchar) What would be the simplest schema.xml that could support this structure in Solr? Fergus
Re: Problem in Accessing DIH
You need to follow the instructions here: http://wiki.apache.org/solr/DataImportHandler In particular, setting up the request handler in solrconfig.xml and creating a data-config.xml file that's referenced in the request handler. When this is done correctly, you should see the request handler you defined as a link on the dataimport.jsp page. Best Erick On Tue, Jan 24, 2012 at 5:31 AM, dsy99 ds...@rediffmail.com wrote: Dear all, I am using Solr 3.5, in which I tried to access the DIH development console with the URL mentioned below, but I am getting the message "Select handler". http://localhost:8983/solr/admin/dataimport.jsp May I know how I can select the handler, so that I will be able to display the DIH control form and command output as raw XML? Thanking you. With Regds: Divakar -- View this message in context: http://lucene.472066.n3.nabble.com/Problem-in-Accessing-DIH-tp3684667p3684667.html Sent from the Solr - User mailing list archive at Nabble.com.
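For reference, the two pieces Erick mentions look roughly like this. The handler name, file paths, JDBC details and query are examples to adapt, not taken from Divakar's setup:

```xml
<!-- solrconfig.xml: register the DataImportHandler -->
<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>

<!-- data-config.xml (the file referenced above): a minimal skeleton -->
<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/mydb" user="user" password="pass"/>
  <document>
    <entity name="item" query="select id, title from item">
      <field column="id" name="id"/>
      <field column="title" name="title"/>
    </entity>
  </document>
</dataConfig>
```

Once both are in place, the handler shows up as a selectable link on the dataimport.jsp page.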
Re: Currency field type
There's really no roadmap. If you have a big enough need you can work on this patch and submit it for someone to commit, but it looks like Greg had other priorities get in the way, so you can't count on anyone else carrying this forward. If you want to integrate the patch, you check out the source, apply the patch to the source and compile. It may not apply cleanly; see the instructions at: http://wiki.apache.org/solr/HowToContribute But this patch doesn't really look like it is doing what you want either; it's the stats component, which provides summary information. As for a normalized dollar amount, that's really just converting all prices to USD or EUR or whatever. If you also store a value for what currency the original document is in, you can display the correct currency for documents. The problem here, of course, is that the normalization is done at index time, and exchange rates change. Best Erick On Tue, Jan 24, 2012 at 5:37 AM, darul daru...@gmail.com wrote: We may need a specific field to store and search over item prices. Currency can be of different kind, EUR There is an open ticket on Jira, but I do not find a way to integrate the patch sources, and it seems to be not closed yet. Any idea of a roadmap or expected availability date for this powerful enhancement: http://wiki.apache.org/solr/MoneyFieldType Erik said an alternative may be to store it in a normalized way: http://lucene.472066.n3.nabble.com/Stats-help-needed-on-price-field-using-different-currencies-td2978082.html#a2997876 Any ideas ? Thanks, Jul -- View this message in context: http://lucene.472066.n3.nabble.com/Currency-field-type-tp3684682p3684682.html Sent from the Solr - User mailing list archive at Nabble.com.
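A sketch of the normalized-amount approach Erick describes: convert every price to one base currency at index time, and store the original amount and currency for display. The class name, field name and rates are invented for illustration; real code would refresh rates and re-index when they move, which is exactly the drawback Erick points out.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: normalize prices to a base currency (USD here) before indexing.
public class PriceNormalizer {
    private final Map<String, Double> ratesToUsd = new HashMap<>();

    public PriceNormalizer() {
        ratesToUsd.put("USD", 1.0);
        ratesToUsd.put("EUR", 1.5); // made-up rate, frozen at index time
    }

    // Value that would go into a sortable/searchable "price_usd" field;
    // the original amount and currency stay in stored fields for display.
    public double normalize(double amount, String currency) {
        Double rate = ratesToUsd.get(currency);
        if (rate == null) {
            throw new IllegalArgumentException("no rate for " + currency);
        }
        return amount * rate;
    }
}
```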
Re: full import is not working and still not showing any errors
please review: http://wiki.apache.org/solr/UsingMailingLists I infer you're using DIH, but you've never really stated that. What page are you refreshing? What commands have you issued? Have you looked at dataimport.jsp (the DIH debugging page)? Best Erick On Tue, Jan 24, 2012 at 6:01 AM, scabra4 scab...@yahoo.com wrote: hi all, anyone can help me with this please. i am trying to do a full import, i've done everything correctly, now when i try the full import an xml page displays showing the following and i stays like this now matter how i refresh the page: This XML file does not appear to have any style information associated with it. The document tree is shown below. response lst name=responseHeader int name=status0/int int name=QTime0/int /lst lst name=initArgs lst name=defaults str name=configC:\solr\conf\data-config.xml/str /lst /lst str name=commandfull-import/str str name=statusbusy/str str name=importResponseA command is still running.../str lst name=statusMessages str name=Time Elapsed0:5:8.925/str str name=Total Requests made to DataSource1/str str name=Total Rows Fetched0/str str name=Total Documents Processed0/str str name=Total Documents Skipped0/str str name=Full Dump Started2012-01-24 16:29:31/str /lst str name=WARNINGThis response format is experimental. It is likely to change in the future./str/response -- View this message in context: http://lucene.472066.n3.nabble.com/full-import-is-not-working-and-still-not-showing-any-errors-tp3684751p3684751.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: What is the most basic schema.xml you can have for indexing a simple database?
Hi Fergus, The schema.xml declares the fields as well as the analyzers/tokenizers that the application demands. The easiest way is to modify the schema.xml file which is delivered with apache_solr/example/solr/conf. In case you are looking to set up Solr in front of a database with minimal manipulation of DB data, you can check it here: http://www.params.me/2011/03/configure-apache-solr-14-with-mysql.html. I am using this setup in one of my applications in production. -param On 1/25/12 11:10 AM, Fergus McDowall fergusmcdow...@gmail.com wrote: Is it do-able/sensible to build a schema.xml from the ground up? Say that you are feeding the results of a database query into Solr containing the fields id(int), title(varchar), description(varchar), pub_date(date) and tags(varchar). What would be the simplest schema.xml that could support this structure in Solr? Fergus
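To make that concrete, a minimal schema.xml for the five fields Fergus lists might look like this. The type choices are one reasonable mapping for Solr 3.x, not the only one:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<schema name="minimal" version="1.4">
  <types>
    <fieldType name="int"    class="solr.TrieIntField"  omitNorms="true"/>
    <fieldType name="date"   class="solr.TrieDateField" omitNorms="true"/>
    <fieldType name="text" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>
  </types>
  <fields>
    <field name="id"          type="int"  indexed="true" stored="true" required="true"/>
    <field name="title"       type="text" indexed="true" stored="true"/>
    <field name="description" type="text" indexed="true" stored="true"/>
    <field name="pub_date"    type="date" indexed="true" stored="true"/>
    <field name="tags"        type="text" indexed="true" stored="true" multiValued="true"/>
  </fields>
  <uniqueKey>id</uniqueKey>
  <defaultSearchField>title</defaultSearchField>
</schema>
```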
Re: Not getting the expected search results
First thing is that there's a helpful page for debugging this called dataimport.jsp; see: http://wiki.apache.org/solr/DataImportHandler Second, and this is just a guess: what is the uniqueKey defined in your schema? When Solr adds documents, a document with the same value in the field defined in uniqueKey as a document already in the index causes the old doc to be replaced by the new doc. So it's possible that your select is replacing the document you want in the first example, but not in the second. Look on the admin/stats page. There are two numbers reported here, numDocs and maxDoc. The difference between these is the number of documents that have been deleted from your index. The replacement I outlined above is a delete followed by an add, so if you start with a clean index, do your first import, and these numbers are different, then you are having documents replaced... Hope that helps Erick On Tue, Jan 24, 2012 at 7:02 AM, m0rt0n rau...@gmail.com wrote: Hello, I am a newbie in this Solr world and I am getting surprised because I try to do searches, both with the browser interface and by using a Java client, and the expected results do not appear. The issue is: 1) I have set up an entity called via in my data-config.xml with 5 fields.
I do the full-import and it indexes 1.5M records: entity name=via query=select TVIA, NVIAC, CMUM, CVIA, CPRO from INE_VIAS field column=TVIA name=TVIA / field column=NVIAC name=NVIAC / field column=CMUM name=CMUM / field column=CVIA name=CVIA / field column=CPRO name=CPRO / /entity 2) These 5 fields are mapped in the schema.xml, this way: field name=TVIA type=text_general indexed=true stored=true / field name=NVIAC type=text_general indexed=true stored=true / field name=CMUM type=text_general indexed=true stored=true / field name=CVIA type=string indexed=true stored=true / field name=CPRO type=int indexed=true stored=true / 3) I try to do a search for Alcala street in Madrid: NVIAC:ALCALA AND CPRO:28 AND CMUM:079 But it does just get two results (none of them, the desired one): docstr name=CMUM079/strint name=CPRO28/intstr name=CVIA45363/strstr name=NVIACALCALA GAZULES/strstr name=TVIACALLE/str/doc docstr name=CMUM079/strint name=CPRO28/intstr name=CVIA08116/strstr name=NVIACALCALA GUADAIRA/strstr name=TVIACALLE/str/doc 4) When I do the indexing by delimiting the entity search: entity name=via query=select TVIA, NVIAC, CMUM, CVIA, CPRO from INE_VIAS WHERE NVIAC LIKE '%ALCALA%' The full import does 913 documents and I do the same search, but this time I get the desired result: docstr name=CMUM079/strint name=CPRO28/intstr name=CVIA00132/strstr name=NVIACALCALA/strstr name=TVIACALLE/str/doc Anyone can help me with that? I don't know why it does not work as expected when I do the full-import of the whole lot of streets. Thanks a lot in advance. -- View this message in context: http://lucene.472066.n3.nabble.com/Not-getting-the-expected-search-results-tp3684974p3684974.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: SolrCell maximum file size
Mostly it depends on your container settings; quite often that's where the limits are. I don't think Solr imposes any restrictions. What size are we talking about anyway? There are implicit issues with how much memory parsing the file requires, but you can allocate lots of memory to the JVM to handle that. Best Erick On Tue, Jan 24, 2012 at 10:24 AM, Augusto Camarotti augu...@prpb.mpf.gov.br wrote: Hi everybody, Does anyone know if there is a maximum file size that can be uploaded to the ExtractingRequestHandler via HTTP request? Thanks in advance, Augusto Camarotti
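One Solr-side knob worth knowing about, in addition to the container limits Erick mentions, is the multipart upload cap in solrconfig.xml. The value below is only an example; your servlet container may still impose its own, lower limit:

```xml
<!-- solrconfig.xml: Solr's multipart upload limit (value in KB) -->
<requestDispatcher handleSelect="true">
  <requestParsers enableRemoteStreaming="false"
                  multipartUploadLimitInKB="2048000"/>
</requestDispatcher>
```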
Re: Indexing failover and replication
No, there are no good ways to have a single slave know about two masters and just use the right one. It sounds like you've got each machine being both a master and a slave? This is not supported. What you probably want to do is either set up a repeater, or just index to the two masters and manually switch back to the primary if the primary goes down, having all replication happen from the master. Best Erick On Tue, Jan 24, 2012 at 11:36 AM, Anderson vasconcelos anderson.v...@gmail.com wrote: Hi I'm doing a test with replication using Solr 1.4.1. I configured two servers (server1 and server2) as master/slave to synchronize both. I put Apache on the front side, and we index sometimes on server1 and sometimes on server2. I realized that both index servers are now confused. In the Solr data folder, many index folders were created with the timestamp of synchronization (example: index.20120124041340) with some segments inside. I thought it was possible to index on two master servers and then synchronize both using replication. Is it really possible to do this with the replication mechanism? If it is possible, what have I done wrong? I need to have more than one node for indexing to guarantee a failover feature for indexing. Is multi-master the best way to guarantee failover for indexing? Thanks
RE: HTMLStripCharFilterFactory not working in Solr4?
Hi Mike, Yonik committed a fix to Solr trunk - your test on LUCENE-3721 succeeds for me now. (On Solr trunk, *all* CharFilters have been non-functional since LUCENE-3396 was committed in r1175297 on 25 Sept 2011, until Yonik's fix today in r1235810; Solr 3.x was not affected - CharFilters have been working there all along.) Steve -Original Message- From: Mike Hugo [mailto:m...@piragua.com] Sent: Tuesday, January 24, 2012 3:56 PM To: solr-user@lucene.apache.org Subject: Re: HTMLStripCharFilterFactory not working in Solr4? Thanks for the responses, everyone. Steve, the test method you provided also works for me. However, when I try a more end-to-end test with the HTMLStripCharFilterFactory configured for a field, I am still having the same problem. I attached a failing unit test and configuration to the following issue in JIRA: https://issues.apache.org/jira/browse/LUCENE-3721 I appreciate all the prompt responses! Looking forward to finding the root cause of this guy :) If there's something I'm doing incorrectly in the configuration, please let me know! Mike On Tue, Jan 24, 2012 at 1:57 PM, Steven A Rowe sar...@syr.edu wrote: Hi Mike, When I add the following test to TestHTMLStripCharFilterFactory.java on Solr trunk, it passes:

public void testNumericCharacterEntities() throws Exception {
  final String text = "Bose&#174; &#8482;"; // |Bose® ™|
  HTMLStripCharFilterFactory htmlStripFactory = new HTMLStripCharFilterFactory();
  htmlStripFactory.init(Collections.<String,String>emptyMap());
  CharStream charStream = htmlStripFactory.create(CharReader.get(new StringReader(text)));
  StandardTokenizerFactory stdTokFactory = new StandardTokenizerFactory();
  stdTokFactory.init(DEFAULT_VERSION_PARAM);
  Tokenizer stream = stdTokFactory.create(charStream);
  assertTokenStreamContents(stream, new String[] { "Bose" });
}

What's happening: First, htmlStripFactory converts &#174; to ® and &#8482; to ™.
Then stdTokFactory declines to tokenize ® and ™, because they belong to the Unicode general category Symbol, Other, and so are not included in any of the output tokens. StandardTokenizer uses the Word Break rules from UAX#29 http://unicode.org/reports/tr29/ to find token boundaries, and then outputs only alphanumeric tokens. See the JFlex grammar for details: http://svn.apache.org/viewvc/lucene/dev/trunk/modules/analysis/common/src/java/org/apache/lucene/analysis/standard/StandardTokenizerImpl.jflex?view=markup . The behavior you're seeing is not consistent with the above test. Steve -Original Message- From: Mike Hugo [mailto:m...@piragua.com] Sent: Tuesday, January 24, 2012 1:34 PM To: solr-user@lucene.apache.org Subject: HTMLStripCharFilterFactory not working in Solr4? We recently updated to the latest build of Solr4 and everything is working really well so far! There is one case that is not working the same way it was in Solr 3.4 - we strip out certain HTML constructs (like trademark and registered, for example) in a field as defined below - it was working in Solr 3.4 with the configuration shown here, but is not working the same way in Solr4.
The label field is defined as type="text_general":

<field name="label" type="text_general" indexed="true" stored="false" required="false" multiValued="true"/>

Here's the type definition for the text_general field:

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

In Solr 3.4, that configuration was completely stripping html constructs out of the indexed field, which is exactly what we wanted. If, for example, we then do a facet on the label field, like in the test below, we're getting some terms in the response that we would not like to be there.

// test case (groovy)
void specialHtmlConstructsGetStripped() {
    SolrInputDocument inputDocument = new SolrInputDocument()
    inputDocument.addField('label', 'Bose&#174; &#8482;')
    solrServer.add(inputDocument)
    solrServer.commit()
    QueryResponse
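The two steps Steve describes can be reproduced outside Solr: the numeric character entities are decoded first, and the decoded symbols (Unicode general category So) then fall outside the alphanumeric tokens that StandardTokenizer emits. A stdlib approximation (Python; `re.findall(r"\w+", ...)` stands in for the tokenizer):

```python
import html
import re

text = "Bose&#174; &#8482;"
decoded = html.unescape(text)       # entity decoding, like HTMLStripCharFilter
tokens = re.findall(r"\w+", decoded)  # ® and ™ are not word characters
print(decoded, tokens)              # Bose® ™ ['Bose']
```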
Re: HTMLStripCharFilterFactory not working in Solr4?
Thanks guys! I'll grab the latest build from the solr4 jenkins server when those commits get picked up and try it out. Thanks for the quick turnaround! Mike On Wed, Jan 25, 2012 at 11:01 AM, Steven A Rowe sar...@syr.edu wrote: Hi Mike, Yonik committed a fix to Solr trunk - your test on LUCENE-3721 succeeds for me now. (On Solr trunk, *all* CharFilters have been non-functional since LUCENE-3396 was committed in r1175297 on 25 Sept 2011, until Yonik's fix today in r1235810; Solr 3.x was not affected - CharFilters have been working there all along.) Steve -Original Message- From: Mike Hugo [mailto:m...@piragua.com] Sent: Tuesday, January 24, 2012 3:56 PM To: solr-user@lucene.apache.org Subject: Re: HTMLStripCharFilterFactory not working in Solr4? Thanks for the responses everyone. Steve, the test method you provided also works for me. However, when I try a more end to end test with the HTMLStripCharFilterFactory configured for a field I am still having the same problem. I attached a failing unit test and configuration to the following issue in JIRA: https://issues.apache.org/jira/browse/LUCENE-3721 I appreciate all the prompt responses! Looking forward to finding the root cause of this guy :) If there's something I'm doing incorrectly in the configuration, please let me know! 
Mike On Tue, Jan 24, 2012 at 1:57 PM, Steven A Rowe sar...@syr.edu wrote: Hi Mike, When I add the following test to TestHTMLStripCharFilterFactory.java on Solr trunk, it passes:

public void testNumericCharacterEntities() throws Exception {
  final String text = "Bose&#174; &#8482;"; // |Bose® ™|
  HTMLStripCharFilterFactory htmlStripFactory = new HTMLStripCharFilterFactory();
  htmlStripFactory.init(Collections.<String,String>emptyMap());
  CharStream charStream = htmlStripFactory.create(CharReader.get(new StringReader(text)));
  StandardTokenizerFactory stdTokFactory = new StandardTokenizerFactory();
  stdTokFactory.init(DEFAULT_VERSION_PARAM);
  Tokenizer stream = stdTokFactory.create(charStream);
  assertTokenStreamContents(stream, new String[] { "Bose" });
}

What's happening: First, htmlStripFactory converts &#174; to ® and &#8482; to ™. Then stdTokFactory declines to tokenize ® and ™, because they belong to the Unicode general category Symbol, Other, and so are not included in any of the output tokens. StandardTokenizer uses the Word Break rules from UAX#29 http://unicode.org/reports/tr29/ to find token boundaries, and then outputs only alphanumeric tokens. See the JFlex grammar for details: http://svn.apache.org/viewvc/lucene/dev/trunk/modules/analysis/common/src/java/org/apache/lucene/analysis/standard/StandardTokenizerImpl.jflex?view=markup . The behavior you're seeing is not consistent with the above test. Steve -Original Message- From: Mike Hugo [mailto:m...@piragua.com] Sent: Tuesday, January 24, 2012 1:34 PM To: solr-user@lucene.apache.org Subject: HTMLStripCharFilterFactory not working in Solr4? We recently updated to the latest build of Solr4 and everything is working really well so far! There is one case that is not working the same way it was in Solr 3.4 - we strip out certain HTML constructs (like trademark and registered, for example) in a field as defined below - it was working in Solr 3.4 with the configuration shown here, but is not working the same way in Solr4.
The label field is defined as type="text_general":

<field name="label" type="text_general" indexed="true" stored="false" required="false" multiValued="true"/>

Here's the type definition for the text_general field:

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

In Solr 3.4, that configuration was completely stripping html constructs out of the indexed field, which is exactly what we wanted. If, for example, we then do a facet on the label field, like in the test below, we're getting some terms
Using multiple DirectSolrSpellcheckers for a query
Hi, We are trying to use the DirectSolrSpellChecker to get corrections for mis-spelled query terms directly from fields in the Solr index. However, we need to use multiple fields for spellchecking a query. It looks like you can only use one spellchecker per request, and so the workaround for this is to create a copy field from the fields required for spell correction? We'd like to avoid this because we allow users to perform different kinds of queries on different sets of fields, and so to provide meaningful corrections we'd have to create multiple copy fields - one for each query type. Is there any reason why Solr doesn't support using multiple spellcheckers for a query? Is it because of performance overhead? Thanks, Nalini
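The copy-field workaround mentioned above would look roughly like this in schema.xml (the source field names here are assumptions for illustration):

```xml
<!-- schema.xml: funnel several source fields into one spellcheck field -->
<field name="spell" type="text_general" indexed="true" stored="false" multiValued="true"/>
<copyField source="title" dest="spell"/>
<copyField source="author" dest="spell"/>
<copyField source="description" dest="spell"/>
```

The spellcheck component would then be pointed at the single "spell" field; a separate copy field per query type would repeat this pattern.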
Re: Using SOLR Autocomplete for addresses (i.e. multiple terms)
Hi, I don't think that the suggester can output multiple fields. You would have to encode your data in a special way with separators. Using the separate Solr core approach, you may return whatever fields you choose to the suggest Ajax component. I've written up a blog post and uploaded an example to GitHub. See http://www.cominvent.com/2012/01/25/super-flexible-autocomplete-with-solr/ -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Solr Training - www.solrtraining.com On 3. jan. 2012, at 20:41, Dave wrote: I've got another question for anyone that might have some insight - how do you get all of your indexed information along with the suggestions? i.e. if each suggestion has an ID# associated with it, do I have to then query for that ID#, or is there some way or specifying a field list in the URL to the suggester? Thanks! Dave On Tue, Jan 3, 2012 at 9:41 AM, Dave dla...@gmail.com wrote: Hi Jan, Yes, I just saw the answer. I've implemented that, and it's working as expected. I do have Suggest running on its own core, separate from my standard search handler. I think, however, that the custom QueryConverter that was linked to is now too restrictive. For example, it works perfectly when someone enters brooklyn, n, but if they start by entering ny or new york it doesn't return anything. I think what you're talking about, suggesting from whole input and individual tokens is the way to go. Is there anything you can point me to as a starting point? I think I've got the basic setup, but I'm not quite comfortable enough with SOLR and the SOLR architecture yet (honestly I've only been using it for about 2 weeks now). Thanks for the help! Dave On Tue, Jan 3, 2012 at 8:24 AM, Jan Høydahl jan@cominvent.com wrote: Hi, As you see, you've got an answer at StackOverflow already with a proposed solution to implement your own QueryConverter. Another way is to create a Solr core solely for Suggest, and tune it exactly the way you like. 
Then you can have it suggest from the whole input as well as individual tokens and weigh these as you choose, as well as implement phonetic normalization and other useful tricks. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Solr Training - www.solrtraining.com On 3. jan. 2012, at 00:52, Dave wrote: Hi, I'm reposting my StackOverflow question to this thread as I'm not getting much of a response there. Thank you for any assistance you can provide! http://stackoverflow.com/questions/8705600/using-solr-autocomplete-for-addresses I'm new to SOLR, but I've got it up and running, indexing data via the DIH, and properly returning results for queries. I'm trying to set up another core to run Suggester, in order to autocomplete geographical locations. We have a web application that needs to take a city, state/region, country input. We'd like to do this in a single entry box. Here are some examples: Brooklyn, New York, United States of America Philadelphia, Pennsylvania, United States of America Barcelona, Catalunya, Spain Assume for now that every location around the world can be split into this 3-form input. I've set up my DIH to create a TemplateTransformer field that combines the 4 tables (city, state and country are all independent tables connected to each other by a master places table) into a field called fullplacename:

<field column="fullplacename" template="${city_join.plainname}, ${region_join.plainname}, ${country_join.plainname}"/>

I've defined a text_auto field type in schema.xml:

<fieldType class="solr.TextField" name="text_auto">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

and have defined these two fields as well:

<field name="name_autocomplete" type="text_auto" indexed="true" stored="true" multiValued="true" />
<copyField source="fullplacename" dest="name_autocomplete" />

Now, here's my problem. This works fine for the first term, i.e.
if I type "brooklyn" I get the results I'd expect, using this URL to query: http://localhost:8983/solr/places/suggest?q=brooklyn However, as soon as I put a comma and/or a space in there, it breaks the input up into 2 suggestions, and I get a suggestion for each: http://localhost:8983/solr/places/suggest?q=brooklyn%2C%20ny gives me a suggestion for "brooklyn" and a suggestion for "ny" instead of a suggestion that matches "brooklyn, ny". I've tried every solution I can find via Google and haven't had any luck. Is there something simple that I've missed, or is this the wrong approach? Just in case, here's the searchComponent and requestHandler definition:

<requestHandler name="/suggest" class="org.apache.solr.handler.component.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">suggest</str>
    <str name="spellcheck.count">10</str>
  </lst>
  <arr
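The behavior Dave wants - treating the whole input as one lowercased string (KeywordTokenizer-style) rather than splitting it into tokens - can be illustrated with a toy prefix matcher (Python sketch; the place data is copied from the examples above, and this is only an illustration of the matching semantics, not of Solr's Suggester internals):

```python
places = [
    "Brooklyn, New York, United States of America",
    "Philadelphia, Pennsylvania, United States of America",
    "Barcelona, Catalunya, Spain",
]

def suggest(prefix):
    # Match the whole lowercased string, commas and spaces included,
    # instead of breaking the query into separate tokens.
    p = prefix.lower()
    return [s for s in places if s.lower().startswith(p)]

print(suggest("brooklyn, n"))  # the full Brooklyn entry, not two fragments
```

With per-token matching, "brooklyn, ny" becomes two independent lookups ("brooklyn" and "ny"), which is exactly the splitting described in the message.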
Re: Cluster Resizing question
Jamie, depending on how quickly you need this, it may be better to follow SolrCloud development because cluster resizing will work differently there. Otis Performance Monitoring SaaS for Solr - http://sematext.com/spm/solr-performance-monitoring/index.html From: Jamie Johnson jej2...@gmail.com To: solr-user@lucene.apache.org Sent: Wednesday, January 25, 2012 12:03 PM Subject: Cluster Resizing question Is this the JIRA that I should be tracking for resizing a cluster? https://issues.apache.org/jira/browse/SOLR-2593 If not can someone point me to the appropriate location. Also is there a rough timeline for when this will be available?
Re: Indexing Using XML Message
So you can't even communicate with the remote Solr process by HTTP? Because if you can, SolrJ would work. Otherwise, you're stuck with creating a bunch of Solr-style XML documents, they have a simple format. See the example/exampleDocs directory in the standard distribution. You'll have to parse the separate document types and put your required data into the Solr XML format... But I really don't understand why you need to. A Solr installation that you can't get to via http is pretty useless, although I suppose there can be security setups that preclude this. Assuming you can get there via http, consider a SolrJ program combined with Tika to parse the docs you have in all these formats and send them to Solr via SolrJ... Best Erick On Wed, Jan 25, 2012 at 7:41 AM, Tod listac...@gmail.com wrote: I have a local data store containing a host of different document types. This data store is separate from a remote Solr install making streaming not an option. Instead I'd like to generate an XML file that contains all of the documents including content and metadata. What would be the most appropriate way to accomplish this? I could use the Tika CLI to generate XML but I'm not sure it would work or that its the most efficient way to handle things. Can anyone offer some suggestions? Thanks - Tod
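Generating the Solr-style add/doc XML Erick mentions can be done with nothing but the standard library; a minimal sketch (field names are illustrative, one record per add for simplicity):

```python
import xml.etree.ElementTree as ET

def to_solr_xml(record):
    # Build <add><doc><field name="...">value</field>...</doc></add>
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in record.items():
        field = ET.SubElement(doc, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")

print(to_solr_xml({"id": "1", "title": "Example document"}))
```

The resulting file can be posted to Solr's update handler (or fed via post.jar from the example distribution) once HTTP access is available.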
Re: What is the most basic schema.xml you can have for indexing a simple database?
Hi Param Yes, refactoring the various example schema.xml's is what I have been doing up to now. The end result is usually quite verbose, with a lot of redundancy. What is the most compact possible schema.xml? Thanks for the link! F On 25. jan. 2012, at 17:31, Sethi, Parampreet parampreet.se...@teamaol.com wrote: Hi Fergus, The schema.xml has declarations of fields as well as analyzers/tokenizers, which are required as per the application's demands. The easiest way is to modify the schema.xml file which is delivered with apache_solr/example/solr/conf. In case you are looking to set up Solr in front of a database with minimal manipulation of DB data, you can check it here: http://www.params.me/2011/03/configure-apache-solr-14-with-mysql.html. I am using this setup in one of my applications in production. -param On 1/25/12 11:10 AM, Fergus McDowall fergusmcdow...@gmail.com wrote: Is it do-able/sensible to build a schema.xml from the ground up? Say that you are feeding the results of a database query into Solr containing the fields id(int), title(varchar), description(varchar), pub_date(date) and tags(varchar). What would be the simplest schema.xml that could support this structure in Solr? Fergus
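For the five columns in the original question, a deliberately minimal schema.xml might look like the sketch below (untested; type and attribute choices are assumptions modeled on the stock example schema, so treat it as a starting point rather than a canonical answer):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<schema name="minimal" version="1.4">
  <types>
    <fieldType name="int" class="solr.TrieIntField" precisionStep="0" omitNorms="true"/>
    <fieldType name="date" class="solr.TrieDateField" precisionStep="0" omitNorms="true"/>
    <fieldType name="text" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>
  </types>
  <fields>
    <field name="id" type="int" indexed="true" stored="true" required="true"/>
    <field name="title" type="text" indexed="true" stored="true"/>
    <field name="description" type="text" indexed="true" stored="true"/>
    <field name="pub_date" type="date" indexed="true" stored="true"/>
    <field name="tags" type="text" indexed="true" stored="true" multiValued="true"/>
  </fields>
  <uniqueKey>id</uniqueKey>
</schema>
```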
Re: Cluster Resizing question
Thanks Otis. I have been following the SolrCloud development, but I was wondering specifically about elastically expanding the cloud by adding shards. I'm following the distributed indexing JIRA, but I'm having difficulty finding a JIRA which specifically references the issues with elasticity. Are you aware of one? On Wed, Jan 25, 2012 at 1:10 PM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote: Jamie, depending on how quickly you need this, it may be better to follow SolrCloud development because cluster resizing will work differently there. Otis Performance Monitoring SaaS for Solr - http://sematext.com/spm/solr-performance-monitoring/index.html From: Jamie Johnson jej2...@gmail.com To: solr-user@lucene.apache.org Sent: Wednesday, January 25, 2012 12:03 PM Subject: Cluster Resizing question Is this the JIRA that I should be tracking for resizing a cluster? https://issues.apache.org/jira/browse/SOLR-2593 If not can someone point me to the appropriate location. Also is there a rough timeline for when this will be available?
Re: Indexing failover and replication
Thanks for the reply, Erick. I will handle the replication to both masters manually. Thanks 2012/1/25, Erick Erickson erickerick...@gmail.com: No, there are no good ways to have a single slave know about two masters and just use the right one. It sounds like you've got each machine being both a master and a slave? This is not supported. What you probably want to do is either set up a repeater, or just index to the two masters and manually switch back to the primary if the primary goes down, having all replication happen from the master. Best Erick On Tue, Jan 24, 2012 at 11:36 AM, Anderson vasconcelos anderson.v...@gmail.com wrote: Hi I'm doing a test with replication using Solr 1.4.1. I configured two servers (server1 and server2) as master/slave to synchronize both. I put Apache on the front side, and we index sometimes on server1 and sometimes on server2. I realized that both index servers are now confused. In the Solr data folder, many index folders were created with the timestamp of synchronization (example: index.20120124041340) with some segments inside. I thought it was possible to index on two master servers and then synchronize both using replication. Is it really possible to do this with the replication mechanism? If it is possible, what have I done wrong? I need to have more than one node for indexing to guarantee a failover feature for indexing. Is multi-master the best way to guarantee failover for indexing? Thanks
Re: What is the most basic schema.xml you can have for indexing a simple database?
Fergus: I have to ask what's driving the push for compactness? General tidiness (of which I actually approve) or something else? What is the redundancy you're seeing? Just the fact that some fieldTypes will contain *almost* the same set of analyzers? Posting your schema and asking "can we make this smaller?" would make this a much easier question to answer, especially if you added some indications of what parts you were dissatisfied with. Best Erick On Wed, Jan 25, 2012 at 10:21 AM, Fergus McDowall fergusmcdow...@gmail.com wrote: Hi Param Yes, refactoring the various example schema.xml's is what I have been doing up to now. The end result is usually quite verbose, with a lot of redundancy. What is the most compact possible schema.xml? Thanks for the link! F On 25. jan. 2012, at 17:31, Sethi, Parampreet parampreet.se...@teamaol.com wrote: Hi Fergus, The schema.xml has declarations of fields as well as analyzers/tokenizers, which are required as per the application's demands. The easiest way is to modify the schema.xml file which is delivered with apache_solr/example/solr/conf. In case you are looking to set up Solr in front of a database with minimal manipulation of DB data, you can check it here: http://www.params.me/2011/03/configure-apache-solr-14-with-mysql.html. I am using this setup in one of my applications in production. -param On 1/25/12 11:10 AM, Fergus McDowall fergusmcdow...@gmail.com wrote: Is it do-able/sensible to build a schema.xml from the ground up? Say that you are feeding the results of a database query into Solr containing the fields id(int), title(varchar), description(varchar), pub_date(date) and tags(varchar). What would be the simplest schema.xml that could support this structure in Solr? Fergus
Re: full import is not working and still not showing any errors
Erick, Thanks for your input, but I've solved the problem, which was caused by the JDBC driver. This is my first time using Solr, and I am doing some searching over the internet just to get familiar with it and see how flexible it is. Do you know whether I can specify complex search, filtration and ranking rules in Solr? Regards, -- View this message in context: http://lucene.472066.n3.nabble.com/full-import-is-not-working-and-still-not-showing-any-errors-tp3684751p3689042.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: JSON response truncated
Two things: 1) I suspect it's your servlet container rather than Solr, since your JSON isn't well formatted. I have no clue where to set that up, but that's where I'd look. 2) A side note. You may run into the default of 10,000 tokens that are indexed; see maxFieldLength in solrconfig.xml. This is NOT what your current problem is, since if you exceed this limit you should still get well-formatted XML. But if you're sending large documents back and forth you might see truncated *fields*. Best Erick On Wed, Jan 25, 2012 at 1:18 PM, Sean Adams-Hiett s...@webgeeksforhire.com wrote: Summary of Issue: When specifying output as JSON, I get a truncated response. Details: The JSON output I get is truncated, causing errors for any parser that requires well-formed JSON. I have tried spot checking at a dozen different records by adjusting the start= attribute. I am using Solr 3.5 running as a Tomcat webapp on a portable hard drive. When getting the response as XML, it appears to work fine. I have provided some examples of the query I am using, as well as JSON and XML responses below. I am definitely new to working directly with Solr, although I have used it via Drupal for years and I have a pretty solid understanding of how it works at a high level. My best guess is that there is some setting that I am not aware of in schema.xml or solrconfig.xml that is causing this outcome. Any help in figuring this out would be greatly appreciated. Example query: http://localhost:8080/solr/rolfe/select?indent=on&version=2.2&q=*%3A*&fq=&start=0&rows=10&fl=*%2Cscore&qt=&wt=json&explainOther=&hl.fl= Example JSON response: { responseHeader:{ status:0, QTime:1}, response:{numFound:43678,start:0,maxScore:1.0,docs:[ { contents:idlotoer patriota department of the rolfe arrowlocal itemsrobi anderson of denver, colo., ur-rived in plover the latter part of tho week for a short visit with relatives and iri'-.'idst ; owen returned monday from hampton, iowa.
mr owen reports that .mrs owen w.is successfully operated upon in the hospital there, ;md is now w.'ll on the road to recovery.real estate loans-wo are quoting low rotes on real estate ioudb. if you are expecting to make u loan on your farm this year, it will pay you to see us.wo solicit your banking uusl--ness on the basis of prompt, efficient service to you.peoples saving banki5st. 1883 the community dank:~:~:~m~:~:~x\xkk-:-m~:~®x\¯ 11the variety of our win-ned goods should appeal lo œ i }ou. especially at this sea- ||[ son ('aimed vegetables, canned fruits, meals, soups œ and so on. make your housework lighter during œ this season by being a con- œ slant \\isilor lo our canned goods department| saturday specials19c 19c 19c 19c 19c 19ci t11 i5:i!vlib. of pollock dakingpowder for ... 1 cnn white seal lllnc-kitaspbenlea for \\i i.h oysters forsaturday special þ' cans hominy forsaturday special i pkgs. com starch.i willi spoon) i large can plm applesaturday spi-riultwo kxtk\\ specials white (jrtipcse\\tra special pem'lii'k-kxtra .special15c 15cfred ehler's| the right place to tradem. i helvlg spent last sunday in hampton with his daughter, mikk ivis, who is in a hospital fn that city, recovering from the effects of ® recent operation. martin reports that she is getting rlong nicely and will probably be home some, time the latter part of the weekmr and mrs. f j sarhv were holfi visitors last saturday and sundaythe m y ¯. club met with mrs l. n. moody last thursday afternoonmr and mrs. chas. england, mid mrs. england's father. mr brle.kson. of albert city, mr. and mrs. enoch erlckson and children of marathon, and mr. and mrs. a b. cobbs of rolfe spent sunday at the a. w. hess home in tjii8 city.jack (j ton on was recently thrown from a horse and suffered a broken arm.if you want to buy a ®ood corn planter, buy a. \cbbc\ and pet the rest. see j. w. mangun, plover, iowa.p. j. 
nacke has purchased a buick touring car, and is now busy learning to operate the same.miss freda gcmbler, daughter of pred oembler, was taken suddenly ill while in school one day last week. a physician was called and after examination pronounced it a case of scarlet fevermr sherlock of bmmotsburg was a business visitor here tuesday,1 h pollock has been making an improvement on bis farm residence by tho addition of a largo porch.ii .1. watts of des molncs spent sunday at the home of bis brother, chas. 15. watts.p 11. henderson has sold the building now occupied by the harness shop to geo. jcffriub.a. j eggspuehler has rctirfd from the management of the drury store mid is upending his time on his farm south of townif von want a gang plow that has no side draft, buy an \oliver\ of j w \\!\\vr:!t\\, plover. iowa.m-v. iv. c. a. n'otksthe y w c a of plover lias been þliihe active of late, and the meetings held recently have enjoyed an excellent attendance. much interest has been shown in the work. it is
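The truncation Sean describes is easy to confirm mechanically: a truncated body fails `json.loads`, while a complete response parses. A small stdlib check (the strings here are made-up stand-ins for a real response body):

```python
import json

complete = '{"responseHeader": {"status": 0, "QTime": 1}}'
truncated = complete[:-5]  # chop the tail, as the container appears to do

json.loads(complete)  # parses fine
try:
    json.loads(truncated)
    print("parsed")
except ValueError as e:
    print("truncated JSON:", e)
```

Running a script like this against saved responses makes it easy to tell whether the truncation happens at a consistent byte offset, which would point at a buffer or size limit in the container.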
WARNING: Unable to read: dataimport.properties DIH issue
I have tried to search for my specific problem but have not found a solution. I have also read the wiki on the DIH and seem to have everything set up right, but my query still fails. Thank you for your help. I am running Solr 3.1 with Tomcat 6.0 on Windows Server 2003 R2 and SQL Server 2008. I have sqljdbc4.jar sitting in C:\Program Files\Apache Software Foundation\Tomcat 6.0\lib

My solrconfig.xml:

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">db-data-config.xml</str>
  </lst>
</requestHandler>

My db-data-config.xml:

<dataConfig>
  <dataSource driver="com.microsoft.sqlserver.jdbc.SQLServerDriver" url="://localhost:1433;DatabaseName=KnowledgeBase_DM" user="user" password="password" />
  <document>
    <entity dataSource="ds1" name="Titles" query="SELECT mrID, mrTitle from KnowledgeBase_DM.dbo.AskMe_Data">
      <field column="mrID" name="id" />
      <field column="mrTitle" name="title" />
      <entity name="Desc" query="select meDescription from KnowledgeBase_DM.dbo.AskMe_Data">
        <field column="meDescription" name="description" />
      </entity>
    </entity>
  </document>
</dataConfig>

My logfile output:

Jan 25, 2012 2:17:37 PM org.apache.solr.handler.dataimport.DataImportHandler processConfiguration INFO: Processing configuration from solrconfig.xml: {config=db-data-config.xml} Jan 25, 2012 2:17:37 PM org.apache.solr.handler.dataimport.DataImporter loadDataConfig INFO: Data Configuration loaded successfully Jan 25, 2012 2:17:37 PM org.apache.solr.handler.dataimport.DataImporter doFullImport INFO: Starting Full Import Jan 25, 2012 2:17:37 PM org.apache.solr.handler.dataimport.SolrWriter readIndexerProperties *WARNING: Unable to read: dataimport.properties* Jan 25, 2012 2:17:37 PM org.apache.solr.handler.dataimport.JdbcDataSource$1 call INFO: Creating a connection for entity Titles with URL: ://localhost:1433;DatabaseName=KnowledgeBase_DM Jan 25, 2012 2:17:37 PM org.apache.solr.handler.dataimport.JdbcDataSource$1 call INFO: Time taken for getConnection(): 0 Jan 25, 2012 2:17:37 PM
org.apache.solr.common.SolrException log *SEVERE: Exception while processing: Titles document : SolrInputDocument[{}]:org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to execute query: SELECT mrID, mrTitle from KnowledgeBase_DM.dbo.AskMe_Data Processing Document # 1* at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72) at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.init(JdbcDataSource.java:253) at org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:210) at org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:39) at org.apache.solr.handler.dataimport.DebugLogger$2.getData(DebugLogger.java:188) at org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59) at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73) at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:238) at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:591) at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:267) at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:186) at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:353) at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:411) at org.apache.solr.handler.dataimport.DataImportHandler.handleRequestBody(DataImportHandler.java:205) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1360) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859) at
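A hedged aside on the error above: the log prints the connection URL as ://localhost:1433;DatabaseName=KnowledgeBase_DM, i.e. with no scheme. For comparison, the documented form of a SQL Server JDBC URL in a DIH dataSource is sketched below, using the values from the post; the name="ds1" attribute is added here only because the Titles entity references dataSource="ds1" while the posted dataSource element carries no name.

```xml
<dataSource name="ds1"
            driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
            url="jdbc:sqlserver://localhost:1433;databaseName=KnowledgeBase_DM"
            user="user" password="password" />
```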
Advice - evaluating Solr for categorization keyword search
Hi all, I've been tasked with evaluating whether Solr is the right solution for my company's search needs. If this isn't the right forum for this kind of question, please let me know where to go instead!

We are currently using SQL queries to find MySQL db results that match a single keyword in one short text field, so our search is pretty crude. What we hope Solr can do initially is:

1. enable more flexible search (booleans, more than one field searched/matched, etc.)
2. live search results (e.g. new records get added to the index upon creation)
3. search rankings (e.g. most relevant to least relevant)
4. categorize our db (take records and at least group them; better if it could assign a label to each record)
5. locate nearby results (geospatial search)

What I hope you can advise on is:

A. How would you go about #2 - making sure that new documents are added/indexed ASAP, based on new rows in the db? Is that as simple as a setting in Solr, or does it take some coding (e.g. a listener object, a cron job, etc.)? I tried looking at the wiki tutorial but wasn't able to find answers - I couldn't make sense of how to use UpdateRequestProcessor to do it. (http://wiki.apache.org/solr/UpdateRequestProcessor)
B. What's the status of document clustering? The wiki says it's not been fully implemented. Would we be able to achieve any of #4 yet? If not, what else should we consider?
C. Would you use Solr over, say, the Google Maps API to run location-aware searches?
D. How long should we expect it to take to configure Solr on our servers with our db, get the initial index set up, and enable live search results? Are we talking one week, or one month? Our db is not tiny, but it's not huge - say around 8k records in each of ~20 tables. Most tables have around 10 fields, including at least one large text field and then a variety of dates, numbers, and small text.

I really appreciate any advice you can offer! Cheers, Becky http://www.coffeeandpower.com
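On question A above: one common pattern (a sketch, not the only way) is to have the application post new rows to Solr's /update handler as they are created, using the commitWithin attribute on the &lt;add&gt; message so Solr batches commits itself. The helper below only builds the update XML payload; the field names are illustrative, and actually POSTing the payload to a Solr URL is left out.

```python
import xml.etree.ElementTree as ET

def build_add_xml(docs, commit_within_ms=5000):
    """Build a Solr <add> payload; the commitWithin attribute asks Solr
    to commit the added documents within the given number of milliseconds."""
    add = ET.Element("add", commitWithin=str(commit_within_ms))
    for d in docs:
        doc = ET.SubElement(add, "doc")
        for name, value in d.items():
            field = ET.SubElement(doc, "field", name=name)
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")

# Payload for one freshly created db row (hypothetical fields):
payload = build_add_xml([{"id": "42", "title": "new record"}])
# POST this to http://host:port/solr/update with Content-Type text/xml
```

Whether this runs from a db trigger, an application hook, or a periodic job polling for new rows is a deployment choice; the thread's question about which to pick remains open.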
Re: phrase auto-complete with suggester component
O. Klein wrote I agree. Suggester could use some attention. Looking at Wiki there were some features planned, but not much has happened lately. Or check out this post http://www.cominvent.com/2012/01/25/super-flexible-autocomplete-with-solr/ looking very promising as an alternative. -- View this message in context: http://lucene.472066.n3.nabble.com/phrase-auto-complete-with-suggester-component-tp3685572p3689240.html Sent from the Solr - User mailing list archive at Nabble.com.
RE: Multiple document types
Hi Simon, No, not different entity types, but actually different document types (I think). What would be ideal is if we could have multiple document elements in the data-config.xml file, some way of mapping each document element to a different set of fields in the schema.xml file, and to a different index. Then, when Solr got a search request on one URL (say, for example, http://172.24.1.16:8080/gwsolr/cc/doctype1/select/?q=...), it would search for a document in the first index, and when it got a search request on a different URL (say, for example, http://172.24.1.16:8080/gwsolr/pc/doctype1/select/?q=...), it would search for the document in the second index. In like manner, administrative tasks (like dataimport) would also switch off of the URL, so that the URL would determine which index was to be loaded by the dataimport command. F

-----Original Message-----
From: Simon Willnauer [mailto:simon.willna...@googlemail.com] Sent: Wednesday, January 25, 2012 2:08 PM To: java-user Subject: Re: Multiple document types

hey Frank, can you elaborate what you mean by different doc types? Are you referring to an entity ie. a table per entity to speak in SQL terms? in general you should get better responses for solr related questions on solr-user@lucene.apache.org simon

On Wed, Jan 25, 2012 at 10:49 PM, Frank DeRose fder...@guidewire.com wrote: It seems that it is not possible to have multiple document types defined in a single solr schema.xml file. If, in fact, this is not possible, then, what is the recommended app server deployment strategy for supporting multiple documents on solr? Do I need to have one webapp instance per document type? For example, if I am deploying under tomcat, do I need to have separate webapps, each with its own context-path and set of config files (data-config.xml and schema.xml, in particular)?
_________
Frank DeRose
Guidewire Software | Senior Software Engineer
Cell: 510-589-0752
fder...@guidewire.com | www.guidewire.com
Deliver insurance your way with flexible core systems from Guidewire.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
Query for documents that have ONLY a certain value in a multivalued field
Does anyone know if there's a way, using the Solr query syntax, to filter documents that have only a certain value in a multivalued field? As an example, if I have some multivalued field country, I want q=id:[* TO *]&fq=country:brazil where 'brazil' is the only value present.

I've run through a few possibilities to do this, but I think it would be more common and a better solution would exist:

1) At index creation time, aggregate my source data and create a count_country field that contains the number of terms in the country field. Then the query would be q=id:[* TO *]&fq=country:brazil&fq=count_country:1
2) In the search client, use the terms component to retrieve all terms for country, do the exclusions in the client, and construct the query as q=id:[* TO *]&fq=country:brazil&fq=-country:canada&fq=-country:us etc.
3) Write a function query or similar that could capture the info.

Thanks in advance, Garrett Conaty
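Option 1 above can be sketched as a small indexing-time transform (the field name count_country is from the post; the surrounding indexing pipeline is assumed):

```python
def with_country_count(doc):
    """Return a copy of the document with a count_country field added,
    so 'only brazil' can be queried as:
        q=id:[* TO *]&fq=country:brazil&fq=count_country:1
    """
    out = dict(doc)
    out["count_country"] = len(out.get("country", []))
    return out

# Applied to each document before it is sent to Solr:
doc = with_country_count({"id": "1", "country": ["brazil"]})
```

The trade-off versus option 2 is that the count is fixed at index time but the query stays a single cheap filter, whereas the terms-component approach needs no schema change but builds a long exclusion list per query.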
Re: phrase auto-complete with suggester component
Thanks for the link, that's the approach I'm going to try.

On Wed, Jan 25, 2012 at 2:39 PM, O. Klein kl...@octoweb.nl wrote: O. Klein wrote I agree. Suggester could use some attention. Looking at the Wiki there were some features planned, but not much has happened lately. Or check out this post http://www.cominvent.com/2012/01/25/super-flexible-autocomplete-with-solr/ looking very promising as an alternative.

--
Tommy Chheng
Re: Cluster Resizing question
I think I need to provide a few more details here. I need the ability to add a shard to the cluster; in doing this I'd like to split an existing index, spin up the new shard with half (or thereabouts) of it, and allow the original to continue serving the pieces it has now.

In our application we are using MurmurHash3 (taken from http://www.yonik.com/) so that updates/deletes are sent to the appropriate servers (we're using an old snapshot of SolrCloud which doesn't support the latest distributed indexing). In my case the hashing is based on the number of shards, which means you add a shard and it breaks. I've read in one of the JIRAs that the hashing should instead be based on some other number (any ideas?) and then used to calculate ranges which would in turn be stored in ZK, so adding another shard would be a matter of updating the range in ZK, stopping the machine serving the index to be split, and splitting said index such that the 2 indexes created would map to the new bins.

All of that being said, I have none of this implemented and would much prefer this work happen within Solr proper, since it's already on the roadmap and my code would ultimately be throwaway. Thus the reason I'd like to understand what the plans are for this in Solr and possibly start contributing to this development, assuming it meets my timelines. Any thoughts/comments are greatly appreciated.

On 1/25/12, Jamie Johnson jej2...@gmail.com wrote: Thanks Otis. I have been following the SolrCloud development, but I was wondering specifically about elastically expanding the cloud by adding shards. I'm following the distributed indexing JIRA, but I'm having difficulty finding a JIRA which specifically references the issues with elasticity. Are you aware of one?
On Wed, Jan 25, 2012 at 1:10 PM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote: Jamie, depending on how quickly you need this, it may be better to follow SolrCloud development because cluster resizing will work differently there. Otis Performance Monitoring SaaS for Solr - http://sematext.com/spm/solr-performance-monitoring/index.html From: Jamie Johnson jej2...@gmail.com To: solr-user@lucene.apache.org Sent: Wednesday, January 25, 2012 12:03 PM Subject: Cluster Resizing question Is this the JIRA that I should be tracking for resizing a cluster? https://issues.apache.org/jira/browse/SOLR-2593 If not can someone point me to the appropriate location. Also is there a rough timeline for when this will be available?
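To illustrate why shard-count-based hashing "breaks" when a shard is added, versus the range-based assignment discussed in the thread (where ranges would live in ZK): the sketch below uses CRC32 as a stand-in for MurmurHash3, and the shard names and ranges are made up.

```python
import zlib

def shard_by_modulo(doc_id, num_shards):
    # CRC32 stands in for the MurmurHash3 the thread actually uses.
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards

def shard_by_range(doc_id, ranges):
    # ranges: (lo_inclusive, hi_exclusive, shard) tuples covering 0..2**32.
    h = zlib.crc32(doc_id.encode("utf-8"))
    for lo, hi, shard in ranges:
        if lo <= h < hi:
            return shard
    raise ValueError("hash %d not covered by any range" % h)

ids = ["doc%d" % i for i in range(1000)]

# Modulo hashing: going from 3 to 4 shards remaps many existing documents.
moved = sum(1 for i in ids if shard_by_modulo(i, 3) != shard_by_modulo(i, 4))

# Range assignment: splitting shardA's range in two leaves shardB untouched,
# so only the split shard's index needs to be divided.
two = [(0, 2**31, "shardA"), (2**31, 2**32, "shardB")]
three = [(0, 2**30, "shardA1"), (2**30, 2**31, "shardA2"), (2**31, 2**32, "shardB")]
```

This is exactly the property the thread is after: updating the stored ranges and splitting one index, rather than rehashing the whole cluster.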
Re: Multiple document types
On Thu, Jan 26, 2012 at 12:05 AM, Frank DeRose fder...@guidewire.com wrote: Hi Simon, No, not different entity types, but actually different document types (I think). What would be ideal is if we could have multiple document elements in the data-config.xml file and some way of mapping each different document element to a different sets of field in the schema.xml file, and to a different index. Then, when Solr got a search request on one url (say, for example, http://172.24.1.16:8080/gwsolr/cc/doctype1/select/?q=...), it would search for a document in the first index and when it got a search request on a different url (say, for example, http://172.24.1.16:8080/gwsolr/pc/doctype1/select/?q=...), it would search for the document in the second index. In like manner, administrative tasks (like dataimport) would also switch off of the url, so that the url would determine which index was to be loaded by the dataimport command.

seems like you should look at solr's multicore feature: http://wiki.apache.org/solr/CoreAdmin

simon

F

-----Original Message-----
From: Simon Willnauer [mailto:simon.willna...@googlemail.com] Sent: Wednesday, January 25, 2012 2:08 PM To: java-user Subject: Re: Multiple document types

hey Frank, can you elaborate what you mean by different doc types? Are you referring to an entity ie. a table per entity to speak in SQL terms? in general you should get better responses for solr related questions on solr-user@lucene.apache.org simon

On Wed, Jan 25, 2012 at 10:49 PM, Frank DeRose fder...@guidewire.com wrote: It seems that it is not possible to have multiple document types defined in a single solr schema.xml file. If, in fact, this is not possible, then, what is the recommended app server deployment strategy for supporting multiple documents on solr? Do I need to have one webapp instance per document type?
For example, if I am deploying under tomcat, do I need to have a separate webapps each with its own context-path and set of config files (data-config.xml and schema.xml, in particular)?
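The multicore setup Simon points to is driven by a solr.xml file rather than separate webapps; a minimal sketch for the two-index case described above (core names cc and pc are taken from the example URLs in the thread; the instanceDir values and persistent flag are assumptions):

```xml
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <!-- each core has its own conf/ directory with its own schema.xml
         and data-config.xml, and its own index -->
    <core name="cc" instanceDir="cc" />
    <core name="pc" instanceDir="pc" />
  </cores>
</solr>
```

Requests then route per core, so the URL does select the index: http://host:port/solr/cc/select?q=... searches one index, and http://host:port/solr/pc/dataimport?command=full-import imports into the other.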