ICUCollation throws exception

2012-07-17 Thread Oliver Schihin

Hello

According to the release notes for 4.0.0-ALPHA (SOLR-2396), I replaced 
ICUCollationKeyFilterFactory with ICUCollationField in our schema. But this throws an 
exception; see the following excerpt from the log:


Jul 16, 2012 5:27:48 PM org.apache.solr.common.SolrException log
SEVERE: null:org.apache.solr.common.SolrException: Plugin init failure for [schema.xml]
fieldType alphaOnlySort: Plugin init failure for [schema.xml] analyzer/filter: class
org.apache.solr.schema.ICUCollationField

        at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:168)
        at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:359)

The deprecated ICUCollationKeyFilterFactory works without any problem. This 
is how I defined the field type in the schema (with the deprecated filter):


   <!-- field type for sort strings -->
   <fieldType name="alphaOnlySort" class="solr.TextField" sortMissingLast="true"
              omitNorms="true">
     <analyzer>
       <tokenizer class="solr.KeywordTokenizerFactory"/>
       <filter class="solr.ICUCollationKeyFilterFactory"
               locale="de@collation=phonebook"
               strength="primary"
       />
     </analyzer>
   </fieldType>
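
For reference, SOLR-2396 describes declaring the collator directly on the field type,
with no analyzer chain. A minimal sketch of what I understand the new declaration
should look like (assuming the analysis-extras jars are on the classpath):

   <!-- collated sort field using the new ICUCollationField: no analyzer element -->
   <fieldType name="alphaOnlySort" class="solr.ICUCollationField"
              locale="de@collation=phonebook"
              strength="primary"/>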


Do I have to replace jars in /contrib/analysis-extras/, or are there any other hints as 
to what might be wrong in my install and configuration?


Thanks a lot
Oliver




Re: Wildcard query vs facet.prefix for autocomplete?

2012-07-17 Thread santamaria2
I'll consider using the other methods, but I'd like to know which would be
faster among the two approaches mentioned in my opening post.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Wildcard-query-vs-facet-prefix-for-autocomplete-tp3995199p3995458.html
Sent from the Solr - User mailing list archive at Nabble.com.




RE: Query facet count and its matching documents

2012-07-17 Thread Gnanakumar
Any ideas on this?

 We're running Apache Solr v3.1 and SolrJ is our client.

 We're passing multiple arbitrary faceting queries (facet.query) to get the
 number of matching documents (the facet count) evaluated over the search
 results in a *single* Solr query.  My use case also demands the actual matching
 facet results/documents/fields along with the facet count.

 My question is, is it possible to get facet query matching results along
 with facet count in a single Solr query call?





Re: Error 404 on every request

2012-07-17 Thread Nils Abegg
Hey Guys,

I'm trying to get Solr running. I got it installed and I can access the admin 
dashboard, but if I try to index some docs, I always get a 404 error.
I tried it with the following URLs:
http://mydomain/solr/update/json
http://mydomain/solr/mycore/update/json
http://mydomain/update/json
http://mydomain/mycore/update/json

I have installed the 4.0 Alpha with the built-in Jetty server on Ubuntu Server 
12.04… I followed this tutorial to set it up:
http://kingstonlabs.blogspot.de/2012/06/installing-solr-36-on-ubuntu-1204.html

It seems that I get a 404 for every request which is not related to the admin 
dashboard (solr/#/).

Thanks in advance for any help. ;)

Regards
Nils




Re: Error 404 on every request

2012-07-17 Thread Yonik Seeley
On Tue, Jul 17, 2012 at 6:01 AM, Nils Abegg nils.ab...@ffuf.de wrote:
 I have installed the 4.0 Alpha with the build-in Jetty Server on Ubuntu 
 Server 12.04…i followed this tutorial to set it up:
 http://kingstonlabs.blogspot.de/2012/06/installing-solr-36-on-ubuntu-1204.html

Instead of trying to install Solr, I'd suggest just starting with
the stock server included with the binary distribution.
If you have Java in your path, you just do:

cd example
java -jar start.jar

-Yonik
http://lucidimagination.com


Re: Error 404 on every request

2012-07-17 Thread Nils Abegg
Same issue with the stock server… I followed the steps in the wiki.
XML via post.jar is working; JSON via curl is not.

Am 17.07.2012 um 12:05 schrieb Yonik Seeley:

 On Tue, Jul 17, 2012 at 6:01 AM, Nils Abegg nils.ab...@ffuf.de wrote:
 I have installed the 4.0 Alpha with the build-in Jetty Server on Ubuntu 
 Server 12.04…i followed this tutorial to set it up:
 http://kingstonlabs.blogspot.de/2012/06/installing-solr-36-on-ubuntu-1204.html
 
 Instead of trying to install Solr, I'd suggest just starting with
 the stock server included with the binary distribution.
 If you have Java in your path, you just do:
 
 cd example
 java -jar start.jar
 
 -Yonik
 http://lucidimagination.com



Re: Error 404 on every request

2012-07-17 Thread Nils Abegg
OK, I got it working with the path /update, not /update/json.
But it feels somewhat fishy to have Solr sitting in my home dir.




RE: DIH XML configs for multi environment

2012-07-17 Thread Markus Klose
Hi

There is one more approach, using the property mechanism.

You could specify the datasource like this:

<dataSource name="database" driver="${sqlDriver}" url="${sqlURL}"/>

And you can specify the properties in solr.xml, in your core configuration,
like this:

<core instanceDir="core1" name="core1">
  <property name="sqlURL" value="jdbc:hsqldb:/temp/example/ex"/>
</core>


Best regards from Augsburg

Markus Klose
SHI Elektronische Medien GmbH 
 

Address: Curt-Frenzel-Str. 12, 86167 Augsburg

Tel.:   0821 7482633 26
Tel.:   0821 7482633 0 (switchboard)
Mobile: 0176 56516869
Fax:    0821 7482633 29

E-Mail: markus.kl...@shi-gmbh.com
Internet: http://www.shi-gmbh.com

Commercial register: Augsburg HRB 17382
Managing director: Peter Spiske
VAT ID: DE 182167335





-Original Message-
From: Rahul Warawdekar [mailto:rahul.warawde...@gmail.com] 
Sent: Wednesday, July 11, 2012 11:21
To: solr-user@lucene.apache.org
Subject: Re: DIH XML configs for multi environment

http://wiki.eclipse.org/Jetty/Howto/Configure_JNDI_Datasource
http://docs.codehaus.org/display/JETTY/DataSource+Examples
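
For Jetty, what those pages describe boils down to registering a pooled DataSource
under a JNDI name (jetty-plus must be enabled) and referencing it from the DIH
config, just like the Tomcat case. A rough sketch only; the resource name, driver
class, URL and credentials below are placeholder assumptions, so check them against
the Jetty version you actually run:

<!-- jetty.xml or WEB-INF/jetty-env.xml: register a datasource as jdbc/solrdb -->
<New id="solrdb" class="org.eclipse.jetty.plus.jndi.Resource">
  <Arg>jdbc/solrdb</Arg>
  <Arg>
    <New class="com.mysql.jdbc.jdbc2.optional.MysqlConnectionPoolDataSource">
      <Set name="Url">jdbc:mysql://localhost:3306/mydb</Set>
      <Set name="User">X</Set>
      <Set name="Password">X</Set>
    </New>
  </Arg>
</New>

<!-- DIH data-config.xml then points at it the same way as with Tomcat -->
<dataSource jndiName="java:comp/env/jdbc/solrdb" type="JdbcDataSource" readOnly="true"/>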


On Wed, Jul 11, 2012 at 2:30 PM, Pranav Prakash pra...@gmail.com wrote:

 That's cool. Is there something similar for Jetty as well? We use Jetty!

 *Pranav Prakash*

 temet nosce



 On Wed, Jul 11, 2012 at 1:49 PM, Rahul Warawdekar  
 rahul.warawde...@gmail.com wrote:

  Hi Pranav,
 
  If you are using Tomcat to host Solr, you can define your data 
  source in context.xml file under tomcat configuration.
  You have to refer to this datasource with the same name in all the 3 
  environments from DIH data-config.xml.
  This context.xml file will vary across 3 environments having 
  different credentials for dev, stag and prod.
 
  eg
  DIH data-config.xml will refer to the datasource as listed below 
  <dataSource jndiName="java:comp/env/*YOUR_DATASOURCE_NAME*"
  type="JdbcDataSource" readOnly="true" />
 
  context.xml file which is located under /TOMCAT_HOME/conf folder 
  will have the resource entry as follows
    <Resource name="*YOUR_DATASOURCE_NAME*" auth="Container"
  type="" username="X" password="X"
  driverClassName=""
  url=""
  maxActive="8"
  />
 
  On Wed, Jul 11, 2012 at 1:31 PM, Pranav Prakash pra...@gmail.com
 wrote:
 
   The DIH XML config file has to be specified dataSource. In my 
   case, and possibly with many others, the logon credentials as well 
   as mysql
 server
   paths would differ based on environments (dev, stag, prod). I 
   don't
 want
  to
   end up coming with three different DIH config files, three 
   different handlers and so on.
  
   What is a good way to deal with this?
  
  
   *Pranav Prakash*
  
   temet nosce
  
 
 
 
  --
  Thanks and Regards
  Rahul A. Warawdekar
 




--
Thanks and Regards
Rahul A. Warawdekar


Re: Error 404 on every request

2012-07-17 Thread Erik Hatcher
/update/json was removed from the example configuration in 4.0 because /update 
now handles content based on content-type internally.  It may not be spelled 
out as clearly as it should be, but here's the CHANGES entry for it:

* SOLR-2857: Support XML,CSV,JSON, and javabin in a single RequestHandler and 
  choose the correct ContentStreamLoader based on Content-Type header.  This
  also deprecates the existing [Xml,JSON,CSV,Binary,Xslt]UpdateRequestHandler.

https://issues.apache.org/jira/browse/SOLR-2857

Yonik added a comment yesterday because of the issue you hit.  So maybe 
/update/json will re-emerge in the 4.0 final release example configuration?

Always be careful when comparing blogs/articles written about one version to a 
different version you're using.
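
As a concrete illustration, posting JSON to the consolidated /update handler with an
explicit Content-Type should work where /update/json now returns 404. A sketch,
assuming the stock example instance on localhost:8983 and made-up field values:

curl 'http://localhost:8983/solr/update?commit=true' \
     -H 'Content-Type: application/json' \
     --data-binary '[{"id":"doc1","title":"hello json"}]'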

Erik



On Jul 17, 2012, at 06:01 , Nils Abegg wrote:

 Hey Guys,
 
 I'm trying to get solr running. I got it installed and I can access the admin 
 dashboard, but if I try to index some docs, i always get a 404 Error.
 I tried it with the following URLs:
 http://mydomain/solr/update/json
 http://mydomain/solr/mycore/update/json
 http://mydomain/update/json
 http://mydomain/mycore/update/json
 
 I have installed the 4.0 Alpha with the build-in Jetty Server on Ubuntu 
 Server 12.04…i followed this tutorial to set it up:
 http://kingstonlabs.blogspot.de/2012/06/installing-solr-36-on-ubuntu-1204.html
 
 It seems that I get a 404 for every request wich is not related to the admin 
 dashboard(solr/#/).
 
 Thanks in advance for any help. ;)
 
 Regards
 Nils
 
 



Re: Solr facet multiple constraint

2012-07-17 Thread Erick Erickson
OK, maybe I'm finally getting it. When you do a facet.field=blahblah, you're
telling Solr to take all the documents that match the query, look in field
blahblah, and tally the documents that match _any_ value in the field. There's
no restriction at all on the _values_ that that tally is made for.

If I'm _finally_ understanding you, you only want tallies for the values in
your fq clause; have you considered facet queries?

facet.query=user:10&facet.query=user:3 will give you the counts for all the
docs that match your query (q and fq clauses), but only return you the facets
for the indicated users. You can have as many facet.query clauses as you want.
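
Spelled out as a full request, that would look roughly like this (a sketch; the q/fq
values are taken from your debug output quoted below, and the user values are made up):

http://localhost:8983/solr/select?q=service:1+AND+publicationstatus:LIVE
    &fq=pillar:10&rows=0&facet=true
    &facet.query=user:10&facet.query=user:3

The per-query counts come back under facet_counts/facet_queries, one entry per
facet.query parameter.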

Best
Erick

On Mon, Jul 16, 2012 at 4:00 AM, davidbougearel
david.bougea...@smile-benelux.com wrote:
 Ok i'm added the debug, there is the query from the response after executing
 query :

 facet=true,sort=publishingdate
 desc,debugQuery=true,facet.mincount=1,q=service:1 AND
 publicationstatus:LIVE,facet.field=pillar,wt=javabin,fq=(((pillar:10))),version=2}},response={numFound=2,start=0,docs=[SolrDocument[{uniquenumber=UniqueNumber1,
 name=Doc 1, publicationstatus=LIVE, service=1, servicename=service_1,
 pillar=[10], region=EU, regionname=Europe, documenttype=TRACKER,
 publishingdate=Sun Jul 15 09:03:32 CEST 2012, publishingyear=2012,
 teasersummary=Seo_Description, content=answer, creator=chandan, version=1,
 documentinstanceid=1}], SolrDocument[{uniquenumber=UniqueNumber2, name=Doc
 2, publicationstatus=LIVE, service=1, servicename=service_1, pillar=[10],
 region=EU, regionname=Europe, documenttype=TRACKER, publishingdate=Sat Jul
 14 09:03:32 CEST 2012, publishingyear=2012, teasersummary=Seo_Description,
 content=answer, creator=chandan, version=1,
 documentinstanceid=1}]]},facet_counts={facet_queries={},facet_fields={pillar={10=2}},facet_dates={},facet_ranges={}},debug={rawquerystring=service:1
 AND publicationstatus:LIVE,querystring=service:1 AND
 publicationstatus:LIVE,parsedquery=+service:1
 +publicationstatus:LIVE,parsedquery_toString=+service:1
 +publicationstatus:LIVE,explain={UniqueNumber1=
 1.2917422 = (MATCH) sum of:
   0.7741482 = (MATCH) weight(service:1 in 0), product of:
 0.7741482 = queryWeight(service:1), product of:
   1.0 = idf(docFreq=4, maxDocs=5)
   0.7741482 = queryNorm
 1.0 = (MATCH) fieldWeight(service:1 in 0), product of:
   1.0 = tf(termFreq(service:1)=1)
   1.0 = idf(docFreq=4, maxDocs=5)
   1.0 = fieldNorm(field=service, doc=0)
   0.517594 = (MATCH) weight(publicationstatus:LIVE in 0), product of:
 0.6330043 = queryWeight(publicationstatus:LIVE), product of:
   0.81767845 = idf(docFreq=5, maxDocs=5)
   0.7741482 = queryNorm
 0.81767845 = (MATCH) fieldWeight(publicationstatus:LIVE in 0), product
 of:
   1.0 = tf(termFreq(publicationstatus:LIVE)=1)
   0.81767845 = idf(docFreq=5, maxDocs=5)
   1.0 = fieldNorm(field=publicationstatus, doc=0)
 ,UniqueNumber2=
 1.2917422 = (MATCH) sum of:
   0.7741482 = (MATCH) weight(service:1 in 0), product of:
 0.7741482 = queryWeight(service:1), product of:
   1.0 = idf(docFreq=4, maxDocs=5)
   0.7741482 = queryNorm
 1.0 = (MATCH) fieldWeight(service:1 in 0), product of:
   1.0 = tf(termFreq(service:1)=1)
   1.0 = idf(docFreq=4, maxDocs=5)
   1.0 = fieldNorm(field=service, doc=0)
   0.517594 = (MATCH) weight(publicationstatus:LIVE in 0), product of:
 0.6330043 = queryWeight(publicationstatus:LIVE), product of:
   0.81767845 = idf(docFreq=5, maxDocs=5)
   0.7741482 = queryNorm
 0.81767845 = (MATCH) fieldWeight(publicationstatus:LIVE in 0), product
 of:
   1.0 = tf(termFreq(publicationstatus:LIVE)=1)
   0.81767845 = idf(docFreq=5, maxDocs=5)
   1.0 = fieldNorm(field=publicationstatus, doc=0)
 },QParser=LuceneQParser,filter_queries=[(((pillar:10)))

 As you can see in this request i'm talking about pillar not about user.

 Thanks for all, David.

 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Solr-facet-multiple-constraint-tp3992974p3995215.html
 Sent from the Solr - User mailing list archive at Nabble.com.


Re: When shall index be split over shards?

2012-07-17 Thread Erick Erickson
not really. It's a matter of when your system starts to bog down, and
unfortunately
there's no good way to give general guidance, especially on a number
like size of
the index. 90% of the index size could be stored data (*.fdt and *.fdx
files) that have
no bearing on search requirements.

My advice would be to set up a test system and keep adding documents
to it until it
blows up. You can fire queries you mine from the Solr logs at it to
simulate load. You
may have to synthesize documents to get a good sense of this.

But a 1G index is actually quite small by many standards. Of course it
depends on your
hardware...

Best
Erick

On Mon, Jul 16, 2012 at 6:05 AM, Alexander Aristov
alexander.aris...@gmail.com wrote:
 People,

 What would be your suggestion?

 I have a basic solr installation. Index is becoming bigger and bigger and
 it hit 1Gb level.

 When shall I consider adding shards and split index over them? are there
 general suggestions?


 Best Regards
 Alexander Aristov


Re: are stopwords indexed?

2012-07-17 Thread Erick Erickson
Three things:
1. Did you re-index after you got your stopwords file set up? And I'd
blow away the index directory before re-indexing.
2. If you _store_ your field, the stopwords will be in your results
lists, but _not_ in your index. As a secondary
check, try going into your admin/schema browser link and looking
at the field in question. Stopwords are
by definition frequent, so they should be at the top of your list.
3. Check a different way by using the TermsComponent (see:
http://wiki.apache.org/solr/TermsComponent/; a sample request is sketched below).
This will also show you the _indexed_ as opposed to stored terms.
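
A minimal TermsComponent request against the example /terms handler would look
something like this (the field name here is an assumption; point terms.fl at the
field you are checking):

http://localhost:8983/solr/terms?terms.fl=text&terms.limit=20&terms.sort=count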

Best
Erick

On Mon, Jul 16, 2012 at 6:40 AM, Giovanni Gherdovich
g.gherdov...@gmail.com wrote:
 Hi all, thank you for your replies.

 Lance:
 Look at the index with the Schema Browser in the Solr UI. This pulls
 the terms for each field.

 I did it, and it was the first alarm I got.
 After the indexing, I went to the schema browser hoping
 not to see any stopwords in the top terms, but...
 they were all there.

 Michael:
 Hi Giovanni,

 you have entered the stopwords into stopword.txt file, right? But in the
 definition of the field type you are referencing stopwords_FR.txt..

 Good catch Michael, but that's not the problem.

 In my message I referred to stopwords.txt, but actually my
 stopwords file is named stopwords_FR.txt, consistent with
 what I put in my schema.xml.

 By the way, your answers make me think that yes,
 I have a problem: stopwords should not appear in the index.

 What a weird situation:

 * querying SOLR for a stopword (say "and") gives me zero results
   (so, somewhere in the indexing / searching pipeline my stopwords
 file *is* taken into account)
 * checking the index files with LuCLI for the same stopword gives me
 tons of hits.

 cheers,
 GGhh


Re: Metadata and FullText, indexed at different times - looking for best approach

2012-07-17 Thread Erick Erickson
In that case, I think your best option is to re-index the entire document
when you have the text available, metadata and all. Which actually
begs the question whether you want to index the bare metadata at
all. Is it the use-case that the user actually gets value when there's no
text? If not, forget DIH and just index the metadata as a result of the
text becoming available.

Best
Erick

On Mon, Jul 16, 2012 at 1:43 PM, Alexandre Rafalovitch
arafa...@gmail.com wrote:
 Thank you,

 I am already on 4alpha. The patch feels a little too unstable for my
 needs/familiarity with the code.

 What about something around multiple cores? Could I have full-text
 fields stored in separate cores and somehow (again, with minimum
 hand-coding) do a search against all those cores and get back a combined
 list of document IDs? Or would it make comparative ranking/sorting
 impossible?

 Regards,
Alex.
 Personal blog: http://blog.outerthoughts.com/
 LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
 - Time is the quality of nature that keeps events from happening all
 at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
 book)


 On Sun, Jul 15, 2012 at 12:08 PM, Erick Erickson
 erickerick...@gmail.com wrote:
 You've got a couple of choices. There's a new patch in town
 https://issues.apache.org/jira/browse/SOLR-139
 that allows you to update individual fields in a doc if (and only if)
 all the fields in the original document were stored (actually, all the
 non-copy fields).

 So if you're storing (stored=true) all your metadata information, you can
 just update the document when the  text becomes available assuming you
 know the uniqueKey when you update.

 Under the covers, this will find the old document, get all the fields, add 
 the
 new fields to it, and re-index the whole thing.

 Otherwise, your fallback idea is a good one.

 Best
 Erick

 On Sat, Jul 14, 2012 at 11:05 PM, Alexandre Rafalovitch
 arafa...@gmail.com wrote:
 Hello,

 I have a database of metadata and I can inject it into SOLR with DIH
 just fine. But then, I also have the documents to extract full text
 from that I want to add to the same records as additional fields. I
 think DIH allows to run Tika at the ingestion time, but I may not have
 the full-text files at that point (they could arrive days later). I
 can match the file to the metadata by a file name matching a field
 name.

 What is the best approach to do that staggered indexing with minimum
 custom code? I guess my fallback position is a custom full-text
 indexer agent that re-adds the metadata fields when the file is being
 indexed. Is there anything better?

 I am a newbie using v4.0alpha of SOLR (and loving it).

 Thank you,
 Alex.
 Personal blog: http://blog.outerthoughts.com/
 LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
 - Time is the quality of nature that keeps events from happening all
 at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
 book)


Re: Solr 3.5 DIH delta-import replicating full index or Admin UI problem?

2012-07-17 Thread Erick Erickson
What you're seeing is the replication of the changed segments (new
segments actually). Replication only moves new or merged
segments and they will be a varying portion of the total index. If you
optimized, you'd see the entire index be moved (but you don't
need to do that!).

You should be able to search on the slave and find the documents you've
just added. If you can do that there's no problem.

Best
Erick

On Mon, Jul 16, 2012 at 3:10 PM, Arcadius Ahouansou
arcad...@menelic.com wrote:
 Hello.

 We are running Solr 3.5 multicore in master-slave mode.


 -Our delta-import looks like:
 /solr/core01/dataimport?command=delta-import&optimize=false

 The size of the index in 1.18GB

 When delta-import is going on, on the slave admin UI
  8983/solr/core01/admin/replication/index.jsp
 I can see the following output:
 
 Master http://solrmaster01.somedomain.com:8983/solr/core01/replication
 Latest Index Version:null, Generation: null
 Replicatable Index Version:1342183977587, Generation: 33
 Poll Interval 00:00:60
 Local Index Index Version: 1342183977585, Generation: 32
 Location: /var/somedomain/solr/solrhome/core01/data/index
 Size: 1.18 GB
 Times Replicated Since Startup: 32
 Previous Replication Done At: Mon Jul 16 17:08:58 GMT 2012
 Config Files Replicated At: null
 Config Files Replicated: null
 Times Config Files Replicated Since Startup: null
 Next Replication Cycle At: Mon Jul 16 17:09:58 GMT 2012
 Current Replication Status Start Time: Mon Jul 16 17:08:58 GMT 2012
 Files Downloaded: 12 / 95
 *Downloaded: 4.33 KB / 1.18 GB [0.0%]*
 *Downloading File: _1o.fdt, Downloaded: 510 bytes / 510 bytes [100.0%]*
 Time Elapsed: 22s, Estimated Time Remaining: 6266208s, Speed: 201 bytes/s
 -


 - Does Downloaded: 4.33 KB / *1.18 GB [0.0%] *means that the solr slave
 is going to download the whole 1.18GB?

 -I have been monitoring this and the replications takes less that a minute.
 And checking the files in the index directory on the slave, the timestamps
 are quite different, so apparently, the slave is not downloading the full
 index all the time.

 -Please, has anyone else seen the whole index size being shown as
 denominator of the Downloaded fraction?

 -Anything I may be doing wrong?

 -Also notice the Files Downloaded: 12 / 95.  That bit never increase
 to 95 / 95


 Our solrconfig looks like this:

 --
 <requestHandler name="/replication" class="solr.ReplicationHandler">
   <lst name="master">
     <str name="enable">${enable.master:false}</str>
     <str name="replicateAfter">commit</str>
     <str name="replicateAfter">startup</str>
     <str name="confFiles">solrconfig.xml,synonyms.txt,schema.xml,stopwords.txt,data-config.xml</str>
   </lst>
   <lst name="slave">
     <str name="enable">${enable.slave:false}</str>
     <str name="masterUrl">some-master-full-url</str>
     <str name="pollInterval">00:00:60</str>
   </lst>
 </requestHandler>
 --


 Thanks.

 Arcadius.

 *
 *


Result docs missing only when shards parameter present in query?

2012-07-17 Thread Bill Havanki
I had the same problem as the original poster did two years ago (!), but
with Solr 3.4.0:

 I cannot get hits back and do not get a correct total number of records
when using shard searching.

When performing a sharded query, I would get empty / missing results - no
documents at all. Querying each shard individually worked, but anything
with the shards parameter yielded no result documents.

I was able to get results back by updating my schema to include
multiValued=false for the unique key field.
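
For reference, the change amounts to declaring the unique key field explicitly
single-valued in schema.xml; a sketch, with the field name assumed to be "id":

<field name="id" type="string" indexed="true" stored="true"
       required="true" multiValued="false"/>
<uniqueKey>id</uniqueKey>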

The problem I was seeing was that, when Solr was formulating the queries to
go get records from each shard, it was including square brackets around the
ids it was asking for, e.g.:

...q=123&ids=[ID1],[ID2],[ID3]...

I delved into the Solr code and saw that this query string was being formed
(in QueryComponent.createRetrieveDocs()) by simply calling toString() on
the unique key field value for each document it wanted to get. My guess is
that the value objects somehow were ArrayLists (or something like that) and
not Strings, so those annoying square brackets showed up via toString(). By
emphasizing in the schema that the field was single-valued, those lists
would hopefully stop appearing, and I think they did. At least the brackets
went away.

Here's the relevant QueryComponent code (again, 3.4.0 - it's the same in
3.6.0, didn't check 4):

ArrayList<String> ids = new ArrayList<String>(shardDocs.size());
for (ShardDoc shardDoc : shardDocs) {
  // TODO: depending on the type, we may need more tha a simple toString()?
  ids.add(shardDoc.id.toString());
}
sreq.params.add(ShardParams.IDS, StrUtils.join(ids, ','));

The comment in there seems to fit my theory. :)

Bill


Indexing data in csv format

2012-07-17 Thread gopes

Hi ,

I am trying to index data in csv format. But while indexing I get this
following message -

HTTP ERROR 404

Problem accessing /solr/update/csv. Reason:

    NOT_FOUND

Powered by Jetty://

solrconfig.xml has the following entries for the CSVRequestHandler:

<requestHandler name="/update/csv" class="solr.CSVRequestHandler"
                startup="lazy">
  <lst name="defaults">
    <str name="separator">;</str>
    <str name="header">true</str>
    <str name="skip">publish_date</str>
    <str name="encapsulator"></str>
  </lst>
</requestHandler>
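
In case it is relevant: on 4.x the per-format handlers were consolidated into /update
(SOLR-2857, see the /update/json discussion earlier in this digest), so posting the
same CSV through /update with an explicit Content-Type may be worth a try. A sketch
only, with host, file name and parameters as assumptions:

curl 'http://localhost:8983/solr/update?commit=true&separator=%3B&header=true' \
     -H 'Content-Type: text/csv' --data-binary @data.csv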

Thanks,
Sarala

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Indexing-data-in-csv-format-tp3995549.html
Sent from the Solr - User mailing list archive at Nabble.com.


Disable cache ?

2012-07-17 Thread Bruno Mannina

Hi Solr Users,

I would like to disable the cache function for my tests, so I modified all 
the cache-related settings in solrconfig.xml,

but after restarting my Tomcat the cache is still there.

Do you think I forgot something?

Requests come back with QTime=1 or QTime=0,
and at this speed my program loses information.

I would like to run some tests less quickly.

Thanks a lot,
Bruno



Re: Disable cache ?

2012-07-17 Thread Tomás Fernández Löbbe
I think you could disable Solr caches by setting their size to 0 (deleting
them won't work, as for example, the FieldValueCache will take default
values, not sure about the other ones). I don't think you'll be able to
disable Lucene's Field Cache.
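
A sketch of what that looks like in solrconfig.xml (sizes zeroed and autowarming
disabled; adjust to whichever caches you actually have configured):

<filterCache class="solr.FastLRUCache" size="0" initialSize="0" autowarmCount="0"/>
<queryResultCache class="solr.LRUCache" size="0" initialSize="0" autowarmCount="0"/>
<documentCache class="solr.LRUCache" size="0" initialSize="0" autowarmCount="0"/>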

What's the test that you want to run? Why do you not want caches for it?

On Tue, Jul 17, 2012 at 1:10 PM, Bruno Mannina bmann...@free.fr wrote:

 Hi Solr Users,

 I would like for my test disable the cache fonction, so I modified all
 information concerning cache in solrconfig.xml
 but after restarting my Tomcat cache is always here.

 Do you think I forgot something?

 Requests are done with QTime=1 or QTime=0
 and with this rapidity my program losts information.

 I would like to do some tests less quickly?

 Thanks a lot,
 Bruno




Re: Disable cache ?

2012-07-17 Thread lboutros
Hi Bruno,

don't forget the OS disk cache.

On linux you can clear it with this tiny script :

#!/bin/bash

sync && echo 3 > /proc/sys/vm/drop_caches

Ludovic.




-
Jouve
France.
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Disable-cache-tp3995575p3995589.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: edismax not working in a core

2012-07-17 Thread Richard Frovarp

On 07/14/2012 05:32 PM, Erick Erickson wrote:

Really hard to say. Try executing your query on the cores with
debugQuery=on and compare the parsed results (for this you
can probably just ignore the explain bits of the output, concentrate
on the parsed query).



Okay, for the example core from the project, the query was:

test OR samsung

parsedquery:
+(DisjunctionMaxQuery((id:test^10.0 | text:test^0.5 | cat:test^1.4 | 
manu:test^1.1 | name:test^1.2 | features:test | sku:test^1.5)) 
DisjunctionMaxQuery((id:samsung^10.0 | text:samsung^0.5 | 
cat:samsung^1.4 | manu:samsung^1.1 | name:samsung^1.2 | features:samsung 
| sku:samsung^1.5)))


For my core the query was:

frovarp OR fee

parsedquery:

+((DisjunctionMaxQuery((content:fee | title:fee^5.0 | 
mainContent:fee^2.0)) DisjunctionMaxQuery((content:frovarp | 
title:frovarp^5.0 | mainContent:frovarp^2.0)))~2)


What is that ~2? That's the difference. The third core that works 
properly also doesn't have the ~2.


solr home in jar?

2012-07-17 Thread Matt Mitchell
Hi,

I'd like to bundle up a jar file with a complete Solr home and index.
This jar file is a dependency for another application, which uses an
instance of embedded Solr, multi-core. Is there any way to have the
application's embedded Solr read the configs/index data from the jar
dependency?

I attempted using CoreContainer with a resource loader (and many other
ways), but no luck! Any ideas?

- Matt


java.lang.AssertionError: System properties invariant violated.

2012-07-17 Thread Roman Chyla
Hello,

(Please excuse cross-posting, my problem is with a solr component, but
the underlying issue is inside the lucene test-framework)

I am porting 3.x unit tests to the solr/lucene trunk. My unit tests are
OK and pass, but in the end fail because the new rule checks for
modified system properties. I know what the problem is: I am creating new
system properties in @BeforeClass, but I think I need to do it
there, because the project loads a C library before initializing tests.

Anybody knows how to work around it cleanly? There is a property that
can be set to ignore certain names
(LuceneTestCase.IGNORED_INVARIANT_PROPERTIES), but unfortunately it is
declared as private.

Thank you,

  Roman


Exception:

java.lang.AssertionError: System properties invariant violated.
New keys:
  montysolr.bridge=montysolr.java_bridge.SimpleBridge
  montysolr.home=/dvt/workspace/montysolr
  montysolr.modulepath=/dvt/workspace/montysolr/src/python/montysolr
  solr.test.sys.prop1=propone
  solr.test.sys.prop2=proptwo

at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:66)
at 
org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53)
at 
org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551)


Re: Metadata and FullText, indexed at different times - looking for best approach

2012-07-17 Thread Alexandre Rafalovitch
Thank you,

Re-indexing does look like a real option then. I am now looking at
storing text/files in MongoDB or the like and indexing into SOLR from
that. Initially, I was going to skip the DB part for as long as
possible.

Regarding the use case, yes it does make sense to have just metadata.
It is rich, curated metadata that works without files (several, each
in its own language). So, before files show up, the search is against
title/subject/etc. When the files show up, one by one, they get added
into index for additional/enhanced results.

Again, thank you for walking through this with me.

Regards,
   Alex.

Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)


On Tue, Jul 17, 2012 at 9:12 AM, Erick Erickson erickerick...@gmail.com wrote:
 In that case, I think your best option is to re-index the entire document
 when you have the text available, metadata and all. Which actually
 begs the question whether you want to index the bare metadata at
 all. Is it the use-case that the user actually gets value when there's no
 text? If not, forget DIH and just index the metadata as a result of the
 text becoming available.

 Best
 Erick

 On Mon, Jul 16, 2012 at 1:43 PM, Alexandre Rafalovitch
 arafa...@gmail.com wrote:
 Thank you,

 I am already on 4alpha. Patch feels a little too unstable for my
 needs/familiarity with the codes.

 What about something around multiple cores? Could I have full-text
 fields stored in a separate cores and somehow (again, minimum
 hand-coding) do search against all those cores and get back combined
 list of document IDs? Or would it making comparative ranking/sorting
 impossible?

 Regards,
Alex.
 Personal blog: http://blog.outerthoughts.com/
 LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
 - Time is the quality of nature that keeps events from happening all
 at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
 book)


 On Sun, Jul 15, 2012 at 12:08 PM, Erick Erickson
 erickerick...@gmail.com wrote:
 You've got a couple of choices. There's a new patch in town
 https://issues.apache.org/jira/browse/SOLR-139
 that allows you to update individual fields in a doc if (and only if)
 all the fields in the original document were stored (actually, all the
 non-copy fields).

 So if you're storing (stored=true) all your metadata information, you can
 just update the document when the  text becomes available assuming you
 know the uniqueKey when you update.

 Under the covers, this will find the old document, get all the fields, add 
 the
 new fields to it, and re-index the whole thing.

 Otherwise, your fallback idea is a good one.

 Best
 Erick

 On Sat, Jul 14, 2012 at 11:05 PM, Alexandre Rafalovitch
 arafa...@gmail.com wrote:
 Hello,

 I have a database of metadata and I can inject it into SOLR with DIH
 just fine. But then, I also have the documents to extract full text
 from that I want to add to the same records as additional fields. I
 think DIH allows to run Tika at the ingestion time, but I may not have
 the full-text files at that point (they could arrive days later). I
 can match the file to the metadata by a file name matching a field
 name.

 What is the best approach to do that staggered indexing with minimum
 custom code? I guess my fallback position is a custom full-text
 indexer agent that re-adds the metadata fields when the file is being
 indexed. Is there anything better?

 I am a newbie using v4.0alpha of SOLR (and loving it).

 Thank you,
 Alex.
 Personal blog: http://blog.outerthoughts.com/
 LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
 - Time is the quality of nature that keeps events from happening all
 at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
 book)


Re: Using Solr 3.4 running on tomcat7 - very slow search

2012-07-17 Thread Mou
Brian,

Thanks again.
swappiness is set to 60, and from vmstat I can see no swapping is going on.
Also, I am using a Fusion-io SSD for storing my index.

I also used VisualVM, and it shows me that the thread is blocked on
lock=org.apache.lucene.index.SegmentCoreReaders@299172a7.

Any clue?


On Mon, Jul 16, 2012 at 10:38 PM, Bryan Loofbourrow [via Lucene] 
ml-node+s472066n3995452...@n3.nabble.com wrote:

 Another thing you may wish to ponder is this blog entry from Mike
 McCandless:
 http://blog.mikemccandless.com/2011/04/just-say-no-to-swapping.html

 In it, he discusses the poor interaction between OS swapping, and
 long-neglected allocations in a JVM. You're on Linux, which has decent
 control over swapping decisions, so you may find that a tweak is in order,
 especially if you can discover evidence that the hard drive is being
 worked hard during GC. If the problem exists, it might be especially
 pronounced in your large JVM.

 I have no direct evidence of thrashing during GC (I am not sure how to go
 about gathering such evidence), but I have seen, on a Windows machine, a
 Tomcat running Solr refuse to shut down for many minutes, while a Resource
 Monitor session reports that that same Tomcat process is frantically
 reading from the page file the whole time. So there is something besides
 plausibility to the idea.

 -- Bryan

  -Original Message-
  From: Mou [mailto:[hidden email]]

  Sent: Monday, July 16, 2012 9:09 PM
  To: [hidden email]
  Subject: Re: Using Solr 3.4 running on tomcat7 - very slow search
 
  Thanks Brian. Excellent suggestion.
 
  I haven't used VisualVM before but I am going to use it to see where CPU
  is
  going. I saw that CPU is overly used. I haven't seen so much CPU use in
  testing.
  Although I think GC is not a problem, splitting the jvm per shard would
 be
  a good idea.
 
 
  On Mon, Jul 16, 2012 at 9:44 PM, Bryan Loofbourrow [via Lucene] 
   [hidden email]
 wrote:
 
   5 min is ridiculously long for a query that used to take 65ms. That
  ought
   to be a great clue. The only two things I've seen that could cause
 that
   are thrashing, or GC. Hard to see how it could be thrashing, given
 your
   hardware, so I'd initially suspect GC.
  
   Aim VisualVM at the JVM. It shows how much CPU goes to GC over time,
 in

  a
   nice blue line. And if it's not GC, try out its Sampler tab, and see
  where
   the CPU is spending its time.
  
   FWIW, when asked at what point one would want to split JVMs and shard,
  on
   the same machine, Grant Ingersoll mentioned 16GB, and precisely for GC
   cost reasons. You're way above that. Maybe multiple JVMs and sharding,
   even on the same machine, would serve you better than a monster 70GB
  JVM.
  
   -- Bryan
  
-Original Message-
     From: Mou [mailto:[hidden email]]
  
Sent: Monday, July 16, 2012 7:43 PM
     To: [hidden email]
Subject: Using Solr 3.4 running on tomcat7 - very slow search
   
Hi,
   
Our index is divided into two shards and each of them has 120M docs
 ,
total
size 75G in each core.
The server is a pretty good one , jvm is given memory of 70G and
 about
same
is left for OS (SLES 11) .
   
We use all dynamic fields except th eunique id and are using long
   queries
but almost all of them are filter queires, Each query may have 10
 -30
  fq
parameters.
   
When I tested the index ( same size) but with max heap size 40 G,
   queries
  
were blazing fast. I used solrmeter to load test and it was happily
serving
12000 queries or more per min with avg 65 ms qtime.We had an
 excellent
filtercache hit ratio.
   
This index is only used for searching and being replicated every 7
 sec

from
the master.
   
But now in production server it is horribly slow and taking 5
   mins(qtime)
  
to
return a query ( same query).
What could go wrong?
   
Really appreciate your suggestions on debugging this thing..
   
   
   
--
View this message in context:
  http://lucene.472066.n3.nabble.com/Using-
Solr-3-4-running-on-tomcat7-very-slow-search-tp3995436.html
Sent from the Solr - User mailing list archive at Nabble.com.
  
  

Could I use Solr to index multiple applications?

2012-07-17 Thread Zhang, Lisheng
Hi,
 
We have an application where we index data into many different directories
(each directory corresponds to a different Lucene IndexSearcher).
 
Looking at the Solr config, it seems that Solr expects only one indexed
data directory. Can we use Solr for our application?
 
Thanks very much for helps, Lisheng
 


RE: SOLR 4 Alpha Out Of Mem Err

2012-07-17 Thread Nick Koton
After trying a number of things, I have succeeded in letting the server
auto-commit without having it hit thread/memory errors.  I have isolated
the required client change to replacing ConcurrentUpdateSolrServer with
HttpSolrServer.  I am able to maintain index rates of 3,000 documents/sec
with 6 shards and two servers per shard.  The servers receiving the index
requests hit steady state with approximately 800 threads per server.

So could there be something amiss in the server side implementation of
ConcurrentUpdateSolrServer?

Best regards,
Nick

-Original Message-
From: Nick Koton [mailto:nick.ko...@gmail.com] 
Sent: Monday, July 16, 2012 5:53 PM
To: 'solr-user@lucene.apache.org'
Subject: RE: SOLR 4 Alpha Out Of Mem Err

 That suggests you're running out of threads
Michael,
Thanks for this useful observation.  What I found just prior to the problem
situation was literally thousands of threads in the server JVM.  I have
pasted a few samples below obtained from the admin GUI.  I spent some time
today using this barometer, but I don't have enough to share right now.  I'm
looking at the difference between ConcurrentUpdateSolrServer and
HttpSolrServer and how my client may be misusing them.  I'll assume my
client is misbehaving and driving the server crazy for now.  If I figure out
how, I will share it so perhaps a safe guard can be put in place.

Nick


Server threads - very roughly 0.1 %:
cmdDistribExecutor-9-thread-7161 (10096)
java.util.concurrent.SynchronousQueue$TransferStack@17b90c55
.   sun.misc.Unsafe.park(Native Method)
.
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
.
java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(Synchronous
Queue.java:424)
.
java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueu
e.java:323)
.
java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:874)
.
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:945)
.
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:9
07)
.   java.lang.Thread.run(Thread.java:662)
-0.ms
-0.ms cmdDistribExecutor-9-thread-7160 (10086)
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@5509b5
6
.   sun.misc.Unsafe.park(Native Method)
.   java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
.
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(
AbstractQueuedSynchronizer.java:1987)
.
org.apache.http.impl.conn.tsccm.WaitingThread.await(WaitingThread.java:158)
.
org.apache.http.impl.conn.tsccm.ConnPoolByRoute.getEntryBlocking(ConnPoolByR
oute.java:403)
.
org.apache.http.impl.conn.tsccm.ConnPoolByRoute$1.getPoolEntry(ConnPoolByRou
te.java:300)
.
org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager$1.getConnection(
ThreadSafeClientConnManager.java:224)
.
org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDir
ector.java:401)
.
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.ja
va:820)
.
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.ja
va:754)
.
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.ja
va:732)
.
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java
:351)
.
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java
:182)
.
org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:325
)
.
org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:306
)
.   java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
.   java.util.concurrent.FutureTask.run(FutureTask.java:138)
.
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
.   java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
.   java.util.concurrent.FutureTask.run(FutureTask.java:138)
.
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.ja
va:886)
.
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:9
08)
.   java.lang.Thread.run(Thread.java:662)
20.ms
20.ms cmdDistribExecutor-9-thread-7159 (10085)
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@6f062d
d3
.   sun.misc.Unsafe.park(Native Method)
.   java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
.
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(
AbstractQueuedSynchronizer.java:1987)
.
org.apache.http.impl.conn.tsccm.WaitingThread.await(WaitingThread.java:158)
.
org.apache.http.impl.conn.tsccm.ConnPoolByRoute.getEntryBlocking(ConnPoolByR
oute.java:403)
.
org.apache.http.impl.conn.tsccm.ConnPoolByRoute$1.getPoolEntry(ConnPoolByRou
te.java:300)
.
org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager$1.getConnection(
ThreadSafeClientConnManager.java:224)
.
org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDir
ector.java:401)
.

Solr 4.0 ALPHA: AbstractSolrTestCase depending on LuceneTestCase

2012-07-17 Thread Koorosh Vakhshoori
Hi,
  I have been developing extensions to SOLR code using the 4.0 trunk. For JUnit
testing I am extending AbstractSolrTestCase, which in the ALPHA release is
located in the JAR apache-solr-test-framework-4.0.0-ALPHA.jar. However, this
class extends LuceneTestCase, which comes from the JAR
lucene-test-framework-4.0-SNAPSHOT.jar. In the ALPHA release the latter JAR
is not shipped, or I can't find it. My question is: which class should I use
for testing customized code/extensions to SOLR/LUCENE? Is there a better way
of doing this without building lucene-test-framework-4.0-SNAPSHOT.jar from
the source code?

Thanks,

Koorosh


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-4-0-ALPHA-AbstractSolrTestCase-depending-on-LuceneTestCase-tp3995639.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Could I use Solr to index multiple applications?

2012-07-17 Thread Shashi Kant
Look up multicore solr. Another choice could be ElasticSearch - which
is more straightforward in managing multiple indexes IMO.



On Tue, Jul 17, 2012 at 7:53 PM, Zhang, Lisheng
lisheng.zh...@broadvision.com wrote:
 Hi,

 We have an application where we index data into many different directories 
 (each directory
 is corresponding to a different lucene IndexSearcher).

 Looking at Solr config it seems that Solr expects there is only one indexed 
 data directory,
 can we use Solr for our application?

 Thanks very much for helps, Lisheng



RE: Could I use Solr to index multiple applications?

2012-07-17 Thread Zhang, Lisheng
Thanks very much for the quick help! Multicore sounds interesting.
I skimmed the doc; it seems we need to put each core name into the
Solr config XML. If we add another core and change the XML, do we
need to restart Solr?

Best regards, Lisheng

-Original Message-
From: shashi@gmail.com [mailto:shashi@gmail.com]On Behalf Of
Shashi Kant
Sent: Tuesday, July 17, 2012 5:46 PM
To: solr-user@lucene.apache.org
Subject: Re: Could I use Solr to index multiple applications?


Look up multicore solr. Another choice could be ElasticSearch - which
is more straightforward in managing multiple indexes IMO.



On Tue, Jul 17, 2012 at 7:53 PM, Zhang, Lisheng
lisheng.zh...@broadvision.com wrote:
 Hi,

 We have an application where we index data into many different directories 
 (each directory
 is corresponding to a different lucene IndexSearcher).

 Looking at Solr config it seems that Solr expects there is only one indexed 
 data directory,
 can we use Solr for our application?

 Thanks very much for helps, Lisheng



Re: Could I use Solr to index multiple applications?

2012-07-17 Thread Shashi Kant
My suggestion would be to look into Multi Tenancy: http://www.elasticsearch.org/.
It is easy to set up and use for multiple indexes.


On Tue, Jul 17, 2012 at 9:26 PM, Zhang, Lisheng
lisheng.zh...@broadvision.com wrote:
 Thanks very much for quick help! Multicore sounds interesting,
 I roughly read the doc, so we need to put each core name into
 Solr config XML, if we add another core and change XML, do we
 need to restart Solr?

 Best regards, Lisheng

 -Original Message-
 From: shashi@gmail.com [mailto:shashi@gmail.com]On Behalf Of
 Shashi Kant
 Sent: Tuesday, July 17, 2012 5:46 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Could I use Solr to index multiple applications?


 Look up multicore solr. Another choice could be ElasticSearch - which
 is more straightforward in managing multiple indexes IMO.



 On Tue, Jul 17, 2012 at 7:53 PM, Zhang, Lisheng
 lisheng.zh...@broadvision.com wrote:
 Hi,

 We have an application where we index data into many different directories 
 (each directory
 is corresponding to a different lucene IndexSearcher).

 Looking at Solr config it seems that Solr expects there is only one indexed 
 data directory,
 can we use Solr for our application?

 Thanks very much for helps, Lisheng



Re: SOLR 4 Alpha Out Of Mem Err

2012-07-17 Thread Mark Miller

On Jul 17, 2012, at 8:08 PM, Nick Koton wrote:

 So could there be something amiss in the server side implementation of
 ConcurrentUpdateSolrServer?

See my earlier email. Once we decide on the appropriate change, we will address 
it.

- Mark Miller
lucidimagination.com













Re: Could I use Solr to index multiple applications?

2012-07-17 Thread Yury Kats
On 7/17/2012 9:26 PM, Zhang, Lisheng wrote:
 Thanks very much for quick help! Multicore sounds interesting,
 I roughly read the doc, so we need to put each core name into
 Solr config XML, if we add another core and change XML, do we
 need to restart Solr?

You can add/create cores on the fly, without restarting.
See http://wiki.apache.org/solr/CoreAdmin#CREATE
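
For illustration, a CREATE call looks roughly like this (core name and instanceDir
are placeholders, and the instanceDir with its conf/ must already exist on disk):

http://localhost:8983/solr/admin/cores?action=CREATE&name=core2&instanceDir=core2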


RE: Using Solr 3.4 running on tomcat7 - very slow search

2012-07-17 Thread Fuad Efendi

 FWIW, when asked at what point one would want to split JVMs and shard, 
 on the same machine, Grant Ingersoll mentioned 16GB, and precisely for 
 GC cost reasons. You're way above that.

- his index is 75G, and Grant mentioned RAM heap size; we can use terabytes
of index with 16Gb memory.







UTF-8

2012-07-17 Thread William Bell
-Dfile.encoding=UTF-8... Is this usually recommended for SOLR indexes?

Or is the encoding usually just handled by the servlet container like Jetty?

-- 
Bill Bell
billnb...@gmail.com
cell 720-256-8076


Re: Facet on all the dynamic fields with *_s feature

2012-07-17 Thread Rajani Maski
Hi Users,

  Any reply for the query below?


On Mon, Jul 16, 2012 at 6:27 PM, Rajani Maski rajinima...@gmail.com wrote:

 In this URL  -  https://issues.apache.org/jira/browse/SOLR-247

 there are *patches*, and one patch named *SOLR-247-FacetAllFields*.

 Will that help me fix this problem?

 If yes, how do I add this as a Solr plugin?


 Thanks & Regards
 Rajani




 On Mon, Jul 16, 2012 at 5:04 PM, Darren Govoni dar...@ontrenet.comwrote:

 You'll have to query the index for the fields and sift out the _s ones
 and cache them or something.

 On Mon, 2012-07-16 at 16:52 +0530, Rajani Maski wrote:

  Yes, This feature will solve the below problem very neatly.
 
  All,
 
   Is there any approach to achieve this for now?
 
 
  --Rajani
 
  On Sun, Jul 15, 2012 at 6:02 PM, Jack Krupansky 
 j...@basetechnology.comwrote:
 
   The answer appears to be No, but it's good to hear people express an
   interest in proposed features.
  
   -- Jack Krupansky
  
   -Original Message- From: Rajani Maski
   Sent: Sunday, July 15, 2012 12:02 AM
   To: solr-user@lucene.apache.org
   Subject: Facet on all the dynamic fields with *_s feature
  
  
   Hi All,
  
 Is this issue fixed in solr 3.6 or 4.0:  Faceting on all Dynamic
 field
   with facet.field=*_s
  
  Link: https://issues.apache.org/jira/browse/SOLR-247
  
  
  
If it is not fixed, any suggestion on how do I achieve this?
  
  
   My requirement is just same as this one :
    http://lucene.472066.n3.nabble.com/Dynamic-facet-field-tc2979407.html#none
 
  
  
   Regards
   Rajani
  






Re: configuring solr3.6 for a large intensive index only run

2012-07-17 Thread nanshi
1) In solrconfig.xml, find ramBufferSizeMB and change it to:
 <ramBufferSizeMB>1024</ramBufferSizeMB>

2) Also, try decreasing the mergeFactor to see if it gives you fewer
segments (see the sketch below). In my experiments, it does.
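
A sketch of where those settings live in a 3.x solrconfig.xml (values are only
illustrative):

<indexDefaults>
  <ramBufferSizeMB>1024</ramBufferSizeMB>
  <mergeFactor>10</mergeFactor>
</indexDefaults>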


--
View this message in context: 
http://lucene.472066.n3.nabble.com/configuring-solr3-6-for-a-large-intensive-index-only-run-tp3985733p3995659.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: UTF-8

2012-07-17 Thread Paul Libbrecht

My experience is that this property has made a whole lot of difference, at 
least up to Solr 3.1.
The servlet container has not been the only piece involved.

paul

Le 18 juil. 2012 à 05:12, William Bell a écrit :

 -Dfile.encoding=UTF-8... Is this usually recommended for SOLR indexes?
 
 Or is the encoding usually just handled by the servlet container like Jetty?
 
 -- 
 Bill Bell
 billnb...@gmail.com
 cell 720-256-8076