data directory under solr

2008-07-23 Thread dudes dudes

Hi all, 

I have changed the data directory from /solr/data to /var/solr/data. The 
directory is created once I start Solr from Tomcat. However, I'm a bit 
confused when it comes to indexing data to /var/solr/data/index. 

NB: I'm using the post script (post.sh) that is provided by solr itself. 

What do I have to change URL=http://localhost:8080/solr/update in post.sh to?

thanks for any suggestions and thoughts.
ak


Re: Out of memory on Solr sorting

2008-07-23 Thread Norberto Meijome
On Tue, 22 Jul 2008 20:19:49 +
sundar shankar <[EMAIL PROTECTED]> wrote:

> Thanks for the explanation, Mark. The reason I had it as 512 max was because 
> earlier the data file was just about 30 megs, and it increased to this much 
> because of the usage of EdgeNGramFilterFactory for 2 fields. That's great to 
> know it just happens for the first search. But this exception has been 
> occurring for me for the whole of today. Should I fiddle around with the 
> warmer settings too? I have also instructed an increase in heap to 1024. 
> Will keep you posted on the turnaround.

have you tried reducing the number of documents (leaving every setting stable, 
of course) to see at what point you are safe? It may tell you something 
about the relationship between the # of *your* docs and the memory needed.

b

_
{Beto|Norberto|Numard} Meijome

"A tyrant...is always stirring up some war or other, in order that the people 
may require a leader."
  Plato

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: data directory under solr

2008-07-23 Thread Noble Paul നോബിള്‍ नोब्ळ्
The URL remains the same irrespective of the data directory.
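
For reference, a sketch of the relevant solrconfig.xml entry (the path is the
one from the question; <dataDir> is the standard element):

  <dataDir>/var/solr/data</dataDir>

The URL=http://localhost:8080/solr/update line in post.sh stays exactly as it
is: /update is resolved by the webapp, independently of where the data
directory lives.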





-- 
--Noble Paul


RE: data directory under solr

2008-07-23 Thread dudes dudes

Noble, 
thanks very much, just tested it and it works fine :)

ak

> Date: Wed, 23 Jul 2008 12:48:52 +0530
> From: [EMAIL PROTECTED]
> To: solr-user@lucene.apache.org
> Subject: Re: data directory under solr
> 
> The URL remains the same irrespective of the data directory.


Re: Vote on a new solr logo

2008-07-23 Thread Shalin Shekhar Mangar
54 votes and counting! Let's give it one more day and close it tomorrow
(July 24, 2008).

Preliminary results may bias the poll :)

On Wed, Jul 23, 2008 at 5:03 AM, Chris Harris <[EMAIL PROTECTED]> wrote:

> How about releasing the preliminary results so we can see if a run-off
> is in order!
>
> On Tue, Jul 22, 2008 at 6:37 AM, Mark Miller <[EMAIL PROTECTED]>
> wrote:
> > My opinion: if it's already a runaway, we might as well not prolong things.
> > If not though, we should probably give some time for any possible laggards.
> > The 'admin look' poll received its first 19-20 votes in the first night /
> > morning, and has only gotten 2 or 3 since then, so probably no use going
> > too long.
> >
> > - Mark
> >
> > Shalin Shekhar Mangar wrote:
> >>
> >> 28 votes so far and counting!
> >>
> >> When should we close this poll?
>



-- 
Regards,
Shalin Shekhar Mangar.


RE: Out of memory on Solr sorting

2008-07-23 Thread Daniel Alheiros
Hi

I haven't read the whole thread so I will take my chances here.

I've been fighting recently to keep my Solr instances stable because they
were frequently crashing with OutOfMemoryErrors. I'm using Solr 1.2, and when
that happens there is a bug that leaves the index locked unless you restart
Solr... so in my scenario it was extremely damaging.

After some profiling I realized that my major problem was caused by the way
the JVM heap was being used, as I hadn't configured it with any advanced
options (I had just made it bigger: Xmx and Xms 1.5 GB). It's running on Sun
JVM 1.5 (the most recent 1.5 available) and is deployed on JBoss 4.2 on RHEL.

My finding was that too many objects were being allocated in the old
generation area of the heap, which makes them harder to dispose of, and that
the default behaviour was letting the heap fill up too far before kicking off
a GC. According to the JVM specs, the default is that if, shortly after a
full GC has executed, a certain percentage of the heap has not been freed, an
OutOfMemoryError should be thrown.

I've changed my JVM startup params and it has been extremely stable since
then:

-Xmx2048m -Xms2048m -XX:MinHeapFreeRatio=50 -XX:NewSize=1024m
-XX:NewRatio=2 -Dsun.rmi.dgc.client.gcInterval=360
-Dsun.rmi.dgc.server.gcInterval=360

I hope it helps.

Regards,
Daniel Alheiros

-Original Message-
From: Fuad Efendi [mailto:[EMAIL PROTECTED] 
Sent: 22 July 2008 23:23
To: solr-user@lucene.apache.org
Subject: RE: Out of memory on Solr sorting

Yes, it is a cache: it stores an array of document IDs sorted by the "sort
field", together with the sorted field values; query results can intersect
with it and be reordered accordingly.

But memory requirements should be well documented.

Internally it uses a WeakHashMap, which is not good(!!!) - a lot of
"underground" warming up of caches which SOLR is not aware of...  
Could be.

I think Lucene-SOLR developers should join this discussion:


/**
 * Expert: The default cache implementation, storing all values in memory.
 * A WeakHashMap is used for storage.
 */
..

  // inherit javadocs
  public StringIndex getStringIndex(IndexReader reader, String field)
      throws IOException {
    return (StringIndex) stringsIndexCache.get(reader, field);
  }

  Cache stringsIndexCache = new Cache() {

    protected Object createValue(IndexReader reader, Object fieldKey)
        throws IOException {
      String field = ((String) fieldKey).intern();
      final int[] retArray = new int[reader.maxDoc()];
      String[] mterms = new String[reader.maxDoc()+1];
      TermDocs termDocs = reader.termDocs();
      TermEnum termEnum = reader.terms(new Term(field, ""));
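
To make the quoted numbers concrete, a back-of-envelope sketch of what the
int[]/String[] pair above costs per sort field (assumptions: one untokenized
String sort field, worst case of all values unique, sizes taken from Fuad's
example quoted below):

  // rough FieldCache StringIndex footprint; illustrative only
  public class SortCacheEstimate {
    public static void main(String[] args) {
      long maxDoc = 2000000L;                // documents, from the example below
      long avgValue = 256L;                  // bytes per unique value (assumed)
      long ords = maxDoc * 4L;               // int[] retArray: one 4-byte ord per doc
      long values = (maxDoc + 1) * avgValue; // String[] mterms, worst case all unique
      // prints 520000256, in line with the ~512 000 000 bytes quoted below
      System.out.println((ords + values) + " bytes per Searcher per sort field");
    }
  }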






Quoting Fuad Efendi <[EMAIL PROTECTED]>:

> I am hoping [new StringIndex (retArray, mterms)] is called only once
> per-sort-field and cached somewhere in Lucene;
>
> theoretically you need to multiply the number of documents by the size of
> the field (supposing that the field contains unique text); you need not
> tokenize this field; you need not store a TermVector.
>
> for 2 000 000 documents with a simple untokenized text field such as the
> title of a book (256 bytes) you probably need 512 000 000 bytes per
> Searcher, and as Mark mentioned you should limit the number of searchers
> in SOLR.
>
> So Xmx512M is definitely not enough even for simple cases...
>
>
> Quoting sundar shankar <[EMAIL PROTECTED]>:
>
>> I haven't seen the source code before, but I don't know why the sorting
>> isn't done after the fetch is done. Wouldn't that make it faster, at least
>> in the case of field-level sorting? I could be wrong and the implementation
>> might well be better as it is. But I don't know why all of the fields have
>> had to be loaded.
>>
>>
>>
>>
>>
>>> Date: Tue, 22 Jul 2008 14:26:26 -0700
>>> From: [EMAIL PROTECTED]
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Out of memory on Solr sorting
>>>
>>> Ok, after some analysis of FieldCacheImpl:
>>>
>>> - it is supposed that the (sorted) Enumeration of "terms" is smaller than
>>> the total number of documents (so that SOLR uses a specific field type
>>> for sorted searches: solr.StrField with omitNorms="true").
>>>
>>> It creates an int[reader.maxDoc()] array, checks the (sorted) Enumeration
>>> of "terms" (untokenized solr.StrField), and populates the array with
>>> document Ids.
>>>
>>> - it also creates an array of String:
>>> String[] mterms = new String[reader.maxDoc()+1];
>>>
>>> Why do we need that? For 1G documents with an average term/StrField size
>>> of 100 bytes (which could be unique text!!!) it will create a kind of
>>> huge 100Gb cache which is not really needed...
>>> StringIndex value = new StringIndex(retArray, mterms);
>>>
>>> If I understand correctly... StringIndex _must_ be a file in a
>>> filesystem for such a case... We create the StringIndex, and retrieve
>>> the top 10 documents; a huge overhead.

> Quoting Fuad E

Re: spellchecker problems (bugs)

2008-07-23 Thread Jonathan Lee
I ran into a similar issue and found that I am able to get around it by:

1. Similar to what https://issues.apache.org/jira/browse/SOLR-622 will do,
issue a spellcheck.reload=true command on the firstSearcher event to read
any existing index off disk. Here are the relevant parts of my
solrconfig.xml:

  <listener event="firstSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <lst>
        <str name="qt">myHandler</str>
        <str name="q">*:*</str>
        <str name="rows">0</str>
        <str name="spellcheck">true</str>
        <str name="spellcheck.reload">true</str>
      </lst>
    </arr>
  </listener>

  <requestHandler name="myHandler" class="solr.SearchHandler">
    ...
    <arr name="last-components">
      <str>spellcheck</str>
    </arr>
  </requestHandler>

  <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
    <lst name="spellchecker">
      <str name="name">default</str>
      <str name="field">name_spell</str>
      ...
    </lst>
    <str name="queryAnalyzerFieldType">text_spell</str>
  </searchComponent>

2. I believe there is a bug in IndexBased- and FileBasedSpellChecker.java
where the analyzer variable is only set on the build command. Therefore,
when the index is reloaded, but not built after starting solr, issuing a
query with the spellcheck.q parameter will cause a NullPointerException to
be thrown (SpellCheckComponent.java:158). Moving the analyzer logic to the
constructor seems to fix the problem.

I did not see a jira ticket for this (nor am I sure it's a real bug :), so I
have attached a patch with these changes. Please let me know if I have
overlooked something here and if I should attach this to an actual ticket.

-Jonathan


> From: Geoffrey Young <[EMAIL PROTECTED]>
> Date: Tue, 22 Jul 2008 11:07:41 -0400
> Subject: Re: spellchecker problems (bugs)
> 
> 
> 
> Shalin Shekhar Mangar wrote:
>> The problems you described in the spellchecker are noted in
>> https://issues.apache.org/jira/browse/SOLR-622 -- I shall create an issue to
>> synchronize spellcheck.build so that the index is not corrupted.
> 
> I'd like to discuss this a little...
> 
> I'm not sure that I want to rebuild the spelling index each time the
> underlying data index changes - the process takes very long and my
> updates are frequent changes to non-spelling related data.
> 
> what I'd really like is for a change to my index to not cause an
> exception.  IIRC the "old" way of using a spellchecker didn't work like
> this at all - I could completely rm data/index and leave data/spell in
> place, add new data, not issue cmd=build and the spelling parts still
> worked just fine (albeit with old data).
> 
> not to say that SOLR-622 isn't a good idea (it is) but I don't really
> think the entire solution is keeping the spellcheck index in sync.  do
> they need to be kept in sync for things not to implode on me?
> 
> --Geoff



spell-checker and faceting

2008-07-23 Thread dudes dudes

Hi, 

I'm trying to couple the spell-checking mechanism with faceting in one URL... 
I can get the spell check right, but the facet doesn't work when it's 
combined with the spell-checker... 

http://localhost:8080/solr/spellCheckCompRH?q=smath&spellcheck.q=smath&spellcheck=true&spellcheck.build=true&select?q=smath&rows=0&facet=true&facet.limit=1&facet.field=firstname
 

it corrects smath to Smith, but doesn't facet it.

thanks for your time
ak

Re: spellchecker problems (bugs)

2008-07-23 Thread Geoffrey Young



> 2. I believe there is a bug in IndexBased- and FileBasedSpellChecker.java
> where the analyzer variable is only set on the build command. Therefore,
> when the index is reloaded, but not built after starting solr, issuing a
> query with the spellcheck.q parameter will cause a NullPointerException to
> be thrown (SpellCheckComponent.java:158). Moving the analyzer logic to the
> constructor seems to fix the problem.
>
> I did not see a jira ticket for this (nor am I sure it's a real bug :), so I
> have attached a patch with these changes. Please let me know if I have
> overlooked something here and if I should attach this to an actual ticket.


I don't see a patch, but I was just about to reply to the previous post 
in this thread that I thought a new jira issue was warranted - I see the 
exact same exception in line 158 under the exact circumstances, using 
current trunk as of this minute.


as far as I can tell, 622 would help... if I wanted to build on each 
start, which I may not.  the exception seems like a different issue that 
should be separately tracked.


--Geoff


Re: spell-checker and faceting

2008-07-23 Thread Geoffrey Young



dudes dudes wrote:
> Hi,
>
> I'm trying to couple the spell-checking mechanism with faceting in one
> URL... I can get the spell check right, but the facet doesn't work when
> it's combined with the spell-checker...
>
> http://localhost:8080/solr/spellCheckCompRH?q=smath&spellcheck.q=smath&spellcheck=true&spellcheck.build=true&select?q=smath&rows=0&facet=true&facet.limit=1&facet.field=firstname
>
> it corrects smath to Smith, but doesn't facet it.


I was able to get faceting working without issue. It seems to me your query 
string is off - note the 'select?q=smath' in the middle of your query. I'd 
try again with that part removed.
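
With that fragment dropped, the request would look something like this (an
untested sketch; everything else is kept from the original URL):

http://localhost:8080/solr/spellCheckCompRH?q=smath&spellcheck.q=smath&spellcheck=true&spellcheck.build=true&rows=0&facet=true&facet.limit=1&facet.field=firstname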


Also note you only need spellcheck.build=true once, not on each request.

--Geoff


Re: spellchecker problems (bugs)

2008-07-23 Thread Jonathan Lee
I don't see the patch attached to my original email either -- does solr-user
not allow attachments?

This is ugly, but here's the patch inline:

Index: src/test/org/apache/solr/spelling/FileBasedSpellCheckerTest.java
===================================================================
--- src/test/org/apache/solr/spelling/FileBasedSpellCheckerTest.java   (revision 679057)
+++ src/test/org/apache/solr/spelling/FileBasedSpellCheckerTest.java   (working copy)
@@ -70,7 +70,7 @@
     indexDir.mkdirs();
     spellchecker.add(FileBasedSpellChecker.INDEX_DIR, indexDir.getAbsolutePath());
     SolrCore core = h.getCore();
-    String dictName = checker.init(spellchecker, core.getResourceLoader());
+    String dictName = checker.init(spellchecker, core);
     assertTrue(dictName + " is not equal to " + "external", dictName.equals("external") == true);
     checker.build(core, null);
 
@@ -108,7 +108,7 @@
     spellchecker.add(FileBasedSpellChecker.FIELD_TYPE, "teststop");
     spellchecker.add(AbstractLuceneSpellChecker.SPELLCHECKER_ARG_NAME, spellchecker);
     SolrCore core = h.getCore();
-    String dictName = checker.init(spellchecker, core.getResourceLoader());
+    String dictName = checker.init(spellchecker, core);
     assertTrue(dictName + " is not equal to " + "external", dictName.equals("external") == true);
     checker.build(core, null);
 
@@ -149,7 +149,7 @@
     spellchecker.add(AbstractLuceneSpellChecker.SPELLCHECKER_ARG_NAME, spellchecker);
 
     SolrCore core = h.getCore();
-    String dictName = checker.init(spellchecker, core.getResourceLoader());
+    String dictName = checker.init(spellchecker, core);
     assertTrue(dictName + " is not equal to " + "external", dictName.equals("external") == true);
     checker.build(core, null);
 
Index: src/test/org/apache/solr/spelling/IndexBasedSpellCheckerTest.java
===================================================================
--- src/test/org/apache/solr/spelling/IndexBasedSpellCheckerTest.java  (revision 679057)
+++ src/test/org/apache/solr/spelling/IndexBasedSpellCheckerTest.java  (working copy)
@@ -104,7 +104,7 @@
     spellchecker.add(AbstractLuceneSpellChecker.SPELLCHECKER_ARG_NAME, spellchecker);
     SolrCore core = h.getCore();
 
-    String dictName = checker.init(spellchecker, core.getResourceLoader());
+    String dictName = checker.init(spellchecker, core);
     assertTrue(dictName + " is not equal to " + SolrSpellChecker.DEFAULT_DICTIONARY_NAME,
         dictName.equals(SolrSpellChecker.DEFAULT_DICTIONARY_NAME) == true);
     RefCounted<SolrIndexSearcher> holder = core.getSearcher();
@@ -177,7 +177,7 @@
     spellchecker.add(IndexBasedSpellChecker.FIELD, "title");
     spellchecker.add(AbstractLuceneSpellChecker.SPELLCHECKER_ARG_NAME, spellchecker);
     SolrCore core = h.getCore();
-    String dictName = checker.init(spellchecker, core.getResourceLoader());
+    String dictName = checker.init(spellchecker, core);
     assertTrue(dictName + " is not equal to " + SolrSpellChecker.DEFAULT_DICTIONARY_NAME,
         dictName.equals(SolrSpellChecker.DEFAULT_DICTIONARY_NAME) == true);
     RefCounted<SolrIndexSearcher> holder = core.getSearcher();
@@ -233,7 +233,7 @@
     spellchecker.add(AbstractLuceneSpellChecker.SPELLCHECKER_ARG_NAME, spellchecker);
     spellchecker.add(AbstractLuceneSpellChecker.STRING_DISTANCE, JaroWinklerDistance.class.getName());
     SolrCore core = h.getCore();
-    String dictName = checker.init(spellchecker, core.getResourceLoader());
+    String dictName = checker.init(spellchecker, core);
     assertTrue(dictName + " is not equal to " + SolrSpellChecker.DEFAULT_DICTIONARY_NAME,
         dictName.equals(SolrSpellChecker.DEFAULT_DICTIONARY_NAME) == true);
     RefCounted<SolrIndexSearcher> holder = core.getSearcher();
@@ -283,7 +283,7 @@
     spellchecker.add(IndexBasedSpellChecker.FIELD, "title");
     spellchecker.add(AbstractLuceneSpellChecker.SPELLCHECKER_ARG_NAME, spellchecker);
     SolrCore core = h.getCore();
-    String dictName = checker.init(spellchecker, core.getResourceLoader());
+    String dictName = checker.init(spellchecker, core);
     assertTrue(dictName + " is not equal to " + SolrSpellChecker.DEFAULT_DICTIONARY_NAME,
         dictName.equals(SolrSpellChecker.DEFAULT_DICTIONARY_NAME) == true);
     RefCounted<SolrIndexSearcher> holder = core.getSearcher();
Index: src/java/org/apache/solr/handler/component/SpellCheckComponent.java
===================================================================
--- src/java/org/apache/solr/handler/component/SpellCheckComponent.java        (revision 679057)
+++ src/java/org/apache/solr/handler/component/SpellCheckComponent.java        (working copy)
@@ -243,10 +243,9 @@
       String className = (String) spellchecker.get("classname");
       if (className == null)
         className = IndexBasedSpellChecker.class.getName();
-      SolrResourceLoader loader = core.getResourceLoader();
-      SolrSpellChecker checker = (SolrSpellChecker) loader.newInstance(className);
+      SolrSpellChecker checker = (SolrSpellChecke

Re: spellchecker problems (bugs)

2008-07-23 Thread Geoffrey Young



Jonathan Lee wrote:

> I don't see the patch attached to my original email either -- does solr-user
> not allow attachments?
>
> This is ugly, but here's the patch inline:


issue created in jira:

  https://issues.apache.org/jira/browse/SOLR-648

--Geoff


performance implications on using lots of values in fq

2008-07-23 Thread briand

I have documents in SOLR such that each document contains one to many points
(latitude/longitude pairs). Currently we store the multiple points for a
given document in the db and query the db to find all of the document ids
around a given point first. Once we have the list of ids, we populate the fq
with those ids and the q value and send that off to SOLR to do a search. In
the "longest" query to SOLR we're populating about 450 ids into the fq
parameter at this time. I was wondering if anyone knows the performance
implications of passing so many ids into the fq, and when it would
potentially become a problem for SOLR? Currently the query passing in 450
ids is not a problem at all and returns in less than a second. Thanks.
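
For context, such a filter presumably has the shape (field name "id" assumed,
values illustrative):

  fq=id:(482 OR 517 OR 9001 OR ...)

One hard ceiling to be aware of is the boolean clause limit Solr exposes in
solrconfig.xml; it defaults to 1024, so roughly 450 clauses still fits well
under it:

  <maxBooleanClauses>1024</maxBooleanClauses>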



Adding the Lucene org.apache.lucene.analysis.compound.HyphenationCompoundWordTokenFilter to solr for german compound words

2008-07-23 Thread Barry Harding
Hi, can anybody point me in the right direction on how I go about adding the

org.apache.lucene.analysis.compound.HyphenationCompoundWordTokenFilter

token filter to the Solr schema.xml?

I need to be able to break German compound words, and from what I have read
this token filter would seem to be what I need to use. My question is how do
I configure SOLR to use this filter with text field types.

Is it possible to just call it directly from the config file, or do I need
to wrap it in a custom class in some way?

Thanks

Barry H




Re: Adding the Lucene org.apache.lucene.analysis.compound.HyphenationCompoundWordTokenFilter to solr for german compound words

2008-07-23 Thread Grant Ingersoll

See http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters

Essentially, you need to create a TokenFilterFactory that wraps it.   
Please feel free to donate it, too, if you can.


-Grant

On Jul 23, 2008, at 2:42 PM, Barry Harding wrote:



--
Grant Ingersoll
http://www.lucidimagination.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ

Re: Adding the Lucene org.apache.lucene.analysis.compound.HyphenationCompoundWordTokenFilter to solr for german compound words

2008-07-23 Thread Chris Hostetter

FYI: In general we try to make sure that whenever possible we have a Factory
for any TokenFilter or Tokenizer that ships with Lucene-Core or the Lucene
Analysis contrib ... we have a stub-analysis-factory-maker.pl script that
automates this in most cases, and requires a small amount of coding for
others -- but in some cases there is no easy way to create a "generic"
factory for a TokenFilter. HyphenationCompoundWordTokenFilter is an example
of this, because it requires a HyphenationTree to construct it, and
HyphenationTree is a fairly complicated class that didn't lend itself to an
easy XML configuration for construction.

But if you have a specific HyphenationTree instance you want to use, you can
hardcode that into a custom TokenFilterFactory.

*BUT* before you do that, consider whether or not the 
DictionaryCompoundWordTokenFilter will meet your needs -- there is already 
a Solr Factory checked in for that.
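
For anyone who does go the hardcoded route, here is a rough sketch of such a
factory (assumptions, not checked-in code: the "hyphenator" attribute name,
the grammar file, and the tiny inline dictionary are all placeholders; the
contrib-analyzers jar must be on the classpath):

  package com.example.analysis; // hypothetical package

  import java.util.Map;

  import org.apache.lucene.analysis.TokenStream;
  import org.apache.lucene.analysis.compound.HyphenationCompoundWordTokenFilter;
  import org.apache.lucene.analysis.compound.hyphenation.HyphenationTree;
  import org.apache.solr.analysis.BaseTokenFilterFactory;

  public class HyphenationCompoundWordFilterFactory extends BaseTokenFilterFactory {
    private HyphenationTree hyphenator;
    private String[] dictionary;

    public void init(Map<String, String> args) {
      super.init(args);
      try {
        // load a TeX-style hyphenation grammar, e.g. a German one from the
        // OFFO project (file name and attribute name are assumptions)
        hyphenator = HyphenationCompoundWordTokenFilter
            .getHyphenationTree(args.get("hyphenator"));
      } catch (Exception e) {
        throw new RuntimeException("could not load hyphenation grammar", e);
      }
      // known word parts used to validate candidate subwords (placeholders)
      dictionary = new String[] { "fuss", "ball", "pumpe" };
    }

    public TokenStream create(TokenStream input) {
      return new HyphenationCompoundWordTokenFilter(input, hyphenator, dictionary);
    }
  }

It would then be referenced from schema.xml like any other filter, e.g.
<filter class="com.example.analysis.HyphenationCompoundWordFilterFactory"
hyphenator="de_hyph.xml"/>.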




-Hoss



changing fileds name

2008-07-23 Thread anshuljohri

Hi,

I need to change the field names in schema.xml, e.g. the default names are
id, sku, name, text etc. But I want to use my own names instead of these.
Let's say I use title, desc, sub, cat respectively. Then where do I have to
put my changes? I see that these default names are also used in
solrconfig.xml in many places.

I tried a lot but couldn't do all the changes. Can anyone please guide me?
As I am new to Solr, please help me. Is there a separate tutorial for this
that can guide me?

Thanks,
Anshul Johri
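
For illustration, the renames themselves live in schema.xml; a sketch using
the mapping from the question (the types and attributes below are placeholder
values in the style of the example schema, not requirements):

  <field name="title" type="string" indexed="true" stored="true"/> <!-- was id -->
  <field name="desc" type="text" indexed="true" stored="true"/> <!-- was sku -->
  <field name="sub" type="text" indexed="true" stored="true"/> <!-- was name -->
  <field name="cat" type="text" indexed="true" stored="true"/> <!-- was text -->

  <uniqueKey>title</uniqueKey>
  <defaultSearchField>cat</defaultSearchField>

Every place in solrconfig.xml that mentions the old names (the sample queries
in the firstSearcher/newSearcher listeners, the dismax handler's qf/fl lists,
the admin <defaultQuery>) then has to be updated to match, or removed.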



RE: Out of memory on Solr sorting

2008-07-23 Thread sundar shankar

Hi Daniel,
     I am afraid that didn't solve my problem. I was guessing my problem was
that I have too much data and too little memory allocated for it. I happened
to read a couple of posts which mentioned that I need a VM that is close to
the size of my data (folder). I have about 540 megs now and a little more
than a million and a half docs. Ideally, in that case, 512 megs should be
enough for me. In fact I am able to perform all other operations now
(commit, optimize, select, update, nightly cron jobs to index data again,
etc.) with no hassles. Even my load tests perform very well. Just the sort
doesn't seem to work. I have allocated 2 gigs of memory now. Still the same
results. I used the GC params you gave me too. No change whatsoever. I'm not
sure what's going on. Is there something I can do to find out how much is
actually needed, as my production server might need to be configured
accordingly?

I don't store any documents. We basically fetch standard column data from an
Oracle database and store it in Solr fields. Before I had EdgeNGram
configured and had Solr 1.2, my data size was less than half of what it is
right now. I guess, if I remember right, it was of the order of 100 megs.
The max size of a field right now might not cross 100 chars either. Puzzled
even more now.

-Sundar

P.S. My configuration:
Solr 1.3
Red Hat
540 megs of data (1855013 docs)
2 gigs of memory installed and allocated like this:
JAVA_OPTS=$JAVA_OPTS -Xms2048m -Xmx2048m -XX:MinHeapFreeRatio=50 
-XX:NewSize=1024m -XX:NewRatio=2 -Dsun.rmi.dgc.client.gcInterval=360 
-Dsun.rmi.dgc.server.gcInterval=360

Jboss 4.05

