Error in Integrating JBoss 4.2 and Solr-1.3.0:
I am trying to integrate JBoss and Solr (multicore). To get started, I am trying to deploy a single instance of Solr.

1) I edited C:/jboss/jboss-4.2.1.GA/server/default/conf/jboss-service.xml and entered the following details:

http://www.w3.org/2001/XMLSchema-instance"; xmlns:jndi="urn:jboss:jndi-binding-service:1.0" xs:schemaLocation="urn:jboss:jndi-binding-service:1.0 resource:jndi-binding-service_1_0.xsd" > C:\apache-solr-1.3.0\example\solr

2) Copied the war file from the Solr distribution, renamed it to solr.zip, unzipped it, made the following changes, bundled it back into a zip and then into a war, and placed it in the default/deploy folder of JBoss.

a) Edited web.xml and added the following just before the closing tag: solr/home java.lang.String

b) Created a jboss-web.xml file inside the WEB-INF folder. The file contains: solr solr/home /solr/home

But when I start the server I get the following error:

18:25:25,229 ERROR [URLDeploymentScanner] Incomplete Deployment listing:
--- Packages waiting for a deployer ---
[EMAIL PROTECTED] { url=file:/C:/jboss/jboss-4.2.1.GA/server/default/deploy/solr.war }
  deployer: null
  status: null
  state: INIT_WAITING_DEPLOYER
  watch: file:/C:/jboss/jboss-4.2.1.GA/server/default/deploy/solr.war
  altDD: null
  lastDeployed: 1224766525227
  lastModified: 1224766525226
  mbeans:
--- Incompletely deployed packages ---
[EMAIL PROTECTED] { url=file:/C:/jboss/jboss-4.2.1.GA/server/default/deploy/solr.war }
  deployer: null
  status: null
  state: INIT_WAITING_DEPLOYER
  watch: file:/C:/jboss/jboss-4.2.1.GA/server/default/deploy/solr.war
  altDD: null
  lastDeployed: 1224766525227
  lastModified: 1224766525226
  mbeans:

Is there anything wrong in the steps I followed, or did I miss some steps? Is there any other good documentation on this subject? Also, where can I specify the index directory? Any suggestion/advice is much appreciated.
Thanks con -- View this message in context: http://www.nabble.com/Error-in-Integrating-JBoss-4.2-and-Solr-1.3.0%3A-tp20202032p20202032.html Sent from the Solr - User mailing list archive at Nabble.com.
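For readers of the archive: the snippet described in step 2a is a solr/home JNDI entry, which usually takes the following shape. This is a sketch based on the standard Solr JNDI setup, not the poster's exact file; only the env-entry-value path is taken from the message.

```xml
<!-- WEB-INF/web.xml: a solr/home JNDI entry of the kind described in
     step 2a, placed just before the closing </web-app> tag. The value
     is the Solr home path mentioned in the message. -->
<env-entry>
  <env-entry-name>solr/home</env-entry-name>
  <env-entry-type>java.lang.String</env-entry-type>
  <env-entry-value>C:/apache-solr-1.3.0/example/solr</env-entry-value>
</env-entry>
```

As for the last question above: the index directory itself is configured per Solr instance in solrconfig.xml via the dataDir element.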
Question about textTight
Hi,

So I've been using the textTight field to hold filenames, and I've run into a weird problem. Basically, people want to search by part of a filename (say, the filename is stm0810m_ws_001ftws and they want to find everything starting with stm0810m_, i.e. stm0810m_*). I'm hoping someone might have done this before (I bet someone has).

Lots of things work: you can search for stm0810m_ws_001ftws and get a result, or (stm 0810 m*), or various other combinations. What does not work is searching for (stm0810m_*) or (stm 0810 m_*) or anything like that, which is a problem, because often they don't want things with ma_ or mx_, but just m_. It's almost like underscores just break everything; escaping them does nothing.

Here's the field definition (it should be what came with my Solr): positionIncrementGap="100" > synonyms="synonyms.txt" ignoreCase="true" expand="false"/> words="stopwords.txt"/> generateWordParts="0" generateNumberParts="0" catenateWords="1" catenateNumbers="1" catenateAll="0"/> protected="protwords.txt"/>

and usage:

Now, I thought textTight would be good because it's the one best suited for SKUs, but I guess I'm wrong. What should I be using for this? Would changing any of these "generateWordParts" or "catenateAll" options help? I can't seem to find any documentation, so I'm really not sure what they would do, but reindexing this whole thing will take quite some time, so I'd rather know what will actually work before I just start changing things.

Thanks so much for any insight!

-- Steve
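For reference, the stock Solr 1.3 "textTight" type that the stripped snippet above appears to come from looks roughly like this (a reconstruction from the example schema, not the poster's exact definition; the WordDelimiterFilterFactory parameters are the ones quoted in the message):

```xml
<!-- Sketch of the stock Solr 1.3 "textTight" analysis chain -->
<fieldType name="textTight" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="false"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt"/>
    <!-- generateWordParts="0": the underscore-separated pieces are NOT
         indexed individually; only the catenated forms are -->
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0"
            generateNumberParts="0" catenateWords="1" catenateNumbers="1"
            catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
```

One plausible reading of the symptom: WordDelimiterFilterFactory strips the underscores at index time, but wildcard queries are not run through the analyzer, so a query containing a literal underscore like stm0810m_* has nothing to match against.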
Re: replication handler - compression
> It is useful only if your bandwidth is very low.
> Otherwise the cost of copying/compressing/decompressing can take up
> more time than we save.

I mean compressing and transferring. If the optimized index itself has a very high compression ratio, then it is worth exploring the option of compressing and transferring. And do not assume that all the files in the index directory are transferred during replication. It only transfers the files which are used by the current commit point and which are absent on the slave.

> On Tue, Oct 28, 2008 at 2:49 AM, Simon Collins
> <[EMAIL PROTECTED]> wrote:
>> Is there an option on the replication handler to compress the files?
>>
>> I'm trying to replicate off site, and seem to have accumulated about
>> 1.4gb. When compressed with winzip of all things I can get this down to
>> about 10% of the size.
>>
>> Is compression in the pipeline / can it be if not!
>>
>> simon
>>
>> This message has been scanned for malware by SurfControl plc.
>> www.surfcontrol.com
>
> --
> --Noble Paul

--
--Noble Paul
Re: timeouts
I may be a bit off the mark, but it seems that DataImportHandler may be able to do this very easily for you.

http://wiki.apache.org/solr/DataImportHandler#jdbcdatasource

On Fri, Oct 24, 2008 at 6:28 PM, Simon Collins <[EMAIL PROTECTED]> wrote:
> Hi
>
> We're running Solr on a Win 2k3 box under Tomcat with about 100,000 records.
> When doing large updates of records via SolrSharp, Solr completely freezes
> and doesn't come back until we restart Tomcat.
>
> This has only started happening since putting MySQL on the same box (as a
> source of the data to update from).
>
> Are there any known issues with running Solr and MySQL on the same box? When
> it's frozen, the CPU usage is around 1-2%, so we're not exactly out of
> resources!
>
> Am I best using something else instead of Tomcat? We're still trialling Solr
> (presently, used for our main site search www.shoe-shop.com and search and
> navigation for our microsites). It's an excellent search product, but I
> don't want to fork out on new hardware for it just yet, until I know more
> about the performance and which environment I'm best to go for (win/linux).
>
> If anyone has any suggestions/needs more info, I'd be extremely grateful.
>
> Thanks
> Simon
>
> Simon Collins
> Systems Analyst
>
> Telephone: 01904 606 867
> Fax Number: 01904 528 791
> shoe-shop.com ltd
> Catherine House
> Northminster Business Park
> Upper Poppleton, YORK
> YO26 6QU
>
> www.shoe-shop.com

--
--Noble Paul
Re: Advice needed on master-slave configuration
This is the JIRA location: https://issues.apache.org/jira/secure/Dashboard.jspa

The trunk has not changed a lot since the 1.3 release. If it works for you, you can just stick to the one you are using till you get a patch.

--Noble

On Mon, Oct 27, 2008 at 9:04 PM, William Pierce <[EMAIL PROTECTED]> wrote:
> Folks:
>
> The replication handler works wonderfully! Thanks all! Now can someone
> point me at a wiki so I can submit a jira issue lobbying for the inclusion
> of this replication functionality in a 1.3 patch?
>
> Thanks,
> - Bill
>
> --
> From: "Noble Paul ??? ??" <[EMAIL PROTECTED]>
> Sent: Thursday, October 23, 2008 10:34 PM
> To:
> Subject: Re: Advice needed on master-slave configuration
>
>> It was committed on 10/21.
>>
>> Take the latest 10/23 build:
>> http://people.apache.org/builds/lucene/solr/nightly/solr-2008-10-23.zip
>>
>> On Fri, Oct 24, 2008 at 2:27 AM, William Pierce <[EMAIL PROTECTED]>
>> wrote:
>>>
>>> I tried the nightly build from 10/18 -- I did the following:
>>>
>>> a) I downloaded the nightly build of 10/18 (the zip file).
>>>
>>> b) I unpacked it and copied the war file to my tomcat lib folder.
>>>
>>> c) I made the relevant changes in the config files per the instructions
>>> shown in the wiki.
>>>
>>> When tomcat starts, I see the error message in the tomcat logs...
>>>
>>> Caused by: java.lang.ClassNotFoundException: solr.ReplicationHandler
>>>   at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1358)
>>>   at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1204)
>>>   at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)
>>>   at java.lang.Class.forName0(Native Method)
>>>   at java.lang.Class.forName(Class.java:247)
>>>   at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:258)
>>>   ... 36 more
>>>
>>> Where do I get the nightly bits that will enable me to try this
>>> replication handler?
>>> >>> Thanks, >>> - Bill >>> >>> -- >>> From: "Noble Paul ??? ??" <[EMAIL PROTECTED]> >>> Sent: Wednesday, October 22, 2008 10:51 PM >>> To: >>> Subject: Re: Advice needed on master-slave configuration >>> If you are using a nightly you can try the new SolrReplication feature http://wiki.apache.org/solr/SolrReplication On Thu, Oct 23, 2008 at 4:32 AM, William Pierce <[EMAIL PROTECTED]> wrote: > > Otis, > > Yes, I had forgotten that Windows will not permit me to overwrite > files > currently in use. So my copy scripts are failing. Windows will not > even > allow a rename of a folder containing a file in use so I am not sure > how > to > do this > > I am going to dig around and see what I can come up with short of > stopping/restarting tomcat... > > Thanks, > - Bill > > > -- > From: "Otis Gospodnetic" <[EMAIL PROTECTED]> > Sent: Wednesday, October 22, 2008 2:30 PM > To: > Subject: Re: Advice needed on master-slave configuration > >> Normally you don't have to start Q, but only "reload" Solr searcher >> when >> the index has been copied. >> However, you are on Windows, and its FS has the tendency not to let >> you >> delete/overwrite files that another app (Solr/java) has opened. Are >> you >> able to copy the index from U to Q? How are you doing it? Are you >> deleting >> index files from the index dir on Q that are no longer in the index >> dir >> on >> U? >> >> >> Otis >> -- >> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch >> >> >> >> - Original Message >>> >>> From: William Pierce <[EMAIL PROTECTED]> >>> To: solr-user@lucene.apache.org >>> Sent: Wednesday, October 22, 2008 5:24:28 PM >>> Subject: Advice needed on master-slave configuration >>> >>> Folks: >>> >>> I have two instances of solr running one on the master (U) and the >>> other >>> on >>> the slave (Q). Q is used for queries only, while U is where >>> updates/deletes >>> are done. I am running on Windows so unfortunately I cannot use the >>> distribution scripts. 
>>> Every N hours when changes are committed and the index on U is updated,
>>> I want to copy the files from the master to the slave. Do I need to halt
>>> the Solr server on Q while the index is being updated? If not, how do I
>>> copy the files into the data folder while the server is running? Any
>>> pointers would be greatly appreciated!
>>>
>>> Thanks!
>>>
>>> - Bill
>>
>> --
>> --Noble Paul
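For anyone following this thread from the archive: the SolrReplication feature referred to above is wired up in solrconfig.xml roughly as follows (a sketch following the SolrReplication wiki page; the master host, port and poll interval are placeholders):

```xml
<!-- On the master (machine U): publish the index after each commit -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
    <str name="confFiles">schema.xml,stopwords.txt</str>
  </lst>
</requestHandler>

<!-- On the slave (machine Q): poll the master over HTTP -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://master-host:8983/solr/replication</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>
```

Because replication runs over HTTP inside Solr, no external copy scripts are needed, which sidesteps the Windows file-locking problem discussed earlier in the thread.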
Re: replication handler - compression
Are you sure you optimized the index? It is useful only if your bandwidth is very low. Otherwise the cost of copying/compressing/decompressing can take up more time than we save.

On Tue, Oct 28, 2008 at 2:49 AM, Simon Collins <[EMAIL PROTECTED]> wrote:
> Is there an option on the replication handler to compress the files?
>
> I'm trying to replicate off site, and seem to have accumulated about
> 1.4gb. When compressed with winzip of all things I can get this down to
> about 10% of the size.
>
> Is compression in the pipeline / can it be if not!
>
> simon

--
--Noble Paul
Re: Entity extraction?
On Oct 27, 2008, at 8:53 PM, Ryan McKinley wrote: On Oct 27, 2008, at 6:10 PM, Grant Ingersoll wrote: Warning: shameless plug: Tom Morton and I have a chapter on NER and OpenNLP (and Solr, for that matter) in our book "Taming Text" (Manning) and the code will be open once we have a place to put it (hopefully soon). In fact, you'll see us doing a lot of this kind of stuff w/ Solr and it should all be coming back to Solr/ Lucene/Mahout at some point (for instance, see https://issues.apache.org/jira/browse/SOLR-769 , as I'm sure FAST told you they can do clustering, too!) --end shameless plug --- thats great! I just got the MEAP copy, it looks really good http://www.manning.com/ingersoll/ Thanks! As for Mahout, NER is a classification problem, and there are some tools in Mahout to do classification, but nothing specifically targeted at NER at the moment. Mahout, like Nutch, also takes advantage of Hadoop for scaling. The combination of Mahout in Solr makes a lot of sense, IMO. Perhaps this is more appropriate to ask on the mahout list, but... when you say "Mahout, like Nutch, also takes advantage of Hadoop for scaling", does that mean that much of Mahout requires hadoop? Is it possible to do smaller scale problems on a simple setup and only invoke hadoop when required? Yes, probably better asked on Mahout, but to answer your question, yes, most of the implementations require Hadoop so far, but it is not a strict requirement. That being said, it is fairly easy to run them on a simple setup (i.e. single node).
Re: Entity extraction?
On Oct 27, 2008, at 6:10 PM, Grant Ingersoll wrote:

Warning: shameless plug: Tom Morton and I have a chapter on NER and OpenNLP (and Solr, for that matter) in our book "Taming Text" (Manning) and the code will be open once we have a place to put it (hopefully soon). In fact, you'll see us doing a lot of this kind of stuff w/ Solr and it should all be coming back to Solr/Lucene/Mahout at some point (for instance, see https://issues.apache.org/jira/browse/SOLR-769 , as I'm sure FAST told you they can do clustering, too!) --end shameless plug ---

That's great! I just got the MEAP copy, it looks really good: http://www.manning.com/ingersoll/

As for Mahout, NER is a classification problem, and there are some tools in Mahout to do classification, but nothing specifically targeted at NER at the moment. Mahout, like Nutch, also takes advantage of Hadoop for scaling. The combination of Mahout in Solr makes a lot of sense, IMO.

Perhaps this is more appropriate to ask on the mahout list, but... when you say "Mahout, like Nutch, also takes advantage of Hadoop for scaling", does that mean that much of Mahout requires hadoop? Is it possible to do smaller scale problems on a simple setup and only invoke hadoop when required?

ryan
Re: Entity extraction?
Warning: shameless plug: Tom Morton and I have a chapter on NER and OpenNLP (and Solr, for that matter) in our book "Taming Text" (Manning) and the code will be open once we have a place to put it (hopefully soon). In fact, you'll see us doing a lot of this kind of stuff w/ Solr and it should all be coming back to Solr/Lucene/ Mahout at some point (for instance, see https://issues.apache.org/jira/browse/SOLR-769 , as I'm sure FAST told you they can do clustering, too!) --end shameless plug --- As for Mahout, NER is a classification problem, and there are some tools in Mahout to do classification, but nothing specifically targeted at NER at the moment. Mahout, like Nutch, also takes advantage of Hadoop for scaling. The combination of Mahout in Solr makes a lot of sense, IMO. On Oct 25, 2008, at 11:25 PM, Vaijanath N. Rao wrote: Hi, One can use the OpenNLP Max entropy library and create there own named-entity extraction. I had used it in one of the projects which I did with Solr. It is easy to integrate most of the NLP libraries with Solr. Though we had named-entity extraction embedded in our crawler which would populate a field called entities in the database, which we would ingest in Solr as yet another field. --Thanks and Regards Vaijanath N. Rao Julien Nioche wrote: Hi, Open Source NLP platforms like GATE (http://gate.ac.uk) or Apache UIMA are typically used for these types of tasks. GATE in particular comes with an application called ANNIE which does Named Entity Recognition. OpenCalais does that as well and should be easy to embed, but it can't be tuned to do more specific things unlike UIMA or GATE based applications. Depending on the architecture you have in mind it could be worth investigating Nutch and add the NER as a custom plugin; NLP being often a CPU intensive task you could leverage the scalability of Hadoop in Nutch. There is a patch which allows to delegate the indexing to SOLR. 
As someone else already said these named entities could then be used as facets. HTH Julien -- Grant Ingersoll Lucene Boot Camp Training Nov. 3-4, 2008, ApacheCon US New Orleans. http://www.lucenebootcamp.com Lucene Helpful Hints: http://wiki.apache.org/lucene-java/BasicsOfPerformance http://wiki.apache.org/lucene-java/LuceneFAQ
replication handler - compression
Is there an option on the replication handler to compress the files?

I'm trying to replicate off site, and seem to have accumulated about 1.4gb. When compressed with winzip of all things I can get this down to about 10% of the size.

Is compression in the pipeline / can it be if not!

simon
Re: solr 1.3 language managing boosting
Boost at index time or at query time? For index time, you would add the boost on the field/document. At query time, you can add boosts to each term that belongs to a specific field. On Oct 27, 2008, at 2:10 PM, sunnyfr wrote: Hi, I've my field in the schema which are text_es, text_fr, text_ln And I would like to boost them according the field language, How could I do that, According to the fact that I've stored all this field ??? Thanks a lot for your help, Sunny -- View this message in context: http://www.nabble.com/solr-1.3-language-managing-boosting-tp20193102p20193102.html Sent from the Solr - User mailing list archive at Nabble.com. -- Grant Ingersoll Lucene Boot Camp Training Nov. 3-4, 2008, ApacheCon US New Orleans. http://www.lucenebootcamp.com Lucene Helpful Hints: http://wiki.apache.org/lucene-java/BasicsOfPerformance http://wiki.apache.org/lucene-java/LuceneFAQ
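A concrete sketch of the query-time option, using the dismax handler and the field names from the question (the boost values are arbitrary examples, not a recommendation):

```xml
<!-- solrconfig.xml: a dismax handler whose qf parameter weights the
     per-language text fields differently (example boosts only) -->
<requestHandler name="dismax" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="qf">text_fr^2.0 text_es^1.0 text_ln^0.5</str>
  </lst>
</requestHandler>
```

With the standard request handler, the same effect can be written inline per query term, e.g. q=text_fr:(chanson)^2 OR text_es:(chanson). Index-time boosts, by contrast, go on the field or document in the update message and require reindexing to change.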
Re: Delete by query isn't working for a specific query
Thank you both for your nice answers. I will try it out.

2008/10/27 Erik Hatcher <[EMAIL PROTECTED]>
> I don't think delete-by-query supports purely negative queries, even though
> they are supported for q and fq parameters for searches.
>
> Try using:
>
> *:* AND -deptId:[1 TO *]
>
>    Erik
>
> On Oct 27, 2008, at 9:21 AM, Alexander Ramos Jardim wrote:
>
>> Hey pals,
>>
>> I am trying to delete a couple of documents that don't have any value in a
>> given integer field. This is the command I am executing:
>>
>> $ curl http://:/solr/update -H 'Content-Type:text/xml' -d
>> '-(deptId:[1 TO *])'
>> $ curl http://:/solr/update -H 'Content-Type:text/xml' -d
>> ''
>>
>> But the documents don't get deleted.
>>
>> Solr doesn't return any error message, and its log seems OK. Any idea?
>>
>> --
>> Alexander Ramos Jardim

--
Alexander Ramos Jardim
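Spelled out, the fix Erik suggests amounts to posting these two bodies to /solr/update with Content-Type text/xml. The element names are the standard update-message ones; they were presumably the tags stripped from the quoted curl commands.

```xml
<!-- body of the first request: delete documents with no deptId value.
     The purely negative clause is anchored to *:* so the query is
     no longer purely negative. -->
<delete><query>*:* AND -deptId:[1 TO *]</query></delete>

<!-- body of the second request: commit so the deletes become visible -->
<commit/>
```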
solr 1.3 language managing boosting
Hi,

I have fields in my schema named text_es, text_fr and text_ln, and I would like to boost them according to the field language. How can I do that, given that I have stored all these fields?

Thanks a lot for your help,
Sunny

--
View this message in context: http://www.nabble.com/solr-1.3-language-managing-boosting-tp20193102p20193102.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Entity extraction?
Extractors are exactly as good as the data you have to train or configure them with. An open source extractor platform may still require you to come up with a rather large heap of data from somewhere. Not all the vendors of extractors lose money. How useful NEE is for search is an ongoing question that depends on what sort of data you are working with and what sort of precision challenges most concern you. On Mon, Oct 27, 2008 at 12:34 PM, Walter Underwood <[EMAIL PROTECTED]> wrote: > Verity sold a lot of features based on "we might need it at some point." > Very few people deployed the advanced features. They just didn't need them. > > wunder > > On 10/27/08 9:27 AM, "Charlie Jackson" <[EMAIL PROTECTED]> wrote: > >> Yeah, when they first mentioned it, my initial thought was "cool, but we >> don't >> need it." However, some of the higher ups in the company are saying we might >> want it at some point, so I've been asked to look into it. I'll be sure to >> let >> them know about the flaws in the concept, thanks for that info. >> >> >> Charlie Jackson >> [EMAIL PROTECTED] >> >> >> -Original Message- >> From: Walter Underwood [mailto:[EMAIL PROTECTED] >> Sent: Monday, October 27, 2008 11:17 AM >> To: solr-user@lucene.apache.org >> Subject: Re: Entity extraction? >> >> The vendor mentioned entity extraction, but that doesn't mean you need it. >> Entity extraction is a pretty specific technology, and it has been a >> money-losing product at many companies for many years, going back to >> Xerox ThingFinder well over ten years ago. >> >> My guess is that very few people really need entity extraction. >> >> Using EE for automatic taxonomy generation is even harder to get right. >> At best, that is a way to get a starter set of categories that you can >> edit. You will not get a production quality taxonomy automatically. 
>> >> wunder >> >> On 10/27/08 8:31 AM, "Charlie Jackson" <[EMAIL PROTECTED]> wrote: >> >>> True, though I may be able to convince the powers that be that it's worth >>> the >>> investment. >>> >>> There are a number of open source or free tools listed on the Wikipedia >>> entry >>> for entity extraction >>> (http://en.wikipedia.org/wiki/Named_entity_recognition#Open_source_or_free) >>> -- >>> does anyone have any experience with any of these? >>> >>> >>> Charlie Jackson >>> 312-873-6537 >>> [EMAIL PROTECTED] >>> >>> -Original Message- >>> From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] >>> Sent: Monday, October 27, 2008 10:23 AM >>> To: solr-user@lucene.apache.org >>> Subject: Re: Entity extraction? >>> >>> For the record, LingPipe is not free. It's good, but it's not free. >>> >>> >>> Otis >>> -- >>> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch >>> >>> >>> >>> - Original Message From: Rafael Rossini <[EMAIL PROTECTED]> To: solr-user@lucene.apache.org Sent: Friday, October 24, 2008 6:08:14 PM Subject: Re: Entity extraction? Solr can do a simple facet seach like FAST, but the entity extraction demands other tecnologies. I do not know how FAST does it but at the company I´m working on (www.cortex-intelligence.com), we use a mix of statistical and language-specific tasks to recognize and categorize entities in the text. Ling Pipe is another tool (free) that does that too. In case you would like to see a simple demo: http://www.cortex-intelligence.com/tech/ Rossini On Fri, Oct 24, 2008 at 6:18 PM, Charlie Jackson > wrote: > During a recent sales pitch to my company by FAST, they mentioned entity > extraction. I'd never heard of it before, but they described it as > basically recognizing people/places/things in documents being indexed > and then being able to do faceting on this data at query time. Does > anything like this already exist in SOLR? If not, I'm not opposed to > developing it myself, but I could use some pointers on where to start. 
> > > > Thanks, > > - Charlie > > >>> >>> >>> >> >> >> > >
Re: Entity extraction?
Verity sold a lot of features based on "we might need it at some point." Very few people deployed the advanced features. They just didn't need them. wunder On 10/27/08 9:27 AM, "Charlie Jackson" <[EMAIL PROTECTED]> wrote: > Yeah, when they first mentioned it, my initial thought was "cool, but we don't > need it." However, some of the higher ups in the company are saying we might > want it at some point, so I've been asked to look into it. I'll be sure to let > them know about the flaws in the concept, thanks for that info. > > > Charlie Jackson > [EMAIL PROTECTED] > > > -Original Message- > From: Walter Underwood [mailto:[EMAIL PROTECTED] > Sent: Monday, October 27, 2008 11:17 AM > To: solr-user@lucene.apache.org > Subject: Re: Entity extraction? > > The vendor mentioned entity extraction, but that doesn't mean you need it. > Entity extraction is a pretty specific technology, and it has been a > money-losing product at many companies for many years, going back to > Xerox ThingFinder well over ten years ago. > > My guess is that very few people really need entity extraction. > > Using EE for automatic taxonomy generation is even harder to get right. > At best, that is a way to get a starter set of categories that you can > edit. You will not get a production quality taxonomy automatically. > > wunder > > On 10/27/08 8:31 AM, "Charlie Jackson" <[EMAIL PROTECTED]> wrote: > >> True, though I may be able to convince the powers that be that it's worth the >> investment. >> >> There are a number of open source or free tools listed on the Wikipedia entry >> for entity extraction >> (http://en.wikipedia.org/wiki/Named_entity_recognition#Open_source_or_free) >> -- >> does anyone have any experience with any of these? >> >> >> Charlie Jackson >> 312-873-6537 >> [EMAIL PROTECTED] >> >> -Original Message- >> From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] >> Sent: Monday, October 27, 2008 10:23 AM >> To: solr-user@lucene.apache.org >> Subject: Re: Entity extraction? 
>> >> For the record, LingPipe is not free. It's good, but it's not free. >> >> >> Otis >> -- >> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch >> >> >> >> - Original Message >>> From: Rafael Rossini <[EMAIL PROTECTED]> >>> To: solr-user@lucene.apache.org >>> Sent: Friday, October 24, 2008 6:08:14 PM >>> Subject: Re: Entity extraction? >>> >>> Solr can do a simple facet seach like FAST, but the entity extraction >>> demands other tecnologies. I do not know how FAST does it but at the company >>> I´m working on (www.cortex-intelligence.com), we use a mix of statistical >>> and language-specific tasks to recognize and categorize entities in the >>> text. Ling Pipe is another tool (free) that does that too. In case you would >>> like to see a simple demo: http://www.cortex-intelligence.com/tech/ >>> >>> Rossini >>> >>> >>> On Fri, Oct 24, 2008 at 6:18 PM, Charlie Jackson wrote: >>> During a recent sales pitch to my company by FAST, they mentioned entity extraction. I'd never heard of it before, but they described it as basically recognizing people/places/things in documents being indexed and then being able to do faceting on this data at query time. Does anything like this already exist in SOLR? If not, I'm not opposed to developing it myself, but I could use some pointers on where to start. Thanks, - Charlie >> >> >> > > >
Re: Entity extraction?
Well... IMHO that depends. One of the services we provide is a "automatic clipping" in which our client chooses 20~30 texts from the media he woud like to be aware. With classification algorithms we then keep him aware of every new text of his interest. We gained about 10% of precision just by adding EE information to the algorithm. Rossini On Mon, Oct 27, 2008 at 2:17 PM, Walter Underwood <[EMAIL PROTECTED]>wrote: > The vendor mentioned entity extraction, but that doesn't mean you need it. > Entity extraction is a pretty specific technology, and it has been a > money-losing product at many companies for many years, going back to > Xerox ThingFinder well over ten years ago. > > My guess is that very few people really need entity extraction. > > Using EE for automatic taxonomy generation is even harder to get right. > At best, that is a way to get a starter set of categories that you can > edit. You will not get a production quality taxonomy automatically. > > wunder > > On 10/27/08 8:31 AM, "Charlie Jackson" <[EMAIL PROTECTED]> wrote: > > > True, though I may be able to convince the powers that be that it's worth > the > > investment. > > > > There are a number of open source or free tools listed on the Wikipedia > entry > > for entity extraction > > ( > http://en.wikipedia.org/wiki/Named_entity_recognition#Open_source_or_free) > -- > > does anyone have any experience with any of these? > > > > > > Charlie Jackson > > 312-873-6537 > > [EMAIL PROTECTED] > > > > -Original Message- > > From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] > > Sent: Monday, October 27, 2008 10:23 AM > > To: solr-user@lucene.apache.org > > Subject: Re: Entity extraction? > > > > For the record, LingPipe is not free. It's good, but it's not free. 
> > > > > > Otis > > -- > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch > > > > > > > > - Original Message > >> From: Rafael Rossini <[EMAIL PROTECTED]> > >> To: solr-user@lucene.apache.org > >> Sent: Friday, October 24, 2008 6:08:14 PM > >> Subject: Re: Entity extraction? > >> > >> Solr can do a simple facet seach like FAST, but the entity extraction > >> demands other tecnologies. I do not know how FAST does it but at the > company > >> I´m working on (www.cortex-intelligence.com), we use a mix of > statistical > >> and language-specific tasks to recognize and categorize entities in the > >> text. Ling Pipe is another tool (free) that does that too. In case you > would > >> like to see a simple demo: http://www.cortex-intelligence.com/tech/ > >> > >> Rossini > >> > >> > >> On Fri, Oct 24, 2008 at 6:18 PM, Charlie Jackson > >>> wrote: > >> > >>> During a recent sales pitch to my company by FAST, they mentioned > entity > >>> extraction. I'd never heard of it before, but they described it as > >>> basically recognizing people/places/things in documents being indexed > >>> and then being able to do faceting on this data at query time. Does > >>> anything like this already exist in SOLR? If not, I'm not opposed to > >>> developing it myself, but I could use some pointers on where to start. > >>> > >>> > >>> > >>> Thanks, > >>> > >>> - Charlie > >>> > >>> > > > > > > > >
RE: Entity extraction?
Yeah, when they first mentioned it, my initial thought was "cool, but we don't need it." However, some of the higher ups in the company are saying we might want it at some point, so I've been asked to look into it. I'll be sure to let them know about the flaws in the concept, thanks for that info. Charlie Jackson [EMAIL PROTECTED] -Original Message- From: Walter Underwood [mailto:[EMAIL PROTECTED] Sent: Monday, October 27, 2008 11:17 AM To: solr-user@lucene.apache.org Subject: Re: Entity extraction? The vendor mentioned entity extraction, but that doesn't mean you need it. Entity extraction is a pretty specific technology, and it has been a money-losing product at many companies for many years, going back to Xerox ThingFinder well over ten years ago. My guess is that very few people really need entity extraction. Using EE for automatic taxonomy generation is even harder to get right. At best, that is a way to get a starter set of categories that you can edit. You will not get a production quality taxonomy automatically. wunder On 10/27/08 8:31 AM, "Charlie Jackson" <[EMAIL PROTECTED]> wrote: > True, though I may be able to convince the powers that be that it's worth the > investment. > > There are a number of open source or free tools listed on the Wikipedia entry > for entity extraction > (http://en.wikipedia.org/wiki/Named_entity_recognition#Open_source_or_free) -- > does anyone have any experience with any of these? > > > Charlie Jackson > 312-873-6537 > [EMAIL PROTECTED] > > -Original Message- > From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] > Sent: Monday, October 27, 2008 10:23 AM > To: solr-user@lucene.apache.org > Subject: Re: Entity extraction? > > For the record, LingPipe is not free. It's good, but it's not free. 
> > > Otis > -- > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch > > > > - Original Message >> From: Rafael Rossini <[EMAIL PROTECTED]> >> To: solr-user@lucene.apache.org >> Sent: Friday, October 24, 2008 6:08:14 PM >> Subject: Re: Entity extraction? >> >> Solr can do a simple facet search like FAST, but the entity extraction >> demands other technologies. I do not know how FAST does it but at the company >> I'm working on (www.cortex-intelligence.com), we use a mix of statistical >> and language-specific tasks to recognize and categorize entities in the >> text. Ling Pipe is another tool (free) that does that too. In case you would >> like to see a simple demo: http://www.cortex-intelligence.com/tech/ >> >> Rossini >> >> >> On Fri, Oct 24, 2008 at 6:18 PM, Charlie Jackson >>> wrote: >> >>> During a recent sales pitch to my company by FAST, they mentioned entity >>> extraction. I'd never heard of it before, but they described it as >>> basically recognizing people/places/things in documents being indexed >>> and then being able to do faceting on this data at query time. Does >>> anything like this already exist in SOLR? If not, I'm not opposed to >>> developing it myself, but I could use some pointers on where to start. >>> >>> >>> >>> Thanks, >>> >>> - Charlie >>> >>> > > >
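As a concrete (and deliberately naive) illustration of the idea Charlie describes — pull entities out of each document at index time, then facet on them at query time — here is a toy sketch. The regex "extractor" and the sample documents are invented for illustration; real NER (LingPipe, etc.) uses statistical models, not a regex:

```python
import re
from collections import Counter

def naive_entities(text):
    # Toy "extractor": runs of two or more capitalized words.
    # Nothing like production entity extraction.
    return re.findall(r"[A-Z][a-z]+(?:\s[A-Z][a-z]+)+", text)

docs = [
    "Charlie Jackson met Walter Underwood in San Francisco.",
    "Walter Underwood gave a talk in San Francisco.",
]
# Index time: store the extracted entities in a multivalued field;
# query time: facet on that field. Here Counter stands in for the facet counts.
facets = Counter(e for d in docs for e in naive_entities(d))
```

The facet counts you would get back ("Walter Underwood": 2, "San Francisco": 2, "Charlie Jackson": 1) are exactly what a facet.field on the entity field would produce.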
Greek - solr 1.3
Hi, I would like to know if I have to do something special for Greek characters? My schema is configured like that. It just stores documents which don't have Greek characters; every other language is working fine. Any idea? Thanks a lot, -- View this message in context: http://www.nabble.com/Greek---solr-1.3-tp20191072p20191072.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Entity extraction?
The vendor mentioned entity extraction, but that doesn't mean you need it. Entity extraction is a pretty specific technology, and it has been a money-losing product at many companies for many years, going back to Xerox ThingFinder well over ten years ago. My guess is that very few people really need entity extraction. Using EE for automatic taxonomy generation is even harder to get right. At best, that is a way to get a starter set of categories that you can edit. You will not get a production-quality taxonomy automatically. wunder On 10/27/08 8:31 AM, "Charlie Jackson" <[EMAIL PROTECTED]> wrote: > True, though I may be able to convince the powers that be that it's worth the > investment. > > There are a number of open source or free tools listed on the Wikipedia entry > for entity extraction > (http://en.wikipedia.org/wiki/Named_entity_recognition#Open_source_or_free) -- > does anyone have any experience with any of these? > > > Charlie Jackson > 312-873-6537 > [EMAIL PROTECTED] > > -Original Message- > From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] > Sent: Monday, October 27, 2008 10:23 AM > To: solr-user@lucene.apache.org > Subject: Re: Entity extraction? > > For the record, LingPipe is not free. It's good, but it's not free. > > > Otis > -- > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch > > > > - Original Message >> From: Rafael Rossini <[EMAIL PROTECTED]> >> To: solr-user@lucene.apache.org >> Sent: Friday, October 24, 2008 6:08:14 PM >> Subject: Re: Entity extraction? >> >> Solr can do a simple facet search like FAST, but the entity extraction >> demands other technologies. I do not know how FAST does it but at the company >> I'm working on (www.cortex-intelligence.com), we use a mix of statistical >> and language-specific tasks to recognize and categorize entities in the >> text. Ling Pipe is another tool (free) that does that too.
In case you would >> like to see a simple demo: http://www.cortex-intelligence.com/tech/ >> >> Rossini >> >> >> On Fri, Oct 24, 2008 at 6:18 PM, Charlie Jackson >>> wrote: >> >>> During a recent sales pitch to my company by FAST, they mentioned entity >>> extraction. I'd never heard of it before, but they described it as >>> basically recognizing people/places/things in documents being indexed >>> and then being able to do faceting on this data at query time. Does >>> anything like this already exist in SOLR? If not, I'm not opposed to >>> developing it myself, but I could use some pointers on where to start. >>> >>> >>> >>> Thanks, >>> >>> - Charlie >>> >>> > > >
Re: Question about facet.prefix usage
Hi Simon, I came across your post to the solr users list about using facet prefixes, shown below. I was wondering if you were still using your modified version of SimpleFacets.java, and if so -- if you could send me a copy. I'll need to implement something similar, and it never hurts to start from existing material. Thanks, Peter Simon Hu wrote: I also need the exact same feature. I was not able to find an easy solution and ended up modifying class SimpleFacets to make it accept an array of facet prefixes per field. If you are interested, I can email you the modified SimpleFacets.java. -Simon steve berry-2 wrote: Question: Is it possible to pass complex queries to facet.prefix? Example: instead of facet.prefix:foo I want facet.prefix:foo OR facet.prefix:bar My application is for browsing business records that fall into categories. The user is only allowed to see businesses falling into categories which they have access to. I have a series of documents dumped into the following basic structure which I was hoping would help me deal with this: 123 Business Corp. 28255-0001 . charlotte_2006 Banks charlotte_2007 Banks sanfrancisco_2006 Banks sanfrancisco_2007 Banks ... (lots more market_category entries) ... 124 Factory Corp. 28205-0001 . charlotte_2006 Banks charlotte_2007 Banks austin_2006 Banks austin_2007 Banks ... (lots more market_category entries) ... . The multivalued market_category fields are flattened relational data attributed to that business and I want to use those values for faceted navigation /but/ I want the facets to be restricted depending on what products the user has access to. For example a user may have access to sanfrancisco_2007 and sanfrancisco_2006 data but nothing else. So I've created a request using facet.prefix that looks something like this: http://SOLRSERVER:8080/solr/select?q.op=AND&q=docType:gen&facet.field=market_category&facet.prefix=charlotte_2007 This ends up producing perfectly suitable facet results that look like this: ..
1 1 1 1 1 1 0 . Bingo! facet.prefix does exactly what I want it to. Now I want to go a step further and pass a compound statement to the facet.prefix along the lines of "facet.prefix:charlotte_2007 OR sanfrancisco_2007" or "facet.prefix:charlotte_2007 OR charlotte_2006" to return more complex facet sets. As far as I can tell looking at the docs this won't work. Is this possible using the existing facet.prefix functionality? Anyone have a better idea of how I should accomplish this? Thanks, steve berry American City Business Journals
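Since facet.prefix takes only a single prefix per field, one workaround short of Simon's SimpleFacets patch is to facet without a prefix and keep only the values matching any of the user's allowed prefixes on the client side. A minimal sketch — the facet counts here are invented for illustration:

```python
def filter_facets(facet_counts, allowed_prefixes):
    # Keep only the facet values the user is entitled to see.
    return {value: count for value, count in facet_counts.items()
            if any(value.startswith(p) for p in allowed_prefixes)}

# Invented counts, standing in for a full facet.field=market_category response.
counts = {"charlotte_2006": 3, "charlotte_2007": 5,
          "sanfrancisco_2007": 2, "austin_2006": 1}
visible = filter_facets(counts, ["charlotte_2007", "sanfrancisco_2007"])
```

Note this pulls the whole facet list over the wire before filtering, which may matter if the field has many distinct values; issuing one request per allowed prefix and merging the results is the other obvious workaround.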
RE: Entity extraction?
True, though I may be able to convince the powers that be that it's worth the investment. There are a number of open source or free tools listed on the Wikipedia entry for entity extraction (http://en.wikipedia.org/wiki/Named_entity_recognition#Open_source_or_free) -- does anyone have any experience with any of these? Charlie Jackson 312-873-6537 [EMAIL PROTECTED] -Original Message- From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] Sent: Monday, October 27, 2008 10:23 AM To: solr-user@lucene.apache.org Subject: Re: Entity extraction? For the record, LingPipe is not free. It's good, but it's not free. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Rafael Rossini <[EMAIL PROTECTED]> > To: solr-user@lucene.apache.org > Sent: Friday, October 24, 2008 6:08:14 PM > Subject: Re: Entity extraction? > > Solr can do a simple facet search like FAST, but the entity extraction > demands other technologies. I do not know how FAST does it but at the company > I'm working on (www.cortex-intelligence.com), we use a mix of statistical > and language-specific tasks to recognize and categorize entities in the > text. Ling Pipe is another tool (free) that does that too. In case you would > like to see a simple demo: http://www.cortex-intelligence.com/tech/ > > Rossini > > > On Fri, Oct 24, 2008 at 6:18 PM, Charlie Jackson > > wrote: > > > During a recent sales pitch to my company by FAST, they mentioned entity > > extraction. I'd never heard of it before, but they described it as > > basically recognizing people/places/things in documents being indexed > > and then being able to do faceting on this data at query time. Does > > anything like this already exist in SOLR? If not, I'm not opposed to > > developing it myself, but I could use some pointers on where to start. > > > > > > > > Thanks, > > > > - Charlie > > > >
Re: Advice needed on master-slave configuration
Folks: The replication handler works wonderfully! Thanks all! Now can someone point me at a wiki so I can submit a jira issue lobbying for the inclusion of this replication functionality in a 1.3 patch? Thanks, - Bill -- From: "Noble Paul ??? ??" <[EMAIL PROTECTED]> Sent: Thursday, October 23, 2008 10:34 PM To: Subject: Re: Advice needed on master-slave configuration It was committed on 10/21 take the latest 10/23 build http://people.apache.org/builds/lucene/solr/nightly/solr-2008-10-23.zip On Fri, Oct 24, 2008 at 2:27 AM, William Pierce <[EMAIL PROTECTED]> wrote: I tried the nightly build from 10/18 -- I did the following: a) I downloaded the nightly build of 10/18 (the zip file). b) I unpacked it and copied the war file to my tomcat lib folder. c) I made the relevant changes in the config files per the instructions shown in the wiki. When tomcat starts, I see the error message in tomcat logs... Caused by: java.lang.ClassNotFoundException: solr.ReplicationHandler at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1358) at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1204) at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:247) at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:258) ... 36 more Where do I get the nightly bits that will enable me to try this replication handler? Thanks, - Bill -- From: "Noble Paul ??? ??" <[EMAIL PROTECTED]> Sent: Wednesday, October 22, 2008 10:51 PM To: Subject: Re: Advice needed on master-slave configuration If you are using a nightly you can try the new SolrReplication feature http://wiki.apache.org/solr/SolrReplication On Thu, Oct 23, 2008 at 4:32 AM, William Pierce <[EMAIL PROTECTED]> wrote: Otis, Yes, I had forgotten that Windows will not permit me to overwrite files currently in use. So my copy scripts are failing. 
Windows will not even allow a rename of a folder containing a file in use, so I am not sure how to do this. I am going to dig around and see what I can come up with short of stopping/restarting tomcat... Thanks, - Bill -- From: "Otis Gospodnetic" <[EMAIL PROTECTED]> Sent: Wednesday, October 22, 2008 2:30 PM To: Subject: Re: Advice needed on master-slave configuration Normally you don't have to stop Q, but only "reload" the Solr searcher when the index has been copied. However, you are on Windows, and its FS has the tendency not to let you delete/overwrite files that another app (Solr/java) has opened. Are you able to copy the index from U to Q? How are you doing it? Are you deleting index files from the index dir on Q that are no longer in the index dir on U? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: William Pierce <[EMAIL PROTECTED]> To: solr-user@lucene.apache.org Sent: Wednesday, October 22, 2008 5:24:28 PM Subject: Advice needed on master-slave configuration Folks: I have two instances of solr running, one on the master (U) and the other on the slave (Q). Q is used for queries only, while U is where updates/deletes are done. I am running on Windows so unfortunately I cannot use the distribution scripts. Every N hours, when changes are committed and the index on U is updated, I want to copy the files from the master to the slave. Do I need to halt the solr server on Q while the index is being updated? If not, how do I copy the files into the data folder while the server is running? Any pointers would be greatly appreciated! Thanks! - Bill -- --Noble Paul -- --Noble Paul
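For anyone stuck on the Windows copy itself: the Unix snappuller essentially mirrors the master's index directory into the slave's. That mirroring can be sketched as below. This is illustrative only — on Windows the delete step can still fail while Solr holds old segment files open, and the slave's searcher must be reloaded after the copy:

```python
import os
import shutil

def sync_index(master_dir, slave_dir):
    """Crude one-way mirror of an index directory, roughly what
    snappuller/rsync do on Unix: copy new or changed files, then
    remove files that are no longer present on the master."""
    os.makedirs(slave_dir, exist_ok=True)
    master = set(os.listdir(master_dir))
    slave = set(os.listdir(slave_dir))
    for name in master:
        src = os.path.join(master_dir, name)
        dst = os.path.join(slave_dir, name)
        # Copy anything missing or newer on the master.
        if name not in slave or os.path.getmtime(src) > os.path.getmtime(dst):
            shutil.copy2(src, dst)
    for name in slave - master:
        # May raise on Windows if Solr still has the file open.
        os.remove(os.path.join(slave_dir, name))
```

The directory and file names are of course placeholders; point it at the master's and slave's `data/index` directories, then reload the searcher on Q.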
Re: Entity extraction?
For the record, LingPipe is not free. It's good, but it's not free. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Rafael Rossini <[EMAIL PROTECTED]> > To: solr-user@lucene.apache.org > Sent: Friday, October 24, 2008 6:08:14 PM > Subject: Re: Entity extraction? > > Solr can do a simple facet search like FAST, but the entity extraction > demands other technologies. I do not know how FAST does it but at the company > I'm working on (www.cortex-intelligence.com), we use a mix of statistical > and language-specific tasks to recognize and categorize entities in the > text. Ling Pipe is another tool (free) that does that too. In case you would > like to see a simple demo: http://www.cortex-intelligence.com/tech/ > > Rossini > > > On Fri, Oct 24, 2008 at 6:18 PM, Charlie Jackson > > wrote: > > > During a recent sales pitch to my company by FAST, they mentioned entity > > extraction. I'd never heard of it before, but they described it as > > basically recognizing people/places/things in documents being indexed > > and then being able to do faceting on this data at query time. Does > > anything like this already exist in SOLR? If not, I'm not opposed to > > developing it myself, but I could use some pointers on where to start. > > > > > > > > Thanks, > > > > - Charlie > > > >
solr1.3 / tomcat 55 échelle
Hi, I'm using solr1.3 with tomcat55 and I've got this error when I fire: ...8180/solr/video/select/?q=échelle The response contains: 0 150 échelle 2007-10-31T10:48:34Z 5625531 FR 10 Régis pompier -- View this message in context: http://www.nabble.com/solr1.3---tomcat-55-%3Cb%3E%3Cstr-name%3D%22q%22%3E%C3%83%C2%A9chelle%3C-str%3E%3C-b%3E-tp20189184p20189184.html Sent from the Solr - User mailing list archive at Nabble.com.
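This looks like the classic Tomcat query-string encoding problem: by default Tomcat decodes GET parameters as ISO-8859-1, so a UTF-8 "é" (bytes C3 A9) is read back as "Ã©" — which matches the garbling visible in this thread's subject line. The usual fix is to set URIEncoding="UTF-8" on the HTTP <Connector> in Tomcat's server.xml. A two-line reproduction of the mojibake:

```python
# Reading the browser's UTF-8 bytes as Latin-1 (Tomcat's default for
# GET parameters) reproduces the corruption seen in the subject line.
mangled = "échelle".encode("utf-8").decode("latin-1")
print(mangled)  # Ã©chelle
```

If the round-trip above matches what you see in the response, the index data is fine and only the request decoding needs fixing.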
Re: Sorting performance + replication of index between cores
Hi, After fully reloading my index, using another field than a Date does not help that much. Using a warmup query avoids having the first request slow, but: - Frequent commits mean that the Searcher is reloaded frequently and, as the warmup takes time, the clients must wait. - Having warmup slows down the index process (I guess this is because after a commit, the Searchers are recreated) So I'm considering, as suggested, to have two instances: one for indexing and one for searching. I was wondering if there are simple ways to replicate the index in a single Solr server running two cores? Any such config already tested? I guess that the standard replication based on rsync can be simplified a lot in this case as the two indexes are on the same server. Thanks Christophe Beniamin Janicki wrote: :so you can send your updates anytime you want, and as long as you only :commit every 5 minutes (or commit on a master as often as you want, but :only run snappuller/snapinstaller on your slaves every 5 minutes) your :results will be at most 5minutes + warming time stale. This is what I do as well (commits are done once per 5 minutes). I've got a master - slave configuration. Master has turned off all caches (commented out in solrconfig.xml) and has only 2 maxWarmingSearchers. Index size is 5GB, Xmx=1GB and committing takes around 10 secs (on the default configuration with warming it took from 30 mins up to 2 hours). Slave caches are configured to have autowarmCount="0" and maxWarmingSearchers=1, and I have new data 1 second after the snapshot is done. I haven't noticed any huge delays while serving search requests. Try to use those values - maybe they'll help in your case too. Ben Janicki -Original Message- From: Chris Hostetter [mailto:[EMAIL PROTECTED] Sent: 22 October 2008 04:56 To: solr-user@lucene.apache.org Subject: Re: Sorting performance : The problem is that I will have hundreds of users doing queries, and a : continuous flow of documents coming in.
: So a delay in warming up a cache "could" be acceptable if I do it a few times : per day. But not on a too regular basis (right now, the first query that loads : the cache takes 150s). : : However: I'm not sure why it looks not to be a good idea to update the caches you can refresh the caches automatically after updating, the "newSearcher" event is fired whenever a searcher is opened (but before it's used by clients) so you can configure warming queries for it -- it doesn't have to be done manually (or by the first user to use that reader) so you can send your updates anytime you want, and as long as you only commit every 5 minutes (or commit on a master as often as you want, but only run snappuller/snapinstaller on your slaves every 5 minutes) your results will be at most 5 minutes + warming time stale. -Hoss
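For reference, the "newSearcher" warming Hoss describes is configured in solrconfig.xml with a QuerySenderListener. A minimal sketch — the query and sort values here are placeholders, not recommendations:

```xml
<!-- solrconfig.xml: run these queries against every new searcher
     before it starts serving clients -->
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <str name="q">some popular query</str>
      <str name="sort">someDateField desc</str>
    </lst>
  </arr>
</listener>
```

Sorting on a field is what populates the field cache, so the warming query should sort on the same field the slow production queries do.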
solr 1.3 multi language ?
Hi, I'm trying to boost some languages, and I would like to know whether the fields need to be stored to be able to boost them using dismax? Thanks a lot, Sunny -- View this message in context: http://www.nabble.com/solr-1.3-multi-language---tp20188549p20188549.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Delete by query isn't working for a specific query
I don't think delete-by-query supports purely negative queries, even though they are supported for q and fq parameters for searches. Try using: *:* AND -deptId:[1 TO *] Erik On Oct 27, 2008, at 9:21 AM, Alexander Ramos Jardim wrote: Hey pals, I am trying to delete a couple of documents that don't have any value in a given integer field. This is the command I am executing: $ curl http://<host>:<port>/solr/update -H 'Content-Type:text/xml' -d '<delete><query>-(deptId:[1 TO *])</query></delete>' $ curl http://<host>:<port>/solr/update -H 'Content-Type:text/xml' -d '<commit/>' But the documents don't get deleted. Solr doesn't return any error message, and its log seems ok. Any idea? -- Alexander Ramos Jardim
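Spelled out, Erik's workaround anchors the negative clause to a match-all query before POSTing it to /solr/update. A minimal sketch of building the request body (the helper name is invented, and a real client should XML-escape the query string):

```python
def delete_by_query_payload(query):
    # Build the XML body for a delete-by-query POST to /solr/update.
    # Minimal sketch: no XML escaping of the query string.
    return "<delete><query>%s</query></delete>" % query

# Anchor the purely negative clause to *:* so the query parser accepts it.
payload = delete_by_query_payload("*:* AND -deptId:[1 TO *]")
```

POST that payload, then a `<commit/>` body, exactly as in the curl commands above.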
Re: Delete by query isn't working for a specific query
Alexander Ramos Jardim wrote: Hey pals, I am trying to delete a couple of documents that don't have any value in a given integer field. This is the command I am executing: $ curl http://<host>:<port>/solr/update -H 'Content-Type:text/xml' -d '<delete><query>-(deptId:[1 TO *])</query></delete>' $ curl http://<host>:<port>/solr/update -H 'Content-Type:text/xml' -d '<commit/>' But the documents don't get deleted. Solr doesn't return any error message, and its log seems ok. Any idea? I think delete-by-query uses the Lucene query parser, right? So you can't do a pure negative query - gotta do a match-all first. - Mark
Delete by query isn't working for a specific query
Hey pals, I am trying to delete a couple of documents that don't have any value in a given integer field. This is the command I am executing: $ curl http://<host>:<port>/solr/update -H 'Content-Type:text/xml' -d '<delete><query>-(deptId:[1 TO *])</query></delete>' $ curl http://<host>:<port>/solr/update -H 'Content-Type:text/xml' -d '<commit/>' But the documents don't get deleted. Solr doesn't return any error message, and its log seems ok. Any idea? -- Alexander Ramos Jardim