Re: Dynamic fields with more than 100 fields inside

2010-02-08 Thread Shalin Shekhar Mangar
On Mon, Feb 8, 2010 at 9:47 PM, Xavier Schepler < xavier.schep...@sciences-po.fr> wrote: > Hey, > > I'm thinking about using dynamic fields. > > I need one or more user specific field in my schema, for example, > "concept_user_*", and I will have maybe more than 200 users using this > feature. > O

Re: TermInfosReader.get ArrayIndexOutOfBoundsException

2010-02-08 Thread Lance Norskog
The index is corrupted. In some places ArrayIndex and NPE are not wrapped as CorruptIndexException. Try running your code with the Lucene assertions on. Add this to the JVM arguments: -ea:org.apache.lucene... On Mon, Feb 8, 2010 at 1:02 PM, Burton-West, Tom wrote: > Hello all, > > After optimiz

Re: unloading a solr core doesn't free any memory

2010-02-08 Thread Lance Norskog
The 'jconsole' program lets you monitor GC operation in real-time. http://java.sun.com/developer/technicalArticles/J2SE/jconsole.html On Mon, Feb 8, 2010 at 8:44 AM, Simon Rosenthal wrote: > What Garbage Collection parameters is the JVM using ?   the memory will not > always be freed immediately

Re: Slow QueryComponent.process() when queries have numbers in them

2010-02-08 Thread Lance Norskog
The single-digit numbers are probably in all of the docs. You might want to rip them out with a SynonymFilter. The more docs that a query finds, the longer the query takes. On Fri, Feb 5, 2010 at 1:23 PM, Simon Wistow wrote: > On Wed, Feb 03, 2010 at 07:38:13PM -0800, Lance Norskog said: >> The debug

Re: Fundamental questions of how to build up solr for huge portals

2010-02-08 Thread Lance Norskog
In general, search is disk-intensive. Lots of RAM (up to 32G at today's prices) and fast hard disks matter. For administration, the single biggest disruptor is updating the index. If you can keep index updates to off-peak hours it will be ok. If not, index on one server and serve queries from anot

Re: source tree for lucene

2010-02-08 Thread Lance Norskog
The Solr trunk/lib directory contains lucene libs as of January 18. (They were checked in on that date.) On Thu, Feb 4, 2010 at 4:28 PM, Joe Calderon wrote: > i want to recompile lucene with > http://issues.apache.org/jira/browse/LUCENE-2230, but im not sure > which source tree to use, i tried us

Re: Is it posible to exclude results from other languages?

2010-02-08 Thread Lance Norskog
There is On Thu, Feb 4, 2010 at 10:07 AM, Raimon Bosch wrote: > > > Yes, It's true that we could do it in index time if we had a way to know. I > was thinking in some solution in search time, maybe measuring the % of > stopwords of each document. Normally, a document of another language won't > h

Re: solr multicore and nfs

2010-02-08 Thread Lance Norskog
Solr generally does not work well over NFS. This looks like a transient NFS error; apps have to assume that NFS will randomly fail and that they have to try again. This may be due to a locking problem. There is a LockFactory class in Lucene that controls how indexes are shared between programs. So

RE: DataImportHandler can't understand query

2010-02-08 Thread Shah, Nirmal
Did you try < > Nirmal Shah Remedy Consultant|Column Technologies|Cell: (630) 244-1648 -Original Message- From: javaxmlsoapdev [mailto:vika...@yahoo.com] Sent: Monday, February 08, 2010 5:42 PM To: solr-user@lucene.apache.org Subject: Re: DataImportHandler can't understand query Note I

Re: DataImportHandler can't understand query

2010-02-08 Thread javaxmlsoapdev
Note I already tried to escape < character with \< but still it throws same error. Any idea? Thanks, javaxmlsoapdev wrote: > > I have a complex query (runs fine in database), which I am trying to > include in DataImportHandler query. Query has case statements with < > in > it > > e.g. >

DataImportHandler can't understand query

2010-02-08 Thread javaxmlsoapdev
I have a complex query (runs fine in database), which I am trying to include in DataImportHandler query. Query has case statements with < > in it e.g. case when (ASSIGNED_TO < > '' and TRANSLATE(ASSIGNED_TO, '', '0123456789')='') DataImportHandler fails to understand query with follow

Re: Indexing / querying multiple data types

2010-02-08 Thread Sven Maurmann
Hi, could you be a little more precise about your configuration? It may be much easier to answer your question then. Cheers, Sven --On Montag, 8. Februar 2010 17:39 + stefan.ma...@bt.com wrote: OK - so I've now got my data-config.xml sorted so that I'm pulling in the expected number o

Re: Embedded Solr problem

2010-02-08 Thread Sven Maurmann
Hi Ranveer, I assume that you have enough knowledge in Java. You should essentially your code for instantiating the server (depending on what you intend to do this may be done in a separate class or in a method of the class doing the queries). Then you use this instance to handle all the queries

TermInfosReader.get ArrayIndexOutOfBoundsException

2010-02-08 Thread Burton-West, Tom
Hello all, After optimizing rather large indexes on 10 shards (each index holds about 500,000 documents and is about 270-300 GB in size) we started getting intermittent TermInfosReader.get() ArrayIndexOutOfBounds exceptions. The exceptions sometimes seem to occur on all 10 shards at the sam

Re: Trouble parsing XML from replication?command=status

2010-02-08 Thread Jason Rutherglen
javabin parses fine, which leads me to believe there's a bug lurking... Though I'm not going to spend time solving it. On Mon, Feb 8, 2010 at 12:18 PM, Jason Rutherglen wrote: > Via Firefox on Ubuntu I downloaded the results of > replication?command=status to a file, then wrote a little app to parse

DataImportHandler

2010-02-08 Thread Sean Timm
It looks like the dataimporter.functions.escapeSql(String) function escapes quotes, but fails to escape '\' characters which are problematic especially when the field value ends in a \. Also, on failure, I get an alarming notice of a possible resource leak. I couldn't find Jira issues for eit
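To illustrate the trailing-backslash problem described above, here is a minimal sketch of an escaper that handles backslashes before quotes. It is not the DataImportHandler implementation: the class and method names are hypothetical, and it assumes a database that treats a backslash as an escape character inside string literals (as MySQL does in its default mode). Where the driver allows it, bound parameters are usually the safer route.

```java
class SqlLiteralEscape {
    // Hypothetical helper, not part of Solr or DIH.
    // Backslashes are doubled first, so that the quotes doubled in the
    // second step are not affected by an earlier backslash.
    static String escape(String value) {
        return value
                .replace("\\", "\\\\") // one backslash becomes two
                .replace("'", "''");   // one single quote becomes two
    }

    public static void main(String[] args) {
        // A value ending in a backslash would otherwise swallow the
        // closing quote of the SQL literal it is embedded in.
        String raw = "C:\\temp\\";
        System.out.println("'" + escape(raw) + "'");
    }
}
```

Under these assumptions, a value that ends in a backslash no longer escapes the closing quote of the literal it is placed in.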

Re: Collating results from multiple indexes

2010-02-08 Thread Jan Høydahl / Cominvent
Hi, There is no JOIN functionality in Solr. The common solution is either to accept the high volume update churn, or to add client side code to build a "join" layer on top of the two indices. I know that Attivio (www.attivio.com) have built some kind of JOIN functionality on top of Solr in thei

Trouble parsing XML from replication?command=status

2010-02-08 Thread Jason Rutherglen
Via Firefox on Ubuntu I downloaded the results of replication?command=status to a file, then wrote a little app to parse out the XML. Unfortunately it's not parsing. I'm wondering if it's because it's in XML, which nothing in Solr parses (SnapPuller for example is using javabin). Caused by: java

Re: Call URL, simply parse the results using SolrJ

2010-02-08 Thread Jason Rutherglen
Ahmet, Thanks, though that isn't quite what I was going for, and it's resolved besides... On Mon, Feb 8, 2010 at 10:24 AM, Ahmet Arslan wrote: >> So here's what happens if I pass in a >> URL with parameters, SolrJ chokes: >> >> Exception in thread "main" java.lang.RuntimeException: >> Invalid ba

RE: Multi-word synonyms containing commas

2010-02-08 Thread Agethle, Matthias
Ok, that works (now I found it also in the example synonyms-file...) But what if I overwrite the synonyms-file after SOLR-startup? Is core-reloading the only way to do this? I think of these steps: 1. Generate new synonym-file 2. Reload core and wait a minute 3. Re-index (as I'm using synonyms

Re: Call URL, simply parse the results using SolrJ

2010-02-08 Thread Ahmet Arslan
> So here's what happens if I pass in a > URL with parameters, SolrJ chokes: > > Exception in thread "main" java.lang.RuntimeException: > Invalid base > url for solrj.  The base URL must not contain > parameters: > http://locahost:8080/solr/main/select?q=video&qt=dismax You can't pass url with pa
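Since the exception says the base URL must not contain parameters, one way to picture the fix is to split the full request URL into a parameter-free base and a map of query parameters, which can then be handed to the client separately. The sketch below uses only the JDK. The class and method names are hypothetical, and stripping a trailing "/select" is an assumption about how the core URL relates to the request handler path.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

class SolrUrlSplit {
    // Drops the query string and, as an assumption, a trailing "/select"
    // handler path, leaving something usable as a client base URL.
    static String coreUrl(String url) {
        int q = url.indexOf('?');
        String path = q < 0 ? url : url.substring(0, q);
        String handler = "/select";
        return path.endsWith(handler)
                ? path.substring(0, path.length() - handler.length())
                : path;
    }

    // Parses the query string into decoded key/value pairs, in order.
    static Map<String, String> queryParams(String url) {
        Map<String, String> params = new LinkedHashMap<>();
        String query = URI.create(url).getRawQuery();
        if (query == null || query.isEmpty()) {
            return params;
        }
        try {
            for (String pair : query.split("&")) {
                int eq = pair.indexOf('=');
                String key = eq < 0 ? pair : pair.substring(0, eq);
                String value = eq < 0 ? "" : pair.substring(eq + 1);
                params.put(URLDecoder.decode(key, "UTF-8"),
                           URLDecoder.decode(value, "UTF-8"));
            }
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a JVM; rethrow unchecked.
            throw new IllegalStateException(e);
        }
        return params;
    }

    public static void main(String[] args) {
        String url = "http://localhost:8080/solr/main/select?q=video&qt=dismax";
        System.out.println(coreUrl(url));
        System.out.println(queryParams(url));
    }
}
```

The base string would go to the client constructor and the map entries would be set as request parameters on the query object.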

Re: Call URL, simply parse the results using SolrJ

2010-02-08 Thread Jason Rutherglen
Here's what I did to resolve this: XMLResponseParser parser = new XMLResponseParser(); URL urlo = new URL(url); InputStreamReader isr = new InputStreamReader(urlo.openConnection().getInputStream()); NamedList namedList = parser.processResponse(isr); QueryResponse response = new QueryResponse(named

Re: Call URL, simply parse the results using SolrJ

2010-02-08 Thread Jason Rutherglen
So here's what happens if I pass in a URL with parameters, SolrJ chokes: Exception in thread "main" java.lang.RuntimeException: Invalid base url for solrj. The base URL must not contain parameters: http://locahost:8080/solr/main/select?q=video&qt=dismax at org.apache.solr.client.solrj.im

Re: Multi-word synonyms containing commas

2010-02-08 Thread Ahmet Arslan
> Hi, > > is it possible to have a synonym file where single synonyms > can also contain commas, e.g. names like "Washington, > George". Sure, you just need to escape that comma. e.g. Washington\, George, wg a\,a => b\,b
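The escaping rule above can be pictured with a small sketch that splits one side of a synonym rule on commas that are not preceded by a backslash. This is only an illustration of the idea, not Solr's actual synonym parser; the class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

class SynonymCommaSplit {
    // Splits on unescaped commas. A backslash makes the next character
    // literal, so "\," stays inside the current term as a plain comma.
    static List<String> splitTerms(String rule) {
        List<String> terms = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < rule.length(); i++) {
            char c = rule.charAt(i);
            if (c == '\\' && i + 1 < rule.length()) {
                current.append(rule.charAt(++i)); // keep escaped char as-is
            } else if (c == ',') {
                terms.add(current.toString().trim()); // term boundary
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        terms.add(current.toString().trim());
        return terms;
    }

    public static void main(String[] args) {
        // One term per line, since a term may itself contain a comma.
        for (String term : splitTerms("Washington\\, George, wg")) {
            System.out.println(term);
        }
    }
}
```

With the rule from the reply, the escaped comma stays inside the first term and the unescaped one separates it from "wg".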

Indexing / querying multiple data types

2010-02-08 Thread stefan.maric
OK - so I've now got my data-config.xml sorted so that I'm pulling in the expected number of indexed documents for my two data sets So I've defined two entities (name1 & name2) and they both make use of the same fields -- I'm not sure if this is a good thing to have done When I run a query I

Call URL, simply parse the results using SolrJ

2010-02-08 Thread Jason Rutherglen
Sorry for the poorly worded title... For SOLR-1761 I want to pass in a URL and parse the query response... However it's non-obvious to me how to do this using the SolrJ API, hence asking the experts here. :)

Re: unloading a solr core doesn't free any memory

2010-02-08 Thread Simon Rosenthal
What Garbage Collection parameters is the JVM using ? the memory will not always be freed immediately after an event like unloading a core or starting a new searcher. 2010/2/8 Tim Terlegård > To me it doesn't look like unloading a Solr Core frees the memory that > the core has used. Is this ho

Dynamic fields with more than 100 fields inside

2010-02-08 Thread Xavier Schepler
Hey, I'm thinking about using dynamic fields. I need one or more user specific field in my schema, for example, "concept_user_*", and I will have maybe more than 200 users using this feature. One user will send and retrieve values from its field. It will then be used to filter result. How w

Re: trouble with DTD

2010-02-08 Thread gwk
On 2/8/2010 3:15 PM, Jens Kapitza wrote: hi @all, using solr and dataimport stuff to import ends up in RuntimeException. Caused by: java.lang.RuntimeException: [com.ctc.wstx.exc.WstxLazyException] com.ctc.wstx.exc.WstxParsingException: Undeclared general entity "eacute" at [row,col {unknown

Re: trouble with DTD

2010-02-08 Thread Erick Erickson
Are you sure this isn't just a typo? "eacute" <- "execute"? On Mon, Feb 8, 2010 at 9:15 AM, Jens Kapitza wrote: > hi @all, > > using solr and dataimport stuff to import ends up in RuntimeException. > > Caused by: java.lang.RuntimeException: [com.ctc.wstx.exc.WstxLazyException] > com.ctc.wstx.exc.

trouble with DTD

2010-02-08 Thread Jens Kapitza
hi @all, using solr and dataimport stuff to import ends up in RuntimeException. Caused by: java.lang.RuntimeException: [com.ctc.wstx.exc.WstxLazyException] com.ctc.wstx.exc.WstxParsingException: Undeclared general entity "eacute" at [row,col {unknown-source}]: [49,23] Browsing the code show

RE: How to configure multiple data import types

2010-02-08 Thread Ken Lane (kenlane)
It sounds like you are doing it correctly, Stefan. Must be something syntactical. The schema.xml and solrconfig.xml do not factor into your problem, only the data-config. I do the same thing you are trying to do. A watered down version is: Hope this helps...

unloading a solr core doesn't free any memory

2010-02-08 Thread Tim Terlegård
To me it doesn't look like unloading a Solr Core frees the memory that the core has used. Is this how it should be? I have a big index with 50 million documents. After loading a core it takes 300 MB RAM. After a query with a couple of sort fields Solr takes about 8 GB RAM. Then I unload (CoreAdmin

Multi-word synonyms containing commas

2010-02-08 Thread Agethle, Matthias
Hi, is it possible to have a synonym file where single synonyms can also contain commas, e.g. names like "Washington, George". Perhaps it would suffice to tell the SynonymFilterFactory to use another separator character (instead of the comma)? I tried this and changed the line where the parseRul

Request time out in solr

2010-02-08 Thread Vijayant Kumar
Hi, I indexed my Solr index with DIH. I am using the WebService::Solr Perl module to update/delete my Solr index at run time from the front end. I want to know how I can set a request timeout, either through Perl on the WebService::Solr side or on the Solr side, so that I can handle a request timeout exception. -- Thank you,

Re: DataImportHandler - case sensitivity of column names

2010-02-08 Thread Shalin Shekhar Mangar
On Mon, Feb 8, 2010 at 3:59 PM, Alexey Serba wrote: > I encountered the problem with Oracle converting column names to upper > case. As a result SolrInputDocument is created with field names in > upper case and "Document [null] missing required field: id" exception > is thrown ( although ID field

Re: How to configure multiple data import types

2010-02-08 Thread Shalin Shekhar Mangar
On Mon, Feb 8, 2010 at 6:03 PM, wrote: > No my views have already taken care of pulling the related data together > > I've indexed my first data set and now want to configure a second > (non-related) data set so that a User can issue a query for data set #1 > whilst another user might be querying

RE: How to configure multiple data import types

2010-02-08 Thread stefan.maric
No my views have already taken care of pulling the related data together I've indexed my first data set and now want to configure a second (non-related) data set so that a User can issue a query for data set #1 whilst another user might be querying for data set #2 Should I be defining multiple

Re: How to configure multiple data import types

2010-02-08 Thread Noble Paul നോബിള്‍ नोब्ळ्
are you referring to nested entities? http://wiki.apache.org/solr/DIHQuickStart#Index_data_from_multiple_tables_into_Solr On Mon, Feb 8, 2010 at 5:42 PM, wrote: > I have got a dataimport request handler configured to index data by selecting > data from a DB view > > I now need to index addition

Re: SV: Running Solr (LucidWorks) as a Windows Server

2010-02-08 Thread Ron Chan
assuming you have the example running from example folder in the standard distribution by doing java -jar start.jar this is what I did to get the same running as a service download the jetty distribution (I used 6.1.21) copy the bin folder over to example copy etc\jetty-win32-service.xml to

How to configure multiple data import types

2010-02-08 Thread stefan.maric
I have got a dataimport request handler configured to index data by selecting data from a DB view I now need to index additional data sets from other views so that I can support other search queries I defined additional definitions within the section of my data-config.xml But I only seem t

DataImportHandler - case sensitivity of column names

2010-02-08 Thread Alexey Serba
I encountered the problem with Oracle converting column names to upper case. As a result SolrInputDocument is created with field names in upper case and "Document [null] missing required field: id" exception is thrown ( although ID field is defined ). I do not specify "field" elements explicitly.

Re: Use of solr.ASCIIFoldingFilterFactory

2010-02-08 Thread Yann PICHOT
Hello, Thanks, your response solved my problem. Thanks for everything, On Sun, Feb 7, 2010 at 4:00 PM, Sven Maurmann wrote: > Hi, > > you might have run into an encoding problem. If you use Tomcat as > the container for Solr you should probably consult the following > > > http://wiki.apache.org/solr