Re: DataImportHandler TikaEntityProcessor FieldReaderDataSource

2010-02-05 Thread Jorg Heymans
D'oh, thanks for that Paul :-| I suppose schema validation for data-config.xml is already in Jira somewhere? Jorg 2010/2/5 Noble Paul നോബിള്‍ नोब्ळ् noble.p...@corp.aol.com wrong: <datasource name="orablob" type="FieldStreamDataSource" /> right: <dataSource name="orablob"
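The fix in this message comes down to the case of a single element name (datasource vs. dataSource). Until some validation exists, a small check outside Solr can flag that kind of slip. The following is a minimal sketch, not part of Solr: the set of expected element names is an assumption made for illustration, and find_suspect_elements is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Assumption: element names a data-config.xml is expected to use.
# This list is illustrative and is not taken from Solr's source code.
EXPECTED_ELEMENTS = {"dataConfig", "dataSource", "document", "entity", "field", "script"}


def find_suspect_elements(config_xml):
    """Return element names that differ from an expected name only by case."""
    lowered = {name.lower(): name for name in EXPECTED_ELEMENTS}
    suspects = []
    for element in ET.fromstring(config_xml).iter():
        tag = element.tag
        if tag not in EXPECTED_ELEMENTS and tag.lower() in lowered:
            suspects.append(f"{tag} (did you mean {lowered[tag.lower()]}?)")
    return suspects


sample = """
<dataConfig>
  <datasource name="orablob" type="FieldStreamDataSource" />
  <document/>
</dataConfig>
"""
print(find_suspect_elements(sample))
```

A check like this only catches case mistakes in known element names; it says nothing about attributes or nesting.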

Re: Thanks Robert!

2010-02-05 Thread Shalin Shekhar Mangar
On Fri, Feb 5, 2010 at 4:07 AM, Jason Rutherglen jason.rutherg...@gmail.com wrote: Robert, thanks for redoing all the Solr analyzers to the new API! It helps to have many examples to work from, best practices so to speak. +1 Thank you so much Robert! -- Regards, Shalin Shekhar Mangar.

manual adding index information to solr (when indexer application overlooks data)

2010-02-05 Thread Frank van Lingen
Hi, We are using an application on our database to push information to Solr for search. Sometimes this application 'skips' certain articles (we are investigating why this happens). What I would like to know is whether we can manually give Solr the appropriate XML which is usually sent by the

Re: DataImportHandler TikaEntityProcessor FieldReaderDataSource

2010-02-05 Thread Noble Paul നോബിള്‍ नोब्ळ्
Unfortunately, no. On Fri, Feb 5, 2010 at 2:23 PM, Jorg Heymans jorg.heym...@gmail.com wrote: D'oh, thanks for that Paul :-| I suppose schema validation for data-config.xml is already in Jira somewhere? Jorg 2010/2/5 Noble Paul നോബിള്‍ नोब्ळ् noble.p...@corp.aol.com wrong: <datasource

Re: manual adding index information to solr (when indexer application overlooks data)

2010-02-05 Thread Shalin Shekhar Mangar
On Fri, Feb 5, 2010 at 2:57 PM, Frank van Lingen fr...@vanlingen.name wrote: Hi, We are using an application on our database to push information to Solr for search. Sometimes this application 'skips' certain articles (we are investigating why this happens). What I would like to know is whether

Re: DataImportHandler TikaEntityProcessor FieldReaderDataSource

2010-02-05 Thread Jorg Heymans
There is one now :) https://issues.apache.org/jira/browse/SOLR-1758 Cheers, Jorg 2010/2/5 Noble Paul നോബിള്‍ नोब्ळ् noble.p...@corp.aol.com Unfortunately, no. On Fri, Feb 5, 2010 at 2:23 PM, Jorg Heymans jorg.heym...@gmail.com wrote: D'oh, thanks for that Paul :-| I suppose schema

(default) maximum chars per field

2010-02-05 Thread Markus.Rietzler
hi, what is the default maximum number of chars per field? i found a maxChars parameter for copyField but i don't think that this is what i am looking for. we have indexed some documents via tika/solrcell. only the beginning of these documents can be searched. where can i define the maximum size of a

Re: (default) maximum chars per field

2010-02-05 Thread Shalin Shekhar Mangar
On Fri, Feb 5, 2010 at 3:56 PM, markus.rietz...@rzf.fin-nrw.de wrote: hi, what is the default maximum number of chars per field? i found a maxChars parameter for copyField but i don't think that this is what i am looking for. we have indexed some documents via tika/solrcell. only the beginning of

Re: Deploying Solr 1.3 in JBoss 5

2010-02-05 Thread Luca Molteni
By the way, I finally solved it. To deploy Solr 1.3 in JBoss 5, you simply have to remove xercesImpl-2.8.1.jar and xml-apis-1.3.03.jar from the WEB-INF/lib folder of solr.war. Solr will then use the libraries provided by JBoss 5. Thank you again. L.M. On 3 February 2010 10:38, Luca Molteni
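The removal step described above can be scripted instead of done by hand. This is a minimal sketch using only Python's standard zipfile module; the jar names are the ones given in the message, while strip_jars and the output file name in the usage note are hypothetical. It writes a new archive rather than editing solr.war in place.

```python
import zipfile

# Jar names exactly as given in the message above.
CONFLICTING_JARS = {"xercesImpl-2.8.1.jar", "xml-apis-1.3.03.jar"}


def strip_jars(src_war, dst_war, jars=CONFLICTING_JARS):
    """Copy src_war to dst_war, skipping the listed jars under WEB-INF/lib/.

    Returns the archive paths of the entries that were left out.
    """
    removed = []
    with zipfile.ZipFile(src_war) as src, \
            zipfile.ZipFile(dst_war, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            folder, _, name = item.filename.rpartition("/")
            if folder == "WEB-INF/lib" and name in jars:
                removed.append(item.filename)
                continue
            # Copy every other entry unchanged, keeping its metadata.
            dst.writestr(item, src.read(item.filename))
    return removed
```

A call such as strip_jars("solr.war", "solr-jboss5.war") would leave the original war untouched, so it can still be deployed to other containers.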

AW: (default) maximum chars per field

2010-02-05 Thread Markus.Rietzler
ok, i was looking for all types of max but somehow didn't see the maxFieldLength. this is a global parameter, right? can this be defined on a field basis? global would be enough at the moment. thank you -Original Message- From: Shalin Shekhar Mangar

Re: Deploying Solr 1.3 in JBoss 5

2010-02-05 Thread Sascha Szott
Hi Luca, could you add a note to the Wiki page [1]? Thanks! -Sascha [1] http://wiki.apache.org/solr/SolrJBoss Luca Molteni wrote: By the way, I finally solved it. To deploy Solr 1.3 in JBoss 5, you simply have to remove xercesImpl-2.8.1.jar and xml-apis-1.3.03.jar from the WEB-INF/lib

Re: (default) maximum chars per field

2010-02-05 Thread Sascha Szott
markus.rietz...@rzf.fin-nrw.de wrote: ok, i was looking for all types of max but somehow didn't see the maxFieldLength. this is a global parameter, right? can this be defined on a field basis? It's a global parameter counting the maximum number of tokens(!) - not the number of characters
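The distinction between a token limit and a character limit explains why only the beginning of a long document is searchable. The toy model below is a sketch, not Solr's implementation: whitespace splitting stands in for a real analyzer, and index_tokens is a hypothetical name.

```python
def index_tokens(text, max_field_length):
    """Keep only the first max_field_length whitespace-separated tokens."""
    return text.split()[:max_field_length]


doc = "one two three four five six"
kept = index_tokens(doc, max_field_length=4)
print(kept)            # ['one', 'two', 'three', 'four']
print("six" in kept)   # False: tokens past the limit are never indexed
```

Under this model the cutoff depends on how many tokens the analyzer emits, so two documents of the same character length can be truncated at different points.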

AW: Solr not starting JMX

2010-02-05 Thread Jan Simon Winkelmann
: My parameters look like this (running the Solr example): : : java -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=6060 : -Dcom.sun.management.jmxremote.authenticate=false : -Dcom.sun.management.jmxremote.ssl=false -jar start.jar What implementation/version of java are you

Re: Deploying Solr 1.3 in JBoss 5

2010-02-05 Thread Luca Molteni
Done. L.M. On 5 February 2010 12:56, Sascha Szott sz...@zib.de wrote: Hi Luca, could you add a note to the Wiki page [1]? Thanks! -Sascha [1] http://wiki.apache.org/solr/SolrJBoss Luca Molteni wrote: By the way, I finally solved it. To deploy Solr 1.3 in JBoss 5, you simply have

html tag problem while searching

2010-02-05 Thread Ranveer Kumar
Hi All, I have a problem related to HTML tags. Basically, some columns in the database carry HTML tags, for example <p> Hello how are you? </p>. I am indexing them as they are. I am filtering the special characters supported by Solr at query time. Now the problem is that when I search for p, the result is *p

RE: Fundamental questions of how to build up solr for huge portals

2010-02-05 Thread Fuad Efendi
- What's the best way to use Solr to get the best performance for a huge portal with 5000 users that might expand fast? 5000 users: 200 TPS, for instance, equals 12,000 concurrent users (each user makes 1 request per minute), so a single Solr instance is more than enough. Why 200 TPS? It
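As a sanity check on the load figures, the relationship between throughput and user count can be worked out directly. This sketch assumes the rate stated in the message (one request per user per minute); the function names are hypothetical.

```python
def users_supported(tps, requests_per_user_per_minute):
    """Number of users a given throughput (requests/second) can serve."""
    return tps * 60 / requests_per_user_per_minute


def tps_needed(users, requests_per_user_per_minute):
    """Throughput (requests/second) needed for a given number of users."""
    return users * requests_per_user_per_minute / 60


print(users_supported(200, 1))        # 12000.0
print(round(tps_needed(5000, 1), 1))  # 83.3
```

So 5000 users at that rate need well under 200 requests per second, which supports the conclusion that a single instance has headroom.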

Re: How to return filtered tokens as query results?

2010-02-05 Thread Gregg Horan
On Fri, Feb 5, 2010 at 2:31 AM, Ahmet Arslan iori...@yahoo.com wrote: Is there a way to return Solr's analyzed/filtered tokens from a query, rather than the original indexed data? (Ideally at a fairly high level like solrj). TermVectorComponent [1] can do that.

Re: html tag problem while searching

2010-02-05 Thread Ahmet Arslan
I have a problem related to HTML tags. Basically, some columns in the database carry HTML tags, for example <p> Hello how are you? </p>. I am indexing them as they are. I am filtering the special characters supported by Solr at query time. Now the problem is that when I search for p, the result is

Use of solr.ASCIIFoldingFilterFactory

2010-02-05 Thread Yann PICHOT
Hi, I have defined this type in my schema.xml file: <fieldType name="text" class="solr.TextField" positionIncrementGap="100"> <analyzer type="index"> <tokenizer class="solr.WhitespaceTokenizerFactory"/> <filter class="solr.ASCIIFoldingFilterFactory" /> <filter

Re: Use of solr.ASCIIFoldingFilterFactory

2010-02-05 Thread Ahmet Arslan
I tested this query string on the Solr web application: all:chateau. Results (content of the field all): CHATEAU D'AMBOISE; [CHATEAU EN FRANCE, BABELON]; ope dvd rene chateau; CHATEAU DE LA LOIRE; DE CHATEAU EN CHATEAU ENTRE LA LOIRE ET LE CHER; [LE CHATEAU AMBULANT, HAYAO

Re: Use of solr.ASCIIFoldingFilterFactory

2010-02-05 Thread Yann PICHOT
On Fri, Feb 5, 2010 at 4:00 PM, Ahmet Arslan iori...@yahoo.com wrote: I tested this query string on the Solr web application: all:chateau. Results (content of the field all): CHATEAU D'AMBOISE; [CHATEAU EN FRANCE, BABELON]; ope dvd rene chateau; CHATEAU DE LA LOIRE; DE

Re: Use of solr.ASCIIFoldingFilterFactory

2010-02-05 Thread Ahmet Arslan
château is reduced to chateau. I tested it on /admin/analysis.jsp, result: Index Analyzer: château chateau chateau; Query Analyzer: château chateau chateau. Strange. If château is reduced to chateau, then both words should return the same set of documents. Maybe you added that filter to you

Re: Use of solr.ASCIIFoldingFilterFactory

2010-02-05 Thread Ahmet Arslan
Just for your information: since you are using WhitespaceTokenizer, château won't retrieve documents containing château, (with a trailing comma). That's the problem. I just saw that your matched (multivalued) field all contains chateau; that's why all:chateau is matching. And it has château, (with comma) so
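The effect described here, punctuation staying glued to a token when splitting on whitespace only, can be reproduced outside Solr. The sketch below is an approximation and not Solr's code: str.split stands in for WhitespaceTokenizer, and stripping combining marks after NFD normalization stands in for ASCII folding. Both function names are hypothetical.

```python
import unicodedata


def fold_ascii(token):
    """Drop combining accents; a rough stand-in for ASCII folding."""
    decomposed = unicodedata.normalize("NFD", token)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def analyze(text):
    """Split on whitespace only, then fold each token; punctuation stays attached."""
    return [fold_ascii(token) for token in text.split()]


with_comma = analyze("le château, en france")
without_comma = analyze("le château en france")
query = analyze("château")[0]

print(with_comma)               # ['le', 'chateau,', 'en', 'france']
print(query in with_comma)      # False: the indexed token is 'chateau,'
print(query in without_comma)   # True
```

Folding works in both cases; the mismatch comes entirely from the comma, which is why a tokenizer that splits on punctuation would behave differently.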

Re: Use of solr.ASCIIFoldingFilterFactory

2010-02-05 Thread Yann PICHOT
On Fri, Feb 5, 2010 at 4:53 PM, Ahmet Arslan iori...@yahoo.com wrote: Just for your information: since you are using WhitespaceTokenizer, château won't retrieve documents containing château, (with a trailing comma). That's the problem. I just saw that your matched (multivalued) field all contains chateau

RE: Use of solr.ASCIIFoldingFilterFactory

2010-02-05 Thread Steven A Rowe
Hi Yann, I'm pretty sure that this is a character encoding problem. You wrote: I do other tests. I directly write the URL in my browser address bar: http://localhost:8080/solr/select/?q=all:château <http://localhost:8080/solr/select/?q=all:ch%C3%A2teau> and I have results!!! and the URL is now:
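The %C3%A2 in the URL above is the UTF-8 percent-encoding of â. A short sketch with Python's standard urllib.parse shows how the same term encodes differently under another charset, which is one way an encoding mismatch between client and server can arise. Which encoding each side in this thread actually used is not stated in the snippet, so the Latin-1 line is only an illustration.

```python
from urllib.parse import quote, unquote

term = "château"

utf8_encoded = quote(term)                        # UTF-8 is the default encoding
latin1_encoded = quote(term, encoding="latin-1")  # illustrative alternative

print(utf8_encoded)                    # ch%C3%A2teau
print(latin1_encoded)                  # ch%E2teau
print(unquote(utf8_encoded) == term)   # True
```

If one side sends ch%E2teau and the other decodes it as UTF-8, the query term no longer matches what was indexed.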

SV: Running Solr (LucidWorks) as a Windows Server

2010-02-05 Thread Roland Villemoes
Hi All, Thanks a lot for your help in this. I have tried to use the Win32Wrapper, and the Jetty-Service.exe but still no success. I was actually hoping that some of you out there had a running copy so I could see how to configure it. Looks like it must go the Tomcat way...

Re: Solr 1.4: Full import FileNotFoundException

2010-02-05 Thread ranjitr
Hello, I was just wondering if any one had any suggestions? Thank you, Ranjit. ranjitr wrote: Hello, I have noticed that when I run concurrent full-imports using DIH in Solr 1.4, the index ends up getting corrupted. I see the following in the log files (a snippet): record

Re: Thanks Robert!

2010-02-05 Thread Tom Burton-West
+1 And thanks to you both for all your work on CommonGrams! Tom Burton-West Jason Rutherglen-2 wrote: Robert, thanks for redoing all the Solr analyzers to the new API! It helps to have many examples to work from, best practices so to speak. -- View this message in context:

Faceting

2010-02-05 Thread José Moreira
Hello, I'm planning to index a 'content' field for search, and from that field's text content I would like to facet (probably) according to whether the content has e-mails, URLs and, within URLs, URLs to pictures, videos and others. As I'm a relatively new user to Solr, my plan was to regexp the

Boost documents based on a constant value in a field

2010-02-05 Thread Jon Drukman
I have a very simple schema: two integers and two text fields. <fields> <field name="answer_id" type="integer" indexed="true" stored="true" required="true" /> <field name="question" type="text" indexed="true" stored="true"/> <field name="question_source" type="integer" indexed="true" stored="true"/> <field

StreamingUpdateSolrServer hangs

2010-02-05 Thread Stephen Meyer
I am trying to use the StreamingUpdateSolrServer to index a bunch of bibliographic data and it is hanging up every time I run it. Sometimes it hangs after about 100k records (after about 2 minutes), sometimes after 4M records (after about 80 minutes) and all different intervals in between. It

Re: user feedback in solr

2010-02-05 Thread Tomas
I'm responding to this old mail because I implemented something like this, similar to http://wiki.apache.org/solr/SolrSnmp . Maybe we could discuss if this is a good solution. I'm using Solr 1.4 on a JBoss 4.0.5 and Java 1.5. In my particular case, what I'm trying to find out is how often the

How to query multiple fields with phrases

2010-02-05 Thread Jason Chaffee
I am migrating from Lucene to Solr and I am not quite sure how to do what I need to. I need to do a search that will search 3 different fields and combine the results. First, it needs to not break the phrase into tokens, but rather treat it as a phrase for one field. The other fields need to be

Re: Slow QueryComponent.process() when queries have numbers in them

2010-02-05 Thread Simon Wistow
On Wed, Feb 03, 2010 at 07:38:13PM -0800, Lance Norskog said: The debugQuery parameter shows you how the query is parsed into a tree of Lucene query objects. Well, that's kind of what I'm asking - I know how the query is being parsed: <str name="rawquerystring">myers 8e psychology chapter 9</str>

Re: Boost documents based on a constant value in a field

2010-02-05 Thread Wangsheng Mei
You could use the bq parameter to boost documents with question_source 3 first, similar to: http://solr/select?q=your_query&qt=dismax&bq=question_source:3^1000 2010/2/6 Jon Drukman jdruk...@gmail.com I have a very simple schema: two integers and two text fields. <fields> <field name="answer_id"
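A request like the one suggested above is easier to get right when the query string is built programmatically, since the separators and escaping are handled for you. This is a minimal sketch using Python's standard library; the host and the q value are the placeholders from the message, and the boost factor is the one it proposes.

```python
from urllib.parse import urlencode

params = {
    "q": "your_query",               # placeholder query from the message
    "qt": "dismax",
    "bq": "question_source:3^1000",  # boost query suggested above
}

url = "http://solr/select?" + urlencode(params)
print(url)
# http://solr/select?q=your_query&qt=dismax&bq=question_source%3A3%5E1000
```

The colon and caret are percent-encoded in the output; the server decodes them back to question_source:3^1000 before parsing.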

old wildcard highlighting behaviour

2010-02-05 Thread Joe Calderon
hello *, currently with hl.usePhraseHighlighter=true, a query for (joe jack*) will highlight <em>joe jackson</em>. however, after reading the archives, what i'm looking for is the old 1.1 behaviour so that only <em>joe jack</em> is highlighted. is this possible in solr 1.5? thx much --joe

Re: old wildcard highlighting behaviour

2010-02-05 Thread Mark Miller
On iPhone so don't remember exact param I named it, but check wiki - something like hl.highlightMultiTerm - set it to false. - Mark http://www.lucidimagination.com (mobile) On Feb 6, 2010, at 12:00 AM, Joe Calderon calderon@gmail.com wrote: hello *, currently with