Tomcat EXE Source Code

2011-02-25 Thread rajini maski
Can anybody help me get the source code of the Tomcat exe file, i.e., the source code of the installation exe? Thanks.

Re: CUSTOM JSP FOR APACHE SOLR

2011-02-25 Thread Paul Libbrecht
From looking at the source, I see only the following option available for me to write JSPs that display search results: adjust SolrDispatchFilter to treat a JspResponseWriter specially by: - enriching the http-request with the search queries and responses - forwarding the request down the chain It

Re: DIH regex remove email + extract url

2011-02-25 Thread Rosa (Anuncios)
Hi Koji, My question was more about the Solr DIH syntax. It doesn't work with the new regex either. Especially the syntax for this: <field column="source" xpath="/product/url" regex="http:\/\/(.*?)\/(.*)" /> <--- Is it correct? (not the regex, the syntax)? Example:
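
Since Solr runs on Java, the pattern quoted above can be tried outside DIH with plain java.util.regex. The following is only a minimal sketch: the class name, the helper and the sample URL are made up for illustration, and it says nothing about whether the DIH field syntax itself is right.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UrlRegexSketch {
    // Same pattern as in the post; "\/" is simply an escaped "/".
    private static final Pattern URL = Pattern.compile("http:\\/\\/(.*?)\\/(.*)");

    /** Returns {group 1, group 2} for a matching URL, or null when the pattern does not match. */
    static String[] split(String url) {
        Matcher m = URL.matcher(url);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        // Hypothetical sample value, not taken from the thread.
        String[] parts = split("http://www.example.com/products/42");
        System.out.println(parts[0]); // host part (group 1)
        System.out.println(parts[1]); // remainder of the URL (group 2)
    }
}
```

Note that the pattern requires a slash after the host, so a bare "http://example.com" does not match at all.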

Re: Ramdirectory

2011-02-25 Thread Matt Weber
I have used this without issue. In the example solrconfig.xml replace this line: <directoryFactory name="DirectoryFactory" class="${solr.directoryFactory:solr.StandardDirectoryFactory}"/> with this one: <directoryFactory name="DirectoryFactory" class="solr.RAMDirectoryFactory"/> Thanks, Matt Weber On

upgrading to Tika 0.9 on Solr 1.4.1

2011-02-25 Thread jo
I have tried the steps indicated here: http://wiki.apache.org/solr/ExtractingRequestHandler and when I try to parse a document nothing happens, not even an error. I have copied the jar files everywhere, and still nothing. Can anyone give me the steps

LetterTokenizer + EdgeNGram + apostrophe in query = invalid result

2011-02-25 Thread Matt Weber
I have the following field defined in my schema: <fieldType name="ngram" class="solr.TextField" positionIncrementGap="100"> <analyzer type="index"> <tokenizer class="solr.LetterTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.EdgeNGramFilterFactory"
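
One way to picture what such an analysis chain does to a word containing an apostrophe is the plain-Java simulation below. It is a simplified sketch, not Lucene's implementation: it assumes that a letter-based tokenizer breaks at every non-letter character, and the gram sizes (1 to 3) are hypothetical because the post is cut off before the filter's attributes.

```java
import java.util.ArrayList;
import java.util.List;

final class EdgeNGramSketch {
    /** Splits on every non-letter and lower-cases, roughly like a letter tokenizer plus lower-case filter. */
    static List<String> letterTokens(String text) {
        List<String> tokens = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // A non-letter (such as an apostrophe) ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    /** Front-edge n-grams of one token, for lengths minGram..maxGram. */
    static List<String> edgeNGrams(String token, int minGram, int maxGram) {
        List<String> grams = new ArrayList<String>();
        for (int n = minGram; n <= Math.min(maxGram, token.length()); n++) {
            grams.add(token.substring(0, n));
        }
        return grams;
    }

    public static void main(String[] args) {
        // Hypothetical input; the apostrophe splits it into two tokens.
        for (String token : letterTokens("O'Reilly")) {
            System.out.println(token + " -> " + edgeNGrams(token, 1, 3));
        }
    }
}
```

Under these assumptions the apostrophe yields two separate tokens, each with its own set of prefixes, which may be relevant when the query side is analyzed differently from the index side.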

Partial search extremely slow

2011-02-25 Thread javaxmlsoapdev
Since my users wanted partial search functionality, I had to introduce the following. I have declared two EdgeNGram filters, one with side "front" and one with side "back", since they wanted partial search to work from either side. <fieldType name="edgytext" class="solr.TextField"> <analyzer> <tokenizer

Re: Solr running on many sites

2011-02-25 Thread Stefan Matheis
Hi Grant, Multi Sites == Multi Cores? :) Have a look at http://wiki.apache.org/solr/MultiCore Regards Stefan On Fri, Feb 25, 2011 at 3:15 AM, Grant Longhurst grant.longhu...@ecorner.com.au wrote: Hi, We are an e-commerce service provider and are looking at using solr for all the site

Re: Make syntax highlighter caseinsensitive

2011-02-25 Thread Tarjei Huse
Hi, On 02/25/2011 02:06 AM, Koji Sekiguchi wrote: (11/02/24 20:18), Tarjei Huse wrote: Hi, I got an index where I have two fields, body and caseInsensitiveBody. Body is indexed and stored while caseInsensitiveBody is just indexed. The idea is that by not storing the caseInsensitiveBody I

Re: Tomcat EXE Source Code

2011-02-25 Thread Jan Høydahl
Why do you want it? Try asking on the Tomcat list :) -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 25. feb. 2011, at 09.16, rajini maski wrote: Can anybody help me to get the source code of the Tomcat exe file i.e, source code of the installation

Re: upgrading to Tika 0.9 on Solr 1.4.1

2011-02-25 Thread Jan Høydahl
Your best bet is perhaps upgrading to latest 1.4 branch, i.e. 1.4.2-dev (http://svn.apache.org/repos/asf/lucene/solr/branches/branch-1.4/) It includes Tika 0.8-SNAPSHOT and is a compatible drop-in (war/jar) replacement with lots of other bug fixes you'd also like (check changes.txt). svn co

Re: Tomcat EXE Source Code

2011-02-25 Thread rajini maski
I am trying to configure multiple Tomcat instances, with as many services configured. Right now that particular Tomcat exe lets me create only one. If I run the same exe again and try to configure it at another destination folder, it throws an exception because the service already exists. How can I fix

Re: upgrading to Tika 0.9 on Solr 1.4.1

2011-02-25 Thread Markus Jelsma
You don't want to use 0.8 if you're parsing PDF. Your best bet is perhaps upgrading to latest 1.4 branch, i.e. 1.4.2-dev (http://svn.apache.org/repos/asf/lucene/solr/branches/branch-1.4/) It includes Tika 0.8-SNAPSHOT and is a compatible drop-in (war/jar) replacement with lots of other bug

Re: DIH regex remove email + extract url

2011-02-25 Thread Koji Sekiguchi
Hi Rosa, Are you sure you have transformer="RegexTransformer" in your <entity/>? My question was more about the solr DIH syntax. It doesn't work either with the new regex. Especially the syntax for this: <field column="source" xpath="/product/url" regex="http:\/\/(.*?)\/(.*)" /> <--- Is it correct?

Re: CUSTOM JSP FOR APACHE SOLR

2011-02-25 Thread Erik Hatcher
On Feb 1, 2011, at 08:58 , Estrada Groups wrote: Has anyone noticed the rails application that installs with Solr4.0? I am interested to hear some feedback on that one... I guess you're talking about the client/ruby/flare stuff? It's been untouched for quite a while and has not been

Re: Tomcat EXE Source Code

2011-02-25 Thread Gora Mohanty
On Fri, Feb 25, 2011 at 3:42 PM, rajini maski rajinima...@gmail.com wrote: I am trying to configure multiple Tomcat instances, with as many services configured. Right now that particular Tomcat exe lets me create only one. If I run the same exe again and try to configure it at another

Re: problem when search grouping word

2011-02-25 Thread Chamnap Chhorn
Any idea? On Thu, Feb 24, 2011 at 6:49 PM, Chamnap Chhorn chamnapchh...@gmail.com wrote: There are many product names. How could I list them all, and the list is growing fast as well? On Thu, Feb 24, 2011 at 5:25 PM, Grijesh pintu.grij...@gmail.com wrote: may synonym will help -

solr score issue

2011-02-25 Thread Bagesh Sharma
Hi, can anyone explain to me how this score is calculated? I am searching for "software engineer" using the dismax handler. A total of 477 documents are indexed and the query returns 28 results. The query is: q=software+engineer&fq=location%3Adelhi The dismax setting is: <str name="qf"

Re: solr admin result page error

2011-02-25 Thread Bernd Fehling
Hi Markus, the result of my investigation is that Lucene currently can only handle UTF-8 code within BMP [Basic Multilingual Plane] (plane 0) = 0x. Any code above BMP might end in unpredictable results which is bad. If you get invalid UTF-8 from the index and use wt=xml it gives the error
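
For readers unfamiliar with the term, the sketch below shows how Java itself represents a character outside the BMP: as two UTF-16 chars (a surrogate pair) but a single code point, taking four bytes in UTF-8. It only illustrates the encoding; it makes no claim about how Lucene or Solr handle such input, and the sample character U+1D11E is an arbitrary choice.

```java
import java.nio.charset.StandardCharsets;

final class SupplementaryCharSketch {
    /** Number of Unicode code points in the string (not UTF-16 chars). */
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    /** Number of bytes the string occupies when encoded as UTF-8. */
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // U+1D11E (musical G clef) lies outside the BMP; chosen only as an example.
        String clef = new String(Character.toChars(0x1D11E));
        System.out.println(clef.length());    // UTF-16 chars
        System.out.println(codePoints(clef)); // code points
        System.out.println(utf8Length(clef)); // UTF-8 bytes
    }
}
```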

Re: Tomcat EXE Source Code

2011-02-25 Thread Adam Estrada
Some of these links may help... http://www.google.com/search?client=safari&rls=en&q=apache+tomcat+download&ie=UTF-8&oe=UTF-8 Adam On Feb 25, 2011, at 3:16 AM, rajini maski wrote: Can anybody help me to get the source code of the Tomcat exe file i.e, source code of the installation

Re: Make syntax highlighter caseinsensitive

2011-02-25 Thread Koji Sekiguchi
(11/02/25 18:30), Tarjei Huse wrote: Hi, On 02/25/2011 02:06 AM, Koji Sekiguchi wrote: (11/02/24 20:18), Tarjei Huse wrote: Hi, I got an index where I have two fields, body and caseInsensitiveBody. Body is indexed and stored while caseInsensitiveBody is just indexed. The idea is that by not

Re: Tomcat EXE Source Code

2011-02-25 Thread Jan Høydahl
I am trying to configure multiple Tomcat instances, with as many services configured. Right now that particular Tomcat exe lets me create only one. If I run the same exe again and try to configure it at another destination folder, it throws an exception because the service already exists. How can I

Re: solr score issue

2011-02-25 Thread Jayendra Patil
Check the "Need help in understanding output of searcher.explain() function" thread. http://mail-archives.apache.org/mod_mbox/lucene-java-user/201008.mbox/%3CAANLkTi=m9a1guhrahpeyqaxhu9gta9fjbnr7-8-zi...@mail.gmail.com%3E Regards, Jayendra On Fri, Feb 25, 2011 at 6:57 AM, Bagesh Sharma

Question on writing custom UpdateHandler

2011-02-25 Thread Mark
I am trying to write my own custom UpdateHandler that extends DirectUpdateHandler2. I would like to be able to query the current state of the index within the addDoc method. How would I be able to accomplish this? I tried something like the following but it was a big fat fail as it quickly

Re: Question on writing custom UpdateHandler

2011-02-25 Thread Mark
Or how can I perform a query on the current state of the index from within an UpdateProcessor? Thanks On 2/25/11 6:30 AM, Mark wrote: I am trying to write my own custom UpdateHandler that extends DirectUpdateHandler2. I would like to be able to query the current state of the index within

Re: upgrading to Tika 0.9 on Solr 1.4.1

2011-02-25 Thread Mattmann, Chris A (388J)
Hi Jo, You may consider checking out Tika trunk, where we recently have a Tika JAX-RS web service [1] committed as part of the tika-server module. You could probably wire DIH into it and accomplish the same thing. Cheers, Chris [1] https://issues.apache.org/jira/browse/TIKA-593 On Feb 24,

manually editing spellcheck dictionary

2011-02-25 Thread Tanner Postert
I'm using an index based spellcheck dictionary and I was wondering if there were a way for me to manually remove certain words from the dictionary. Some of my content has some mis-spellings, and for example when I search for the word sherrif (which should be spelled sheriff), it get

Re: manually editing spellcheck dictionary

2011-02-25 Thread Sujit Pal
If the dictionary is a Lucene index, wouldn't it be as simple as delete using a term query? Something like this: IndexReader sdreader = new IndexReader(); sdreader.delete(new Term("word", "sherri")); ... sdreader.optimize(); sdreader.close(); I am guessing your dictionary is built dynamically using

Re: upgrading to Tika 0.9 on Solr 1.4.1

2011-02-25 Thread jo
You guys are great. I will stick to the release version for now, and if I have problems parsing I will give the branch jars a try. The reason I am looking at upgrading Tika is that Tika keeps improving on things like languages, mime type support, and so on. Thanks again, JO -- View

Re: upgrading to Tika 0.9 on Solr 1.4.1

2011-02-25 Thread Darx Oman
Hi, if you want to index PDF files then use Tika 0.6, because 0.7 and 0.8 do not correctly detect the pdfParse

Re: Omitting tf but not positions

2011-02-25 Thread Jan Høydahl
I also have a case (yellow-page) where IDF comes in and destroys the rank. A company listing with a word which occurs in few other listings is not necessarily better than others just because of that. When it gets to the extreme value of IDF=1, we get an artificially high IDF boost. It is not
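
To make the effect concrete, here is a small sketch of the idf formula used by Lucene 3.x's DefaultSimilarity, 1 + ln(numDocs / (docFreq + 1)), next to a flattened variant that gives every term the same weight. The class, the flatIdf helper and the index size are hypothetical; a real change would go into a custom Similarity subclass, which is not shown here, and Lucene's own method returns a float rather than a double.

```java
final class IdfSketch {
    /** idf formula as in Lucene 3.x DefaultSimilarity: 1 + ln(numDocs / (docFreq + 1)). */
    static double idf(int docFreq, int numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    /** Hypothetical flattened alternative: rare and common terms weigh the same. */
    static double flatIdf(int docFreq, int numDocs) {
        return 1.0;
    }

    public static void main(String[] args) {
        int numDocs = 100000; // hypothetical number of listings
        int[] docFreqs = { 1, 10, 1000, 50000 };
        for (int df : docFreqs) {
            System.out.println("docFreq=" + df
                    + " idf=" + idf(df, numDocs)
                    + " flat=" + flatIdf(df, numDocs));
        }
    }
}
```

The rarer the term, the larger the default idf, which is how a word that occurs in very few listings can dominate the ranking.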

Re: Omitting tf but not positions

2011-02-25 Thread Robert Zotter
Jan, You are correct, you'll need your own Similarity class. Have a look at SweetSpotSimilarity (http://lucene.apache.org/java/3_0_3/api/contrib-misc/org/apache/lucene/misc/SweetSpotSimilarity.html) On 2/25/11 10:57 AM, Jan Høydahl wrote: I also have a case (yellow-page) where IDF comes in

Re: Omitting tf but not positions

2011-02-25 Thread Robert Muir
On Fri, Feb 25, 2011 at 1:57 PM, Jan Høydahl jan@cominvent.com wrote: I also have a case (yellow-page) where IDF comes in and destroys the rank. A company listing with a word which occurs in few other listings is not necessarily better than others just because of that. When it gets to the

Re: boosting based on number of terms matched?

2011-02-25 Thread Chris Hostetter
: I'm using the edismax handler, although my question is probably the same for : dismax. When the user types a long query, I use the mm parameter so that : only 75% of terms need to match. This works fine, however, sometimes documents : that only match 75% of the terms show up higher in my

Tika metadata extracted per supported document format?

2011-02-25 Thread Andreas Kemkes
Hello, I've asked this on the Tika mailing list w/o an answer, so apologies for cross-posting. I'm trying to find information that tells me specifically what metadata is provided for the different supported document formats. Unfortunately all I was able to find so far is The Metadata

Re: DIH regex remove email + extract url

2011-02-25 Thread Rosa (Anuncios)
Hi Koji, Yes, of course I have RegexTransformer in my <entity/>. What I'm not sure about is the syntax of this: <field column="source" xpath="/product/url" regex= /> I don't need any other parameter here? Rosa On 25/02/2011 12:21, Koji Sekiguchi wrote: Hi Rosa, Are you sure you have

Re: Tika metadata extracted per supported document format?

2011-02-25 Thread Mattmann, Chris A (388J)
Hi Andreas, In Tika 0.8+, you can run the --list-met-models command from tika-app: java -jar tika-app-version.jar --list-met-models And get a print out of the met keys that Tika supports. Some parsers add their own that aren't part of this met listing, but this is a relatively comprehensive

Re: Tika metadata extracted per supported document format?

2011-02-25 Thread Andreas Kemkes
Hi Chris, Thank you so much - that's a great start. Andreas From: Mattmann, Chris A (388J) chris.a.mattm...@jpl.nasa.gov To: solr-user@lucene.apache.org solr-user@lucene.apache.org Cc: u...@tika.apache.org u...@tika.apache.org Sent: Fri, February 25, 2011

Re: upgrading to Tika 0.9 on Solr 1.4.1

2011-02-25 Thread Andreas Kemkes
According to the Tika release notes, it's fixed in 0.9. Haven't tried it myself. A critical backwards incompatible bug in PDF parsing that was introduced in Tika 0.8 has been fixed. (TIKA-548) Andreas From: Darx Oman darxo...@gmail.com To:

Case insensitive but number sensitive string?

2011-02-25 Thread Jon Drukman
I want a string field that is case insensitive. This is what I tried: <fieldType name="cistring" class="solr.StrField" sortMissingLast="true" omitNorms="true"> <analyzer type="index"> <tokenizer class="solr.LowerCaseTokenizerFactory"/> </analyzer> <analyzer type="query">

Re: upgrading to Tika 0.9 on Solr 1.4.1

2011-02-25 Thread Mattmann, Chris A (388J)
Yep it's fixed in 0.9. Cheers, Chris On Feb 25, 2011, at 2:37 PM, Andreas Kemkes wrote: According to the Tika release notes, it's fixed in 0.9. Haven't tried it myself. A critical backwards incompatible bug in PDF parsing that was introduced in Tika 0.8 has been fixed. (TIKA-548)

Re: Case insensitive but number sensitive string?

2011-02-25 Thread Ahmet Arslan
I want a string field that is case insensitive. This is what I tried: <fieldType name="cistring" class="solr.StrField" sortMissingLast="true" omitNorms="true"> <analyzer type="index"> <tokenizer class="solr.LowerCaseTokenizerFactory"/> </analyzer> <analyzer

Re: Case insensitive but number sensitive string?

2011-02-25 Thread Jon Drukman
Ahmet Arslan iorixxx at yahoo.com writes: I want a string field that is case insensitive. This is what I tried: <fieldType name="cistring" class="solr.StrField" sortMissingLast="true" omitNorms="true"> <analyzer type="index"> <tokenizer

Re: DIH regex remove email + extract url

2011-02-25 Thread Koji Sekiguchi
(11/02/26 5:24), Rosa (Anuncios) wrote: Hi Koji, Yes, of course I have RegexTransformer in my <entity/>. What I'm not sure about is the syntax of this: <field column="source" xpath="/product/url" regex= /> I don't need any other parameter here? Hi Rosa, So I've mentioned groupNames attribute for field

Re: Tika metadata extracted per supported document format?

2011-02-25 Thread Andreas Kemkes
Hi Chris, java -jar tika-app-0.9.jar --list-met-models TikaMetadataKeys PROTECTED RESOURCE_NAME_KEY TikaMimeKeys MIME_TYPE_MAGIC TIKA_MIME_FILE Both 0.8 and 0.9 give me the same list. Is that a configuration issue? I'm a bit unclear if that gets me to what I was looking for - metadata

Re: Tika metadata extracted per supported document format?

2011-02-25 Thread Mattmann, Chris A (388J)
Hi Andreas, java -jar tika-app-0.9.jar --list-met-models TikaMetadataKeys PROTECTED RESOURCE_NAME_KEY TikaMimeKeys MIME_TYPE_MAGIC TIKA_MIME_FILE Both 0.8 and 0.9 give me the same list. Is that a configuration issue? Strange -- those are the only met models you're seeing listed?

Help on query time boosting effect using standardQueryParser

2011-02-25 Thread cyang2010
For the Solr example (exampleDIH), how do I achieve the following with the standard query parser? Search all docs whose name field contains "memory" (primary query logic). Within that result set, boost the docs that match features:battery (boosting logic). Note that I have to use the standard query parser

Re: Help on query time boosting effect using standardQueryParser

2011-02-25 Thread cyang2010
Once I change the query to be: +name:memory features:battery^100 <str name="rawquerystring">+name:memory features:battery^100 </str> <str name="querystring">+name:memory features:battery^100 </str> <str name="parsedquery">+name:memori features:batteri^100.0</str> <str name="parsedquery_toString">+name:memori

Re: Solr running on many sites

2011-02-25 Thread Bill Bell
You would want to evaluate the size and the number of searches, plus how often the index data will need to change. There is no recipe, just good experience. Bill Bell Sent from mobile On Feb 25, 2011, at 3:06 AM, Stefan Matheis matheis.ste...@googlemail.com wrote: Hi Grant, Multi Sites ==

How to handle special character in filter query

2011-02-25 Thread cyang2010
How to handle special characters when constructing a filter query? For example, I want to do something like: http://.fq=genre:ACTION & ADVENTURE How do I handle the space and the & in the filter query part? Thanks. -- View this message in context:
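
One common approach, sketched below under the assumption that the genre value contains a space and an ampersand, is to quote the value as a phrase and then URL-encode the whole filter before appending it to the request. The class and helper names are made up for illustration; only java.net.URLEncoder is a real API here.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class FilterQuerySketch {
    /**
     * Builds an fq parameter: wraps the value in double quotes so it is treated
     * as one phrase, then URL-encodes the whole filter expression.
     * Values that themselves contain quotes or backslashes would need extra escaping.
     */
    static String fqParam(String field, String value) {
        String filter = field + ":\"" + value + "\"";
        try {
            return "fq=" + URLEncoder.encode(filter, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a standard JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fqParam("genre", "ACTION & ADVENTURE"));
        // prints fq=genre%3A%22ACTION+%26+ADVENTURE%22
    }
}
```

Encoding the ampersand as %26 keeps it from being read as a parameter separator, and the space becomes "+".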

Problems with JSP pages?

2011-02-25 Thread Lance Norskog
I'm on Windows Vista, using the trunk. Some of the JSP pages do not execute, but instead Jetty downloads them. solr/admin/get-properties.jsp for example. This is called by the 'JAVA PROPERTIES' button in the main admin page. Is this a known problem/quirk for Windows? Or fallout from a jetty