Chinese chars are not indexed?

2010-06-26 Thread go canal
Hello, when I enter Chinese chars in the admin console to search for matching documents, it does not return any, though I have uploaded some documents that have Chinese chars. I guess the Chinese characters are not indexed. Is there any configuration I need to make in Solr? rgds, canal

Re: How to index rich document with XML payload?

2010-06-26 Thread go canal
Simple code like this: File file = new File ("test.pdf"); InputStream input = new FileInputStream(file); Metadata metadata = new Metadata (); ContentHandler handler = new BodyContentHandler(); AutoDetectParser parse = new AutoDetectParser(); parse.parse(input, handler, metadata); input.cl

Re: NGramFilterFactory usage

2010-06-26 Thread Indika Tantrigoda
Hello, Applying the NGramFilterFactory for analyzer type="query" didn't solve the issue. From the examples I've seen it is only necessary to have the NGramFilterFactory at index time, right? Regards, Indika On 27 June 2010 01:14, Indika Tantrigoda wrote: > Hi all, > > I've been working with S

Re: How to index rich document with XML payload?

2010-06-26 Thread go canal
Hi, I just started using Solr. I am using the SolrJ client, but uploading the file directly to Solr. I think we can use Tika in our code first. Here I send the file directly to Solr which will do the text extraction: CommonsHttpSolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/so

Re: [ANN] Solr 1.4.1 Released

2010-06-26 Thread Ken Krugler
On Jun 26, 2010, at 5:18pm, Jason Chaffee wrote: It appears the 1.4.1 version was deployed with a new maven groupId. For example, if you are trying to download solr-core, here are the differences between 1.4.0 and 1.4.1. 1.4.0 groupId: org.apache.solr artifactId: solr-core 1.4.1 groupId: or

URLDataSource

2010-06-26 Thread Jason Chaffee
I would like to use the URLDataSource to make RESTful calls to get content and only re-index when content changes. This means using http headers to make a request and using the response headers to determine when to make the request. For example, Request Headers: Accept: application/xml if-modifi
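
The conditional-request idea described above can be sketched in plain Java. This is only an illustration under stated assumptions: it assumes the truncated request header is If-Modified-Since, the class and method names (ConditionalFetchSketch, shouldReindex, openConditionalGet) are hypothetical, and nothing here is part of Solr's URLDataSource.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper for illustration; not part of Solr or its URLDataSource.
final class ConditionalFetchSketch {

    // True when the server returned a fresh body (200 OK) that should be re-indexed.
    // A 304 Not Modified response means the previously indexed copy is still current.
    static boolean shouldReindex(int statusCode) {
        return statusCode == HttpURLConnection.HTTP_OK;
    }

    // Prepares a GET that asks for XML and only wants a body if the resource
    // changed after lastIndexedMillis (sent as the If-Modified-Since header).
    // The connection is not opened until the caller reads the response.
    static HttpURLConnection openConditionalGet(URL url, long lastIndexedMillis) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        conn.setRequestProperty("Accept", "application/xml");
        conn.setIfModifiedSince(lastIndexedMillis);
        return conn;
    }

    public static void main(String[] args) {
        // No network access here; this only shows the decision for the two status codes.
        System.out.println("200 -> reindex? " + shouldReindex(HttpURLConnection.HTTP_OK));
        System.out.println("304 -> reindex? " + shouldReindex(HttpURLConnection.HTTP_NOT_MODIFIED));
    }
}
```

A caller would pass the status from conn.getResponseCode() to shouldReindex and skip the update when it returns false.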

REST calls

2010-06-26 Thread Jason Chaffee
The solr docs say it is RESTful, yet it seems that it doesn't use http headers in a RESTful way. For example, it doesn't seem to use the Accept: request header to determine the media-type to be returned. Instead, it requires a query parameter to be used in the URL. Also, it doesn't seem to us

RE: [ANN] Solr 1.4.1 Released

2010-06-26 Thread Jason Chaffee
It appears the 1.4.1 version was deployed with a new maven groupId. For example, if you are trying to download solr-core, here are the differences between 1.4.0 and 1.4.1. 1.4.0 groupId: org.apache.solr artifactId: solr-core 1.4.1 groupId: org.apache.solr.solr artifactId: solr-core Was this cha

Re: example solr xml working fine but my own xml files not working

2010-06-26 Thread codar
To add to my last, when I query *:* I get the results I expect, but if I query a term (ZS2) it doesn't find any matches. I must be missing something simple. I'm new to solr, so it's possible I just don't understand how to query it. Jeff On Sat, Jun 26, 2010 at 6:42 PM, Jeff Kemble wrote:

Re: example solr xml working fine but my own xml files not working

2010-06-26 Thread codar
Thanks, Erik. I downloaded Luke and pointed it to my index. I can see the data I indexed via Luke, but still can't query it through the admin console. I queried for ZS1 and still got no results, but when I look at the index via Luke, I see the document was indexed. I'm stumped. Jeff On S

How to index rich document with XML payload?

2010-06-26 Thread Steve Johnson
Greetings, I am new to Solr, but have gotten as far as successfully indexing documents both by sending XML describing the document and by sending the document itself using "update/extract". What I want to do now is, in effect, do both of these on each of my documents. I want to be able to h

Re: NGramFilterFactory usage

2010-06-26 Thread Robert Muir
yes, you need to use ngramfilter at query-time too. On Sat, Jun 26, 2010 at 3:55 PM, Indika Tantrigoda wrote: > Hi all, > > I've been working with Solr for a while and the search components work as > expected. > Recently I've had the requirement to do searching on partial words and I > set up the NG
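
For readers unfamiliar with n-gram filtering, here is a minimal sketch of what such a filter emits for a single term. It is not Solr's NGramFilterFactory implementation: the class NGramSketch, the method ngrams, and the gram sizes 2 and 3 are assumptions chosen for illustration, and the order of tokens may differ from what Lucene produces. Which analyzer chains should include the filter depends on the schema, which is not shown in this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of n-gram generation; not Solr or Lucene code.
final class NGramSketch {

    // Returns every substring of term whose length is between minGram and
    // maxGram inclusive, ordered by start position and then by length.
    static List<String> ngrams(String term, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        for (int start = 0; start < term.length(); start++) {
            for (int len = minGram; len <= maxGram && start + len <= term.length(); len++) {
                grams.add(term.substring(start, start + len));
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        // Grams of sizes 2 and 3 for the term "solr".
        System.out.println(ngrams("solr", 2, 3));
    }
}
```

A term can only match if the tokens produced on the query side line up with the tokens stored on the index side, which is why the analyzer configuration on both sides matters.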

NGramFilterFactory usage

2010-06-26 Thread Indika Tantrigoda
Hi all, I've been working with Solr for a while and the search components work as expected. Recently I've had the requirement to do searching on partial words and I set up the NGramFilterFactory. My schema.xml is as follows:

Re: example solr xml working fine but my own xml files not working

2010-06-26 Thread Erick Erickson
The first place you should go for this type of question is the solr admin page and look at what's actually in your index. A very handy tool for this is also Luke. Get a copy of it (google Lucene Luke) and point it at your index and poke around to see if what's actually in your index is what you ex

upload PDF using curl

2010-06-26 Thread go canal
Hello, I am following the example at http://wiki.apache.org/solr/ExtractingRequestHandler. I am using Windows XP, curl 7.19.5, Solr 1.4.1. The command is: curl http://localhost:8983/solr/update/extract?literal.id=doc1&commit=true' -F "myfi...@tutorial.pdf" I got error: HTTP Error: 400. mis

Re: phrase highlighting

2010-06-26 Thread Lukas Kahwe Smith
On 26.06.2010, at 16:30, Lukas Kahwe Smith wrote: > > On 26.06.2010, at 16:22, Koji Sekiguchi wrote: > >> (10/06/26 22:19), Lukas Kahwe Smith wrote: >>> Hi, >>> >>> From googling and looking at jira tickets it seems like phrase highlighting >>> should work out of the box, but even enabling it

Re: phrase highlighting

2010-06-26 Thread Lukas Kahwe Smith
On 26.06.2010, at 16:22, Koji Sekiguchi wrote: > (10/06/26 22:19), Lukas Kahwe Smith wrote: >> Hi, >> >> From googling and looking at jira tickets it seems like phrase highlighting >> should work out of the box, but even enabling it manually didn't get me the >> desired result: >> http://resolu

Re: phrase highlighting

2010-06-26 Thread Koji Sekiguchi
(10/06/26 22:19), Lukas Kahwe Smith wrote: Hi, From googling and looking at jira tickets it seems like phrase highlighting should work out of the box, but even enabling it manually didn't get me the desired result: http://resolutionfinder.org/search?q=%22security+council%22&=&tm=any&s=Search g

phrase highlighting

2010-06-26 Thread Lukas Kahwe Smith
Hi, From googling and looking at jira tickets it seems like phrase highlighting should work out of the box, but even enabling it manually didn't get me the desired result: http://resolutionfinder.org/search?q=%22security+council%22&=&tm=any&s=Search generates the following query: INFO: [Clause_

Re: example solr xml working fine but my own xml files not working

2010-06-26 Thread codar
I'm struggling with this very same problem. I can index the example files fine. When I try adding a custom file, it appears to index without issue; but I get no search results via the admin console. I've also tried modifying one of the files (monitor.xml); it also did not update. I'm using solr

Re: Recommended MySQL JDBC driver

2010-06-26 Thread Marc Sturlese
I suppose you use BatchSize=-1 to index that amount of data. From connector version 5.1.7 onward there's this param: netTimeoutForStreamingResults. The default value is 600. Increasing that may help (2400 for example?) -- View this message in context: http://lucene.472066.n3.nabble.com/Recommended-MySQL

Re: Setting many properties for a multivalued field. Schema.xml ? External file?

2010-06-26 Thread Saïd Radhouani
Thanks Geert-Jan, this is indeed very helpful. The delimiters I gave were just for the sake of the example. I will use an infrequent delimiter. Cheers, -Saïd On Jun 26, 2010, at 1:53 PM, Geert-Jan Brits wrote: >> If I understand your suggestion correctly, you said that there's NO need to > have

Re: Setting many properties for a multivalued field. Schema.xml ? External file?

2010-06-26 Thread Geert-Jan Brits
btw, be careful with your delimiters: pic_url may possibly contain a '-', etc. 2010/6/26 Geert-Jan Brits > >If I understand your suggestion correctly, you said that there's NO need > to have many Dynamic Fields; instead, we can have one definitive field name, > which can store a long string (conc
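
The delimiter warning above can be shown with a small sketch. It assumes a packed value of the form url, delimiter, caption. The class DelimiterSketch, the example URL, and the choice of the ASCII unit separator (U+001F) as an infrequent delimiter are all hypothetical and do not come from the thread.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of the delimiter pitfall; not Solr code.
final class DelimiterSketch {

    // Splits a packed "url<delimiter>caption" value on every occurrence of the delimiter.
    // Pattern.quote keeps the delimiter from being interpreted as a regular expression.
    static String[] unpack(String packed, String delimiter) {
        return packed.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        String url = "http://example.com/my-pic.jpg"; // made-up URL that contains '-'
        String caption = "Sunset";

        // With '-' as the delimiter the URL itself is cut in two.
        System.out.println(unpack(url + "-" + caption, "-").length);

        // With a character unlikely to occur in the data the two parts survive intact.
        System.out.println(unpack(url + "\u001F" + caption, "\u001F").length);
    }
}
```

Any delimiter works only as long as it cannot appear inside the values being joined, so the safest choice depends on what the URLs and captions may contain.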

Re: Setting many properties for a multivalued field. Schema.xml ? External file?

2010-06-26 Thread Geert-Jan Brits
>If I understand your suggestion correctly, you said that there's NO need to have many Dynamic Fields; instead, we can have one definitive field name, which can store a long string (concatenation of information about tens of pictures), e.g., using "-" and "%" delimiters: pic_url_value1-pic_caption

Re: Setting many properties for a multivalued field. Schema.xml ? External file?

2010-06-26 Thread Saïd Radhouani
Thanks Geert-Jan for the detailed answer. Actually, I don't search at all on these fields. I'm only filtering (w/ vs w/o pic) and sorting (based on the number of pictures). Thus, your suggestion of adding an extra field NrOfPics [0,N] would be the best solution. Regarding the other suggestion:

Re: Setting many properties for a multivalued field. Schema.xml ? External file?

2010-06-26 Thread Geert-Jan Brits
You can treat dynamic fields like any other field, so you can facet, sort, filter, etc. on these fields (afaik). I believe the confusion arises because the use case for dynamic fields sometimes seems to be ill-understood, i.e.: to be able to use them to do some kind of wildcard search, e.g.: search for a

Re: Setting many properties for a multivalued field. Schema.xml ? External file?

2010-06-26 Thread Saïd Radhouani
Thanks so much Otis. This is working great. Now, I'm trying to make facets on pictures: display doc w/ pic vs. doc w/o pic To the best of my knowledge, everyone is saying that faceting cannot be done on dynamic fields (only on definitive field names). Thus, I tried the following and it's workin