Logic behind Solr creating files in .../data/index path.

2010-09-07 Thread rajini maski
All, when we post data to Solr, the data gets stored in the //data/index path in multiple files with different file extensions. Leaving the extensions aside, I want to know how the number of files is determined. Does anyone know the logic by which these multiple index files are created?

Re: Logic behind Solr creating files in .../data/index path.

2010-09-07 Thread Ryan McKinley
Check: http://lucene.apache.org/java/3_0_2/fileformats.html On Tue, Sep 7, 2010 at 3:16 AM, rajini maski rajinima...@gmail.com wrote: All, While we post data to Solr... The data get stored in   //data/index  path in some multiple files with different file extensions... Not worrying about

Nutch/Solr

2010-09-07 Thread Yavuz Selim YILMAZ
I tried to combine Nutch and Solr and want to ask something. After crawling, Nutch has certain fields, such as content, tstamp, and title. How can I map the content field after crawling? Do I have to change the Lucene code (such as adding an extra field)? Or handle it in the Solr stage? Any suggestions? Thx. -- Yavuz

Re: Nutch/Solr

2010-09-07 Thread Markus Jelsma
Depends on your version of Nutch. At least trunk and 1.1 obey the solrmapping.xml file in Nutch's configuration directory. I'd suggest you start with that mapping file and the Solr schema.xml file shipped with Nutch, as it exactly matches the mapping file. Just restart Solr with the new

Re: Null pointer exception when mixing highlighter & shards & q.alt

2010-09-07 Thread Marc Sturlese
I noticed that long ago. Fixed it by doing this in HighlightComponent finishStage: @Override public void finishStage(ResponseBuilder rb) { boolean hasHighlighting = true; if (rb.doHighlights && rb.stage == ResponseBuilder.STAGE_GET_FIELDS) { Map.Entry<String, Object>[] arr = new

Re: Nutch/Solr

2010-09-07 Thread Yavuz Selim YILMAZ
In fact, I used Nutch version 0.9, but I am thinking of moving to the new version. If anybody has done something like that, I want to learn from their experience. When indexing an XML file, there are specific fields and all of them are dependent on each other, so duplicates don't happen. I want to extract specific

Re: Nutch/Solr

2010-09-07 Thread Markus Jelsma
You should: - definitely upgrade to 1.1 (1.2 is on the way), and - subscribe to the Nutch mailing list for Nutch-specific questions. On Tuesday 07 September 2010 10:36:58 Yavuz Selim YILMAZ wrote: In fact, I used nutch 0.9 version, but thinking of passing the new version. If anybody did

Query result ranking - Score independent

2010-09-07 Thread Alessandro Benedetti
Hi all, I need to retrieve query-results with a ranking independent from each query-result's default lucene score, which means assigning the same score to each query result. I tried to use a zero boost factor ( ^0 ) to reset to zero each query-result's score. This strategy seems to work within the

Re: Alphanumeric wildcard search problem

2010-09-07 Thread Erick Erickson
Thanks for letting us know. What was the magic? I'm still unclear what was different between my tests and your implementation; mysteries like this make me nervous <G>. Thanks Erick On Mon, Sep 6, 2010 at 5:45 PM, Hasnain hasn...@hotmail.com wrote: Finally got it working, thanks for your help

Re: Expanded Synonyms + phrase search

2010-09-07 Thread Jak Akdemir
Did you check the ../admin/analysis.jsp page to see how the index and query analyzers behaved? Usually, when you add parti socialiste to synonyms-fr.txt, it responds correctly to both the PS et and parti socialiste queries. On Mon, Aug 30, 2010 at 4:55 PM, Xavier Schepler

Re: Null pointer exception when mixing highlighter & shards & q.alt

2010-09-07 Thread Ron Mayer
Marc Sturlese wrote: I noticed that long ago. Fixed it doing in HighlightComponent finishStage: ... public void finishStage(ResponseBuilder rb) { ... } Thanks! I'll try that I also seem to have a similar problem with shards + facets -- in particular it seems like the error

Re: Implementing synonym NewBie

2010-09-07 Thread Jak Akdemir
If you plan to improve your synonyms file over time, I would recommend query-time synonym expansion. That way you don't have to re-index when you need to add something more. On Sat, Aug 28, 2010 at 10:01 AM, Jonty Rhods jonty.rh...@gmail.com wrote: Hi All, I want to use synonym for my search.

How to extend IndexSchema and SchemaField

2010-09-07 Thread Renaud Delbru
Hi, I would like to extend the field node in the schema.xml by adding new attributes. For example, I would like to be able to write: <field type="myField" myattribute="myvalue"/> and be able to access myattribute directly from IndexSchema and SchemaField objects. However, these two classes are

Advice requested. How to map 1:M or M:M relationships with support for facets

2010-09-07 Thread Tim Gilbert
Hi guys, Question: What is the best way to create a solr schema which supports a 'multivalue' where the value is a two item array of event category and a date. I want to have faceted searches, counts and Date Range ability on both the category and the dates. Details: This is a

Re: How to give path in SCRIPT tag?

2010-09-07 Thread Simon Willnauer
ankita, your question seems to be somewhat unrelated to Solr / Lucene and should be asked somewhere else, not on this list. Please try to keep the focus of your questions on Solr-related topics, or use java-user@ for Lucene-related topics. Thanks, Simon On Tue, Sep 7, 2010 at 3:46 PM,

RE: solr user

2010-09-07 Thread Dave Searle
You probably need to use the file:// moniker - if using firefox, install firebug and use the net panel to see if the includes load -Original Message- From: ankita shinde [mailto:ankitashinde...@gmail.com] Sent: 07 September 2010 18:22 To: solr-user@lucene.apache.org Subject: solr user

Re: Query result ranking - Score independent

2010-09-07 Thread Grant Ingersoll
On Sep 7, 2010, at 7:08 AM, Alessandro Benedetti wrote: Hi all, I need to retrieve query-results with a ranking independent from each query-result's default lucene score, which means assigning the same score to each query result. I tried to use a zero boost factor ( ^0 ) to reset to zero

Re: SolrCloud distributed indexing (Re: anyone use hadoop+solr?)

2010-09-07 Thread MitchK
What if we do not care about the version of a document at index time? When it comes to distributed search, we currently aggregate documents based on their uniqueKey. But what if we additionally decided on uniqueKey plus indexingDate, so that we only aggregate the last
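The proposal in this post (keep, per uniqueKey, only the copy with the most recent indexingDate when merging shard responses) can be sketched in a few lines. This is only an illustration of the idea and not how Solr merges shard results; the function name merge_latest, the field names id and indexingDate, and the use of ISO-8601 date strings (which sort correctly as plain strings) are all assumptions made for the example.

```python
def merge_latest(shard_results, key_field="id", date_field="indexingDate"):
    """Merge per-shard document lists, keeping the newest copy of each key.

    Hypothetical helper: field names and the ISO-8601 string dates are
    assumptions for illustration, not part of any Solr API.
    """
    latest = {}
    for docs in shard_results:
        for doc in docs:
            current = latest.get(doc[key_field])
            # Keep the document whose indexing date is the most recent.
            if current is None or doc[date_field] > current[date_field]:
                latest[doc[key_field]] = doc
    return list(latest.values())


shard_a = [{"id": "1", "indexingDate": "2010-09-01T00:00:00Z", "title": "old"}]
shard_b = [{"id": "1", "indexingDate": "2010-09-07T00:00:00Z", "title": "new"},
           {"id": "2", "indexingDate": "2010-09-05T00:00:00Z", "title": "other"}]
merged = merge_latest([shard_a, shard_b])
```

With this approach an outdated copy on another shard is simply ignored at query time, at the cost of fetching every copy before discarding the older ones.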

Re: SolrCloud distributed indexing (Re: anyone use hadoop+solr?)

2010-09-07 Thread MitchK
I must add something to my last post: When saying it could be used together with techniques like consistent hashing, I mean it could be used at indexing time for indexing documents, since I assumed that the number of shards does not change frequently and therefore an ODV-case becomes relatively

Re: Search Results optimization

2010-09-07 Thread Chris Hostetter
: also my request handler looks like this : : <requestHandler name="mb_artists" class="solr.SearchHandler"> : <lst name="defaults"> : <str name="defType">dismax</str> : <str name="qf">name ^2.4</str> : <str name="tie">0.1</str> : </lst> : </requestHandler> that request handler doesn't match up with the output you posted in

Is there a way to fetch the complete list of data from a particular column in SOLR document?

2010-09-07 Thread bbarani
Hi, I am trying to get the complete list of unique document IDs and compare it with that of the back end to make sure that the back end and the SOLR documents are in sync. Is there a way to fetch the complete list of data from a particular column in a SOLR document? Once I get the list, I can easily compare
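Once both ID lists are available, the comparison described in this question amounts to two set differences. The sketch below is a minimal illustration; the function name diff_ids and the result keys are hypothetical, and it assumes both ID lists fit in memory.

```python
def diff_ids(backend_ids, solr_ids):
    """Report IDs present on only one side (hypothetical helper)."""
    backend = set(backend_ids)
    solr = set(solr_ids)
    return {
        # In the back end but never indexed (or deleted from the index).
        "missing_in_solr": sorted(backend - solr),
        # Still in the index but no longer in the back end.
        "stale_in_solr": sorted(solr - backend),
    }


report = diff_ids(["a", "b", "c"], ["b", "c", "d"])
```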

Re: shingles work in analyzer but not real data

2010-09-07 Thread Chris Hostetter
: Hi Robert, thanks for the response. I've looked into the query parsers a : bit and I did find that using the raw parser on a matching multi-word : keyword works correctly. I need to have shingling though, in order to : support query phrases. It seems odd to have the query parser emitting

RE: Is there a way to fetch the complete list of data from a particular column in SOLR document?

2010-09-07 Thread Markus Jelsma
q=*:*&fl=id_FIELD&rows=NUM_DOCS ? -Original message- From: bbarani bbar...@gmail.com Sent: Tue 07-09-2010 23:09 To: solr-user@lucene.apache.org; Subject: Is there a way to fetch the complete list of data from a particular column in SOLR document? Hi, I am trying to get complete list
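The query suggested in this reply uses three request parameters: q (match everything), fl (return only the ID field), and rows (the number of documents to return). A minimal sketch of assembling such a URL follows; the base URL, the field name id, and the row count are assumptions for illustration, and the helper name build_id_dump_url is hypothetical.

```python
from urllib.parse import urlencode


def build_id_dump_url(base_url, id_field, num_docs):
    """Build a select URL that returns only the ID field of every document.

    base_url is an assumption; substitute the select handler of your own
    Solr instance.
    """
    params = {"q": "*:*", "fl": id_field, "rows": num_docs}
    # urlencode percent-escapes the '*' and ':' characters in the query.
    return base_url + "?" + urlencode(params)


url = build_id_dump_url("http://localhost:8983/solr/select", "id", 1000)
```

For a large index, requesting every document in one response may be slow or memory-hungry, so paging through the results in smaller chunks may be preferable.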

Re: FieldCache.DEFAULT.getInts vs FieldCache.DEFAULT.getStringIndex. Memory usage

2010-09-07 Thread Chris Hostetter
: I need to load a FieldCache for a field which is a solr integer type and has : as maximum 3 digits. Let's say my index has 10M docs. : I am wondering what is more optimal and less memory consuming, to load a : FieldCache.DEFAULT.getInts or a FieldCache.DEFAULT.getStringIndex. by itself, getInts

Re: Is there a way to fetch the complete list of data from a particular column in SOLR document?

2010-09-07 Thread Geert-Jan Brits
Please let me know if there are any other ideas / suggestions to implement this. Your indexing program should really take care of this IMHO. Each time your indexer inserts a document into Solr, flag the corresponding entity in your RDBMS; each time you delete, remove the flag. You should

Re: Download document from solr

2010-09-07 Thread Chris Hostetter
: Subject: Download document from solr : References: aanlkti=ajq4qpifn2r0dyz=s9hv1i=pc-nqnxp3hw...@mail.gmail.com : In-Reply-To: aanlkti=ajq4qpifn2r0dyz=s9hv1i=pc-nqnxp3hw...@mail.gmail.com http://people.apache.org/~hossman/#threadhijack Thread Hijacking on Mailing Lists When starting a new

Re: MoreLikethis and fq not giving exact results ?

2010-09-07 Thread Chris Hostetter
: But when I enable mlt inside the query it returns the results for jp_ as : well, because job_title also exist in job posting ( though jp_ or cp_ : already differentiating to both of this ?) I don't believe the MLT Component has any way of filtering like this. In your case you want the fq

Re: Deploying Solr 1.4.1 in JbossAs 6

2010-09-07 Thread Chris Hostetter
: 1-extract the solr.war : 2-edit the web.xml for setting solr/home param : 3-create the solr.war : 4-setup solr home directory : 5-copy the solr.war to JBossAs 6 deploy directory : 7-start the jboss server I don't know a lot about JBoss, but from what i understand there really shouldn't be any

Re: Solr, c/s type ?

2010-09-07 Thread Chris Hostetter
: Subject: Solr, c/s type ? : : i'm wondering c/s type is possible (not http web type). : if possible, could i get the material about it? You're going to need to provide more info explaining what it is you are asking about -- i don't know about anyone else, but i honestly have absolutely no

RE: Re: MoreLikethis and fq not giving exact results ?

2010-09-07 Thread Markus Jelsma
I can think of two useful cases for a feature that limits MLT results depending with an optional mlt.fq parameter that limits the MLT results for each document, based on that fq:   1. prevent irrelevant docs when in a deep faceted navigation 2. general search results with MLT where you need

Re: Is semicolon a character that needs escaping?

2010-09-07 Thread Chris Hostetter
: Subject: Is semicolon a character that needs escaping? ... : From this I conclude that there is a bug either in the docs or in the : query parser or I missed something. What is wrong here? Back in Solr 1.1, the standard query parser treated ; as a special character and looked for

RE: Re: MoreLikethis and fq not giving exact results ?

2010-09-07 Thread Chris Hostetter
: I can think of two useful cases for a feature that limits MLT results : depending with an optional mlt.fq parameter that limits the MLT results : for each document, based on that fq: i don't disagree with you -- i was just commenting that it doesn't work that way at the moment, because it

RE: Re: MoreLikethis and fq not giving exact results ?

2010-09-07 Thread Markus Jelsma
I know =)   I was just polling votes for a feature request - there is no such issue filed for this component. Perhaps there should be?   -Original message- From: Chris Hostetter hossman_luc...@fucit.org Sent: Wed 08-09-2010 00:13 To: solr-user@lucene.apache.org; Subject: RE: Re:

Re: stream.url

2010-09-07 Thread Chris Hostetter
: I used escape characters and made it... It is not problem for : a single file of 'solr apache' but it shows the same problem for the files : like Wireless lan.ppt, Tom info.pdf. Since you haven't told us what the original URL is that you are trying to pass as a value for the

Help with partial term highlighting

2010-09-07 Thread Jed Glazner
Hello Everyone, Thanks for taking time to read through this. I'm using a checkout from the solr 3.x branch My problem is with the highlighter and wildcards, and is exactly the same as this guy's but I can't find a reply to his problem:

Null Pointer Exception with shards&facets where some shards have no values for some facets.

2010-09-07 Thread Ron Mayer
Short summary: * Mixing Facets and Shards gives me a NullPointerException when not all docs have all facets. * Attached patch improves the failure mode, but still spews errors in the log file * Suggestions how to fix that would be appreciated. In my system, I tried separating out a

Re: Null Pointer Exception with shards&facets where some shards have no values for some facets.

2010-09-07 Thread Yonik Seeley
Thanks for the report Ron, can you open a JIRA issue? What version of Solr is this? -Yonik http://lucenerevolution.org Lucene/Solr Conference, Boston Oct 7-8 On Tue, Sep 7, 2010 at 8:31 PM, Ron Mayer r...@0ape.com wrote: Short summary:  * Mixing Facets and Shards give me a

Re: Null Pointer Exception with shards&facets where some shards have no values for some facets.

2010-09-07 Thread Ron Mayer
Yonik Seeley wrote: Thanks for the report Ron, can you open a JIRA issue? Sure. I'll do it at work tomorrow morning, hopefully after I try to verify with a standalone test case. What version of Solr is this? This is trunk as of a few days ago. I can update to the latest trunk and check

How to use TermsComponent when I need a filter

2010-09-07 Thread David Yang
Hi, I have a solr index, which for simplicity is just a list of names, and a list of associations. (either a multivalue field e.g. {A1, A2, A3, A6} or a string concatenation list e.g. A1 A2 A3 A6) I want to be able to provide autocomplete but with a specific association. E.g. Names beginning

Batch update, order of evaluation

2010-09-07 Thread Greg Pendlebury
Does anyone know with certainty how (or even if) order is evaluated when updates are performed by batch? Our application internally buffers solr documents for speed of ingest before sending them to the server in chunks. The XML documents sent to the solr server contain all documents in the order

list of filters/factories/Input handlers/blah blah

2010-09-07 Thread Dennis Gearon
Is there a definitive list of filters, inputHandlers, and other 'code fragments' that do I/O processing for Solr/Lucene? Dennis Gearon

Re: Advice requested. How to map 1:M or M:M relationships with support for facets

2010-09-07 Thread Lance Norskog
These days the best practice for a 'drill-down' facet in a UI is to encode both the unique value of the facet and the displayable string into one facet value. In the UI, you unpack and show the display string, and search with the full facet string. If you want to also do date ranges, make a
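The encoding described in this reply, one facet value carrying both the unique key and the display string, can be sketched as follows. The separator character and the helper names encode_facet and decode_facet are assumptions for illustration; any separator works as long as it cannot occur in the key.

```python
SEPARATOR = "|"  # hypothetical choice; must never appear in the unique key


def encode_facet(key, label):
    """Pack the unique key and the display label into one facet value."""
    if SEPARATOR in key:
        raise ValueError("separator must not occur in the key")
    return key + SEPARATOR + label


def decode_facet(value):
    """Unpack a facet value into (key, label) for display in the UI."""
    # Split only on the first separator so labels may contain it.
    key, label = value.split(SEPARATOR, 1)
    return key, label


facet_value = encode_facet("42", "Music | Festivals")
key, label = decode_facet(facet_value)
```

The UI shows label to the user and sends the whole facet_value back as the filter, so the index needs only the single combined field.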

Re: Deploying Solr 1.4.1 in JbossAs 6

2010-09-07 Thread Lance Norskog
Does JBoss still use Tomcat? Tomcat has an external file to configure war files in Catalina/localhost. If JBoss is not Tomcat any more, it must have a directory and file format somewhere for an external configuration of a servlet war. Lance Chris Hostetter wrote: : 1-extract the solr.war

RE: list of filters/factories/Input handlers/blah blah

2010-09-07 Thread Jonathan Rochkind
Not necessarily definitive, but filters and tokenizers can be found here: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters Not sure if that's all of the analyzers (which I think is the generic name for both tokenizers and filters) that come with Solr, but I believe it's at least