Re: SolrServer instances

2011-08-25 Thread Jonty Rhods
Am I also required to close the connection to the Solr server (CommonsHttpSolrServer)? Regards On Fri, Aug 26, 2011 at 9:45 AM, Jonty Rhods wrote: > Dear all, please help. I am stuck here as I do not have much experience. > > Thanks > > On Thu, Aug 25, 2011 at 6:51 PM, Jonty Rhods wrote: > >> Hi All,

FileDataSource baseDir to be solr.data.dir

2011-08-25 Thread abhayd
Hi, we have multiple environments where we run Solr and use DIH to index a promotions.xml file. Here is a snippet from the DIH config file http://lucene.472066.n3.nabble.com/FileDataSource-baseDir-to-be-solr-data-dir-tp3285872p3285872.html Sent from the Solr - User mailing list archive at Nabble.

Re: SolrServer instances

2011-08-25 Thread Jonty Rhods
Dear all, please help. I am stuck here as I do not have much experience. Thanks On Thu, Aug 25, 2011 at 6:51 PM, Jonty Rhods wrote: > Hi All, > > I am using SolrJ (3.1) and Tomcat 6.x. I want to open solr server once (20 > concurrence) and reuse this across all the site. Or something like > connect

making a Solr search that returns documents where every field in the document is matched

2011-08-25 Thread Henry Ho
Hi everybody, I've just started using Solr. Not sure if this is too specific a problem, but here goes. My situation is that I have a fairly long query and I'm searching on very short documents. The basic issue is I want this query to return documents where every word/token in the document is matc

missing field in schema browser on solr admin

2011-08-25 Thread deniz
Hi all, I have added a new field to the index, but now when I check Solr admin, I see something odd: I can see the field in the schema and also in the DB config file, but there is nothing about the field in the schema browser. In addition, I can't search on that field. All of the config files

Re: Paging over mutlivalued field results?

2011-08-25 Thread Darren Govoni
Hi Erick, Sure thing. I have a document schema where I put the sentences of that document in a multivalued field "sentences". I search that field in a query but get back the document results, naturally. I then need to further find which exact sentences matched the query (for each document

Re: Is it possible to do a partial update of a doc's fields in the index?

2011-08-25 Thread Goran Pocina
Got an answer from the excellent support folks at LucidWorks: Currently Lucene/Solr can't do field-level updating. So whenever a new document is indexed, if it has the same unique identifier (in this case "id") field, then the new document replaces the older document. There is an open JIRA issue

RE: how to deal with URLDatasource which needs authorization?

2011-08-25 Thread deniz
Well, let me explain the problem in detail... I have a website www.blablabla.com on which users can have profiles, with any kind of information. Each user has an id, something like user_xyz. So www.blablabla.com/user_xyz shows the user profile, and www.blablabla.com/solr/index/user_xyz show

Re: Query parameter changes from solr 1.4 to 3.3

2011-08-25 Thread Yonik Seeley
On Tue, Aug 23, 2011 at 7:11 AM, Samarendra Pratap wrote: >  We are upgrading solr 1.4 (with collapsing patch solr-236) to solr 3.3. I > was looking for the required changes in query parameters (or parameter > names) if any. There should be very few (but check CHANGES.txt as Erick pointed out). W

Re: Highlight on alternateField

2011-08-25 Thread Koji Sekiguchi
(11/08/26 2:32), Val Minyaylo wrote: Hi there, I am trying to utilize highlighting alternateField and can't get highlights on the results from targeted fields. Is this expected behavior or am I understanding alternateFields wrong? Yes, it is expected behavior. solrconfig.xml: description_hi

Re: Paging over mutlivalued field results?

2011-08-25 Thread Erick Erickson
Hmm, I don't quite understand what you want. An example or two would help. Best Erick On Thu, Aug 25, 2011 at 12:11 PM, Darren Govoni wrote: > Hi, >  Is it possible to construct a query in Solr where the paged results are > matching multivalued fields and not documents? > > thanks, > Darren >

Re: Solr indexing process: keep a persistent Mysql connection throu all the indexing process

2011-08-25 Thread Erick Erickson
Put it anywhere you want. Here's a good place to start: http://www.javapractices.com/topic/TopicAction.do?Id=46 where the distributePresents method is the one you have that returns the connection. Here's a sample class that doesn't do much... public enum MyEnum { INSTANCE; private String _t
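The enum singleton idea above can be sketched as follows. This is only a minimal illustration, not the class from the original message (which is cut off): `FakeConnection` and `ConnectionHolder` are hypothetical names, and `FakeConnection` stands in for whatever expensive resource (for example a JDBC connection) you want to open once and reuse.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an expensive resource such as a JDBC connection.
final class FakeConnection {
    // Counts how many times a connection has been opened.
    static final AtomicInteger OPENED = new AtomicInteger();
    final int id = OPENED.incrementAndGet();
}

// Enum-based singleton: the JVM creates exactly one INSTANCE per class loader.
enum ConnectionHolder {
    INSTANCE;

    private FakeConnection connection; // created on first use, then reused

    // Returns the shared connection, opening it only on the first call.
    synchronized FakeConnection getConnection() {
        if (connection == null) {
            connection = new FakeConnection();
        }
        return connection;
    }
}
```

Callers would use `ConnectionHolder.INSTANCE.getConnection()` wherever the connection is needed. A real version would probably also have to detect a dropped connection and reopen it, which this sketch does not attempt.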

Is it possible to do a partial update of a doc's fields in the index?

2011-08-25 Thread Goran Pocina
New to Solr and Lucene. We're indexing text, pdf, html docs located on local Unix file systems, and need the ability to search for file owner, group, and other Linux file metadata, in addition to the file contents. It would be great if we could use nutch to index everything, and then crawl throu

Re: Query vs Filter Query Usage

2011-08-25 Thread Lance Norskog
The point of filter queries is that they are applied very early in the searching algorithm, and thus cut the amount of work later on. Some complex queries take a lot of time and so this pre-trimming helps a lot. On Thu, Aug 25, 2011 at 2:37 PM, Yonik Seeley wrote: > On Thu, Aug 25, 2011 at 5:19 P

solr UIMA exception

2011-08-25 Thread chanhangfai
Hi, I have followed this solr UIMA config, using AlchemyAPIAnnotator and OpenCalaisAnnotator. https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/contrib/uima/README.txt http://wiki.apache.org/solr/SolrUIMA so, I got the AlchemyAPI key and OpenCalais key. and I can successfully hit http://a

Re: Query vs Filter Query Usage

2011-08-25 Thread Yonik Seeley
On Thu, Aug 25, 2011 at 5:19 PM, Michael Ryan wrote: >> 10,000,000 document index >> Internal Document id is 32 bit unsigned int >> Max Memory Used by a single cache slot in the filter cache = 32 bits x >> 10,000,000 docs = 320,000,000 bits or 38 MB > > I think it depends on where exactly the resu

RE: Query vs Filter Query Usage

2011-08-25 Thread Michael Ryan
> 10,000,000 document index > Internal Document id is 32 bit unsigned int > Max Memory Used by a single cache slot in the filter cache = 32 bits x > 10,000,000 docs = 320,000,000 bits or 38 MB I think it depends on where exactly the result set was generated. I believe the result set will usually
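The arithmetic quoted above can be checked with a few lines of code. This is a sketch under stated assumptions: `FilterCacheMath` is a hypothetical helper, the first method assumes every matching document id is stored as a 32-bit int (as in the quoted estimate), and the second assumes a bitset with one bit per document in the index. Which representation actually applies depends on where the result set was generated, as the reply says.

```java
final class FilterCacheMath {
    static final long BYTES_PER_MIB = 1024L * 1024L;

    // Size in bytes if each matching doc id is stored as a 32-bit int.
    static long intArrayBytes(long matchingDocs) {
        return matchingDocs * 4L;
    }

    // Size in bytes if the set is a bitset with one bit per document in the index.
    static long bitSetBytes(long maxDoc) {
        return (maxDoc + 7) / 8;
    }

    // Converts a byte count to mebibytes.
    static double toMiB(long bytes) {
        return (double) bytes / BYTES_PER_MIB;
    }

    public static void main(String[] args) {
        long docs = 10_000_000L;
        System.out.println("int ids: " + intArrayBytes(docs) + " bytes, "
                + toMiB(intArrayBytes(docs)) + " MiB");
        System.out.println("bitset:  " + bitSetBytes(docs) + " bytes, "
                + toMiB(bitSetBytes(docs)) + " MiB");
    }
}
```

For 10,000,000 documents the int-array case is 40,000,000 bytes, which matches the roughly 38 MB figure in the quoted estimate; the bitset case is 1,250,000 bytes.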

RE: Solr in a windows shared hosting environment

2011-08-25 Thread Jaeger, Jay - DOT
You visit the Sun (oops, I mean Oracle -- old habits die hard) web site, download it and install it, or, being shared, you ask your provider to do it for you. They might decline, of course, in which case you host another web server using a hosting provider who does support Java (or, at least ano

Solr Implementations

2011-08-25 Thread zarni aung
First, I would like to apologize if this is a repeat question, but I can't seem to find the right answer anywhere. - What happens to pending documents when the server dies abruptly? I understand that when the server shuts down gracefully, it will commit the pending documents and close the In

Re: Newbie Question, can I store structured sub elements?

2011-08-25 Thread Paul Libbrecht
Whether multi-valued or token-streams, the question is search, not (de)serialization: that's opaque to Solr which will take and give it to you as needed. paul On 25 August 2011 at 21:24, Zac Tolley wrote: > My search is very simple, mainly on titles, actors, show times and channels. > Having

Re: Query vs Filter Query Usage

2011-08-25 Thread Joshua Harness
Erick - Thanks for the insight. The filter cache just caches the internal document ids of the result set (as opposed to the documents), correct? If so, am I correct in the following math: 10,000,000 document index Internal Document id is 32 bit unsigned int Max Memory Used by a single cache s

Re: Newbie Question, can I store structured sub elements?

2011-08-25 Thread Zac Tolley
My search is very simple, mainly on titles, actors, show times and channels. Having multiple lists of values is probably better for that, and as the order is kept the same it's relatively simple to map the response back onto pojos for my presentation layer. On Thu, Aug 25, 2011 at 8:18 PM, Paul Lib

Re: Solr in a windows shared hosting environment

2011-08-25 Thread simon
That's not a question we can answer in this group - you need to take it up with your hosting provider - they may already have it available. On Thu, Aug 25, 2011 at 2:59 PM, Devora wrote: > Thank you! > > Since it's shared hosting, how do I install java? > > -Original Message- > From: Jae

Re: Newbie Question, can I store structured sub elements?

2011-08-25 Thread Paul Libbrecht
Delimited text is the baby form of lists. Text can be made very very structured (think XML, ontologies...). I think the crux is your search needs. For example, with Lucene, I made a search for formulæ (including sub-terms) by converting the OpenMath-encoded terms into rows of tokens and querying

Re: Newbie Question, can I store structured sub elements?

2011-08-25 Thread Zac Tolley
I have come to that conclusion, so I had to choose between multiple fields with multiple values or a field with delimited text; I've gone for the former. On Thu, Aug 25, 2011 at 7:58 PM, Erick Erickson wrote: > nope, it's not easy. Solr docs are flat, flat, flat with the tiny > exception that multiValued fi

Re: Preserve XML hierarchy

2011-08-25 Thread Erick Erickson
Jars aren't where it's at. You apply patches to *source* code, then compile. Here's a good place to start understanding this process: http://wiki.apache.org/solr/HowToContribute See "getting the code" and "working with patches" I *strongly* advise you to get the code and compile it and run it f

RE: Solr in a windows shared hosting environment

2011-08-25 Thread Devora
Thank you! Since it's shared hosting, how do I install java? -Original Message- From: Jaeger, Jay - DOT [mailto:jay.jae...@dot.wi.gov] Sent: Thursday, August 25, 2011 4:34 PM To: solr-user@lucene.apache.org Subject: RE: Solr in a windows shared hosting environment Yes, but since Solr is

Re: Newbie Question, can I store structured sub elements?

2011-08-25 Thread Erick Erickson
nope, it's not easy. Solr docs are flat, flat, flat with the tiny exception that multiValued fields are returned as lists. However, you can count on multi-valued fields being returned in the order they were added, so it might work out for you to treat these as parallel arrays in Solr documents.
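The parallel-arrays idea can be sketched on the client side like this. The sketch assumes two multi-valued fields whose values were added in the same order; `Showing`, `ParallelFields`, and the channel/start-time fields are hypothetical names chosen to match the use case in this thread, not anything Solr provides.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical POJO pairing one value from each multi-valued field.
final class Showing {
    final String channel;
    final String startTime;

    Showing(String channel, String startTime) {
        this.channel = channel;
        this.startTime = startTime;
    }
}

final class ParallelFields {
    // Rebuilds structured entries from two lists that share the same ordering.
    static List<Showing> zip(List<String> channels, List<String> startTimes) {
        if (channels.size() != startTimes.size()) {
            throw new IllegalArgumentException("parallel fields must have equal length");
        }
        List<Showing> showings = new ArrayList<Showing>(channels.size());
        for (int i = 0; i < channels.size(); i++) {
            showings.add(new Showing(channels.get(i), startTimes.get(i)));
        }
        return showings;
    }
}
```

The length check matters: if one field is ever indexed with a missing value, every later pair would silently shift, so failing fast is the safer choice here.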

Re: Query vs Filter Query Usage

2011-08-25 Thread Erick Erickson
The pitfall of filter queries is also their strength. The results will be cached and re-used if possible. This will take some memory, of course. Depending upon how big your index is, this could be quite a lot. Yet another time/space tradeoff. But yeah, use filter queries until you have OOMs, t

Re: Field type change / copy field

2011-08-25 Thread Erick Erickson
What version of Solr are you using? Because 3.2 (and I believe 3.1) and later have faceting and range on numeric values, so there would be no need to use two fields. And you could then avoid the format thing entirely. Best Erick On Wed, Aug 24, 2011 at 5:53 AM, Oliver Schihin wrote: > Hello lis

RE: Best way to anchor solr searches?

2011-08-25 Thread arian487
Thanks for the replies. I did look at caching but our commit time is 90 seconds. It's definitely possible for someone to make a search, change the page, and have wonky results. How about getting it to autowarm the x most recent searches in the queryResultCache and that can hopefully reduce

RE: Text Analysis and copyField

2011-08-25 Thread Herman Kiefus
It had crossed my mind but for now we have a 'DictionarySource' field whose type utilizes the KeepWordFilterFactory that uses a text file containing all correctly spelled words (thanks to scrabble), location/last/first names (courtesy of the US census bureau) and a few other adds (month/day) nam

Highlight on alternateField

2011-08-25 Thread Val Minyaylo
Hi there, I am trying to utilize highlighting alternateField and can't get highlights on the results from targeted fields. Is this expected behavior or am I understanding alternateFields wrong? schema.xml: multiValued="true" termVectors="true" termPositions="true" termOffsets="true"/> stored

Paging over mutlivalued field results?

2011-08-25 Thread Darren Govoni
Hi, Is it possible to construct a query in Solr where the paged results are matching multivalued fields and not documents? thanks, Darren

Re: csv responsewriter and numfound

2011-08-25 Thread Jon Hoffman
I've added a case with patch here: https://issues.apache.org/jira/browse/SOLR-2731 On Wed, Aug 24, 2011 at 11:59 AM, Jon Hoffman wrote: > I took a look at the source and agree that it would be a bit hairy to > bubble up header settings from the response writers. > > Alternatively, and I'll admit

Re: not equals query in solr

2011-08-25 Thread simon
http://wiki.apache.org/solr/SolrQuerySyntax has answers for you. -Simon On Thu, Aug 25, 2011 at 1:04 AM, Ranveer Kumar wrote: > any help... > > On Wed, Aug 24, 2011 at 12:58 PM, Ranveer Kumar >wrote: > > > Hi, > > > > is it right way to do : > > q=(state:[* TO *] AND city:[* TO *]) > > > > rega

Re: Where the heck do you put maxAnalyzedChars?

2011-08-25 Thread Daniel Skiles
Thanks. That seemed to do it. I was thrown by the section of documentation that said "This parameter makes sense for Highlighter only" and tried to put it in the various highlighter elements. On Wed, Aug 24, 2011 at 6:52 PM, Koji Sekiguchi wrote: > (11/08/25 5:29), Daniel Skiles wrote: > >> I

Re: Solr indexing process: keep a persistent Mysql connection throu all the indexing process

2011-08-25 Thread samuele.mattiuzzo
Since I'm fairly new to Solr, can you please give some guidelines or provide an example I can look at for starters? I already thought about a singleton implementation, but I'm not sure where I have to put it and how I should start coding it.

Re: Return only the fields where there are results

2011-08-25 Thread Erick Erickson
"So, there are 2 documents with results, but Solr show me fields without the word "alquileres", why?" Because that's the way Solr is built. It wouldn't be very useful for Solr to only return the fields with matches. Consider indexing articles with a title and contents. If you search matched on co

Re: Spatial Search problems

2011-08-25 Thread Smiley, David W.
Um, you might try googling LocalSolr or LocalLucene -- dead projects but you insist on using an old Solr/Lucene. Of course if all you need is a bounding box filter then a pair of lat & lon range queries is sufficient. ~ David Smiley On Aug 25, 2011, at 4:01 AM, Javier Heras wrote: > Thanx Dav
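The bounding box suggestion can be sketched as a small helper that builds the pair of range clauses. This is only an illustration: `BoundingBoxFilter` is a hypothetical class, and the field names passed in (for example `lat` and `lon`) are assumptions about your schema, which would need numeric fields that support range queries.

```java
import java.util.Locale;

final class BoundingBoxFilter {
    // Builds a query string restricting documents to a lat/lon rectangle.
    // Does not handle boxes that cross the 180th meridian.
    static String build(String latField, String lonField,
                        double minLat, double maxLat,
                        double minLon, double maxLon) {
        if (minLat > maxLat || minLon > maxLon) {
            throw new IllegalArgumentException("min must not exceed max");
        }
        return String.format(Locale.ROOT, "%s:[%s TO %s] AND %s:[%s TO %s]",
                latField, minLat, maxLat, lonField, minLon, maxLon);
    }
}
```

The resulting string could be sent as a filter query (the `fq` parameter) so that the box restriction is cached separately from the main query.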

RE: How to copy and extract information from a multi-line text before the tokenizer

2011-08-25 Thread Jaeger, Jay - DOT
"A programmer had a problem. He tried to solve it with regular expressions. Now he has two problems" :). A. That just isn't fair... 8^) (I can't think of very many things that have allowed me to perform more magic over my career than regular expressions, starting with SNOBOL. Uh oh: I ju

RE: Solr in a windows shared hosting environment

2011-08-25 Thread Jaeger, Jay - DOT
Yes, but since Solr is written in Java to run in a JEE container, you would host Solr in a web application server, either Jetty (which comes packaged), or something else (say, Tomcat or WebSphere or something like that). As a result, you aren't going to find anything that says how to run Solr un

Re: Problem using stop words

2011-08-25 Thread Erick Erickson
Hmmm, I'd expect you to have an error in your log file if you haven't removed the default field type named "string". If you have removed it from your schema, this should work... But I'd change the name anyway, it'll cause endless confusion. And don't forget to re-index *everything*, preferably af

SolrServer instances

2011-08-25 Thread Jonty Rhods
Hi All, I am using SolrJ (3.1) and Tomcat 6.x. I want to open the Solr server connection once (concurrency of about 20) and reuse it across the whole site, or use something like the connection pool we use for the DB (i.e. Apache DBCP). Using a static method is one way, but I want a better solution from you pe

Re: Solr indexing process: keep a persistent Mysql connection throu all the indexing process

2011-08-25 Thread Erick Erickson
Yes, but you can always employ a singleton to open and maintain a DB connection. Best Erick On Tue, Aug 23, 2011 at 9:16 PM, samuele.mattiuzzo wrote: > those documents are unrelated to the database. the db i have is just storing > countries - region - cities, and it's used to do a refinement on

RE: Newbie question, ant target for packaging source files from local copy?

2011-08-25 Thread Steven A Rowe
Hi sid, The current source packaging scheme aims to *avoid* including local changes :), so yes, there is no support currently for what you want to do. Prior to , the source packaging scheme used the current sources rather than pulling from Subv

RE: Best way to anchor solr searches?

2011-08-25 Thread Jaeger, Jay - DOT
I don't think it has to be quite so bleak as that, depending upon the number of queries done over a given timeframe, and the size of the result sets. Solr does cache the identifiers of "documents" returned by search results. See http://wiki.apache.org/solr/SolrCaching paying particular attent

Re: Grouping and performing statistics per group

2011-08-25 Thread Martijn v Groningen
If you take this query from the wiki: http://localhost:8983/solr/select?q=*:*&stats=true&stats.field=price&stats.field=popularity&stats.twopass=true&rows=0&indent=true&stats.facet=inStock In this case you get stats about the popularity per inStock value (true / false). Replacing these values with we
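A query string like the one above is easier to get right when it is assembled from name/value pairs and URL-encoded. The following is a minimal sketch; `StatsQueryBuilder` is a hypothetical helper, not part of SolrJ, and it only uses parameter names that appear in the URL quoted in this message.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;

final class StatsQueryBuilder {
    // Keeps pairs in insertion order and allows repeated names (e.g. stats.field).
    private final List<String> pairs = new ArrayList<String>();

    StatsQueryBuilder add(String name, String value) {
        pairs.add(encode(name) + "=" + encode(value));
        return this;
    }

    // Joins the encoded pairs with '&' into a query string.
    String build() {
        StringBuilder sb = new StringBuilder();
        for (String pair : pairs) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(pair);
        }
        return sb.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every JVM.
            throw new IllegalStateException(e);
        }
    }
}
```

For example, `new StatsQueryBuilder().add("q", "*:*").add("stats", "true").add("stats.field", "price").add("stats.facet", "inStock").build()` gives a string that can be appended after `http://localhost:8983/solr/select?`.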

Re: Spellcheck Phrases

2011-08-25 Thread Erick Erickson
Please start a new thread for this question, see: http://people.apache.org/~hossman/#threadhijack <<< When starting a new discussion on a mailing list, please do not reply to an existing message, instead start a fresh email. Even if you change the subject line of your email, other mail headers st

Re: Query parameter changes from solr 1.4 to 3.3

2011-08-25 Thread Erick Erickson
Have you looked at the CHANGES.txt file? That is supposed to be updated with all these kinds of notes... Best Erick On Tue, Aug 23, 2011 at 7:11 AM, Samarendra Pratap wrote: > Hi, >  We are upgrading solr 1.4 (with collapsing patch solr-236) to solr 3.3. I > was looking for the required changes

Re: How to copy and extract information from a multi-line text before the tokenizer

2011-08-25 Thread Erick Erickson
You could consider writing your own UpdateHandler. It allows you to get access to the underlying SolrInputDocument, and freely modify the fields before it even gets to the analysis chain defined in your schema. So you can get your "AllData" out of the doc, split it apart as many ways as you want

Re: Grouping and performing statistics per group

2011-08-25 Thread Omri Cohen
Hi, thanks for your reply.. it doesn't work. I am getting the plain stats results at the end of the response, but no statistics per group.. thanks anyway

Re: Grouping and performing statistics per group

2011-08-25 Thread Martijn v Groningen
Or if you don't care about grouped results you can also add the following option: stats.facet=gender On 25 August 2011 14:40, Martijn v Groningen wrote: > Hi Omri, > > I think you can achieve that with grouping and the Solr StatsComponent ( > http://wiki.apache.org/solr/StatsComponent). > In order

Re: can you help on this?

2011-08-25 Thread Shalin Shekhar Mangar
Can you please specify: 1. Solr version 2. Platform 3. JDK version 4. Under what conditions does this error happen? Is it reproducible? On Wed, Aug 24, 2011 at 6:53 PM, abhijit bashetti wrote: > SEVERE: java.lang.InternalError: a fault occurred in a recent unsafe memory > access o

Re: Grouping and performing statistics per group

2011-08-25 Thread Martijn v Groningen
Hi Omri, I think you can achieve that with grouping and the Solr StatsComponent ( http://wiki.apache.org/solr/StatsComponent). In order to compute statistics on groups you must set the option group.truncate=true An example query: q=*:*&group=true&group.field=gender&group.truncate=true&stats=true&s

Re: Upload doc and pdf in Solr 3.3.0

2011-08-25 Thread Moinsn
Thanks. Now it works.

Re: Best way to anchor solr searches?

2011-08-25 Thread lee carroll
I don't think Solr conforms to ACID-type behaviours for its queries. This is not to say your use case is not important, just that it's not Solr's focus. I think it's an interesting question, but the solution is probably going to involve rolling your own. Something like returning 1 user docs and cac

Re: Upload doc and pdf in Solr 3.3.0

2011-08-25 Thread Jayendra Patil
http://wiki.apache.org/solr/ExtractingRequestHandler may help. Regards, Jayendra On Thu, Aug 25, 2011 at 3:24 AM, Moinsn wrote: > Good Morning, > > I have to set up a Solr System to seek in documents like pdf and doc. My > Solr System is running in the meantime, but i cant find a tutorial that >

Solr in a windows shared hosting environment

2011-08-25 Thread Devora
Hi, Is it possible to install Solr in a windows (IIS 7 or IIS 7.5) shared hosting environment? If yes, where can I find instructions how to do that? Thank you!

Upload doc and pdf in Solr 3.3.0

2011-08-25 Thread Moinsn
Good morning, I have to set up a Solr system to search documents like PDF and DOC files. My Solr system is now running, but I can't find a tutorial that tells me what I have to do to put the files into the system. I hope you can help me do that in a simple way. And please excu

Re: Grouping and performing statistics per group

2011-08-25 Thread Omri Cohen
Thanks, but I actually need something more deterministic and more accurate. Does anyone know if there is an existing feature for that? Thanks again

Re: Grouping and performing statistics per group

2011-08-25 Thread Sowmya V.B.
Hi, Is it possible that the Luke handler can be used for this? I used something like: http://localhost:8080/solr/admin/luke?fl=fieldName&numTerms=1 to get an estimate of a range of values a field can have. Hope you find this information useful. Sowmya. On Thu, Aug 25, 2011 at 10:58 AM, Omri Coh

Grouping and performing statistics per group

2011-08-25 Thread Omri Cohen
Hi All, I want to group by a certain field and perform statistics per group on a certain field of my choice. For example, if I have the following documents in my collection: 12353 65 male 12353 63 male 12353 49 male Now I want to group by gender, and let's say for the sake of

Re: Spatial Search problems

2011-08-25 Thread Javier Heras
Thanks David, Just one more question. Am I able to do spatial search with Solr 1.4? And with Lucene 2.9? What's your recommendation? Javier

Re: Preserve XML hierarchy

2011-08-25 Thread _snake_
Hi Michael, Thanks for your help! I am using Apache Solr 3.2 on windows. I am trying to apply the 2 patches ( https://issues.apache.org/jira/browse/SOLR-2597?page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#issue-tabs XMLCharFilter Patch ), but I have no idea how to do that. W

Re: can i create filters of score range

2011-08-25 Thread jame vaalet
well when i said client.. i meant querying through solr.NET (in a way this can be seen as posting it through web browser url). so coming back to the issue .. even if am sorting it by _docid_ i need to do paging( 2 million docs in result) how is it internally doing it ? when sorted by docid, don we

Re: Newbie Question, can I store structured sub elements?

2011-08-25 Thread Zac Tolley
I know I can have multi value on them but that doesn't let me see that a showing instance happens at a particular time on a particular channel, just that it shows on a range of channels at a range of times Starting to think I will have to either store a formatted string that combines them or keep