Re: Solr cloud date based paritioning

2013-07-02 Thread kowish.adamosh
Otis, Is it implemented? I can find only unresolved JIRA issues. Kowish -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-cloud-date-based-paritioning-tp4074729p4074993.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: How to re-index Solr & get term frequency within documents

2013-07-02 Thread Tony Mullins
Hi Otis, I am quite new to Solr. I have looked at this link "http://search-lucene.com/jd/solr/solr-dataimporthandler/org/apache/solr/handler/dataimport/SolrEntityProcessor.html" but could not figure out how to use it to re-index all my data in Solr. Could you please explain in a little detail t

Extract file name (without extension) while indexing using Data Import Handler in Solr

2013-07-02 Thread archit2112
Im successfully able to index pdf,doc,ppt,etc files using the Data Import Handler in solr 4.3.0 . My data-config.xml looks like this -

Re: Aggregate TermFrequency on Result Grouping / Field Collapsing

2013-07-02 Thread Tony Mullins
Any suggestions please ! On Tue, Jul 2, 2013 at 3:24 PM, Tony Mullins wrote: > Hi, > > Is it possible to perform aggregated termfreq(field,term) on Result > Grouping ? > > I am trying to get total count of term's appearance in a document and then > want to aggregate that count by grouping the do

Re: Solr cloud date based paritioning

2013-07-02 Thread Otis Gospodnetic
Hi, Yes, search for keywords shard and route in ML archives and JIRA. Otis Solr & ElasticSearch Support http://sematext.com/ On Jul 2, 2013 4:12 PM, "kowish.adamosh" wrote: > Sure, I'ill measure results and come back if results will be > unsatisfactory. > Thanks very much for advice. > > Out of

Re: Solr cloud date based paritioning

2013-07-02 Thread Jan Høydahl
Not yet, you may look at https://issues.apache.org/jira/browse/SOLR-4221 for possible future solution. For the time being you could use Implicit routing, i.e. your application sends the docs directly to the correct shard, so every month your app code would decide to switch indexing to a new nod

Re: copyField and storage requirements

2013-07-02 Thread Shawn Heisey
On 7/2/2013 1:58 PM, Ali, Saqib wrote: Thanks Shawn. Here is the text_general type definition. We would like to bring the storage requirement down to a minimum for those 500KB content documents. We just need basic full-text search. Thanks!!! :)

Re: Access to Solr Wiki

2013-07-02 Thread Steve Rowe
I've added GoraMohanty to the Solr wiki's ContributorsGroup page. - Steve On Jul 2, 2013, at 3:25 PM, Gora Mohanty wrote: > Hi, > > May I please be added to the list of editors to the > Solr Wiki as I see that some earlier changes seem > to have gone missing. My user name is GoraMohanty > Thank

Re: Partial Matching in both query and field

2013-07-02 Thread Jack Krupansky
Ahhh... you put autoGeneratePhraseQueries="false" on the field - but it needs to be on the field type. You can see from the parsed query that it generated the phrase. -- Jack Krupansky -Original Message- From: James Bathgate Sent: Tuesday, July 02, 2013 5:35 PM To: solr-user@lucene.

Re: Two instances of solr - the same datadir?

2013-07-02 Thread Michael Della Bitta
Wouldn't it be better to do a RELOAD? http://wiki.apache.org/solr/CoreAdmin#RELOAD Michael Della Bitta

Re: Partial Matching in both query and field

2013-07-02 Thread James Bathgate
Jack, I've already tried that, here's my query: on on 0 0_extrafield1_n:20454 OR 10 2.2 Here's the parsed query: 0_extrafield1_n:"2o45 o454 2o454" Here's the applicable lines from schema.xml:

Re: Partial Matching in both query and field

2013-07-02 Thread Jack Krupansky
You will need to set q.op to "OR", and... use a field type that has the autoGeneratePhraseQueries attribute set to "false". -- Jack Krupansky -Original Message- From: James Bathgate Sent: Tuesday, July 02, 2013 5:10 PM To: solr-user@lucene.apache.org Subject: Partial Matching in both

Re: Newbie SolR - Need advice

2013-07-02 Thread Sandeep Mestry
Hi Fabio, Yes, you're on the right track. I'd now like to direct you to the first reply from Jack and go through the Solr tutorial. Even with Solr, it will take some time to learn the various bits and pieces about designing fields, their field types, server configuration, etc. and then tune the results to match

Partial Matching in both query and field

2013-07-02 Thread James Bathgate
Given a string of "123456" and a search query "923459", what should the schema look like to consider this a match because at least 4 consecutive characters in the query match 4 consecutive characters in the data? I'm trying an NGramFilterFactory on the index and NGramTokenizerFactory on the query i
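To make the matching rule in the question concrete, here is a minimal Python sketch of the idea only; it is not a Solr schema, and the helper names `ngrams` and `shares_ngram` are hypothetical. It checks whether two strings share at least one run of 4 consecutive characters, which is roughly what n-gram analysis on both the index and query side is meant to achieve.

```python
def ngrams(text, n):
    """Return the set of all substrings of length n in text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def shares_ngram(data, query, n=4):
    """True if data and query have at least one common run of n characters."""
    return bool(ngrams(data, n) & ngrams(query, n))


# The example from the question: the two strings overlap on one 4-gram.
common = ngrams("123456", 4) & ngrams("923459", 4)
print(sorted(common))
print(shares_ngram("123456", "923459"))
```

For the strings in the question the only shared 4-gram is "2345", so a single matching gram has to be enough for a hit; that is why the replies in this thread suggest OR semantics and disabling phrase query generation.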

Re: Two instances of solr - the same datadir?

2013-07-02 Thread Peter Sturge
The RO instance commit isn't (or shouldn't be) doing any real writing, just an empty commit to force new searchers, autowarm/refresh caches etc. Admittedly, we do all this on 3.6, so 4.0 could have different behaviour in this area. As long as you don't have autocommit in solrconfig.xml, there would

What are the options for obtaining IDF at interactive speeds?

2013-07-02 Thread Kathryn Mazaitis
Hi, I'm using SOLRJ to run a query, with the goal of obtaining: (1) the retrieved documents, (2) the TF of each term in each document, (3) the IDF of each term in the set of retrieved documents (TF/IDF would be fine too) ...all at interactive speeds, or <10s per query. This is a demo, so if all

RE: How to query Solr for empty field or specific value

2013-07-02 Thread Van Tassell, Kristian
Thank you! -Original Message- From: Jack Krupansky [mailto:j...@basetechnology.com] Sent: Tuesday, July 02, 2013 3:05 PM To: solr-user@lucene.apache.org Subject: Re: How to query Solr for empty field or specific value Better to define color.not_null as a boolean field and always initiali

Re: How to show just the parent domains from results in Solr

2013-07-02 Thread Jack Krupansky
Re-index your data with a separate field for "domain name", then either manually populate it or use an update processor to extract the domain name and store it in the desired field. You can then group by that field. The URL Classify update processor can do the trick. Or maybe a custom script w
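If the domain is populated manually before indexing, the extraction step could look roughly like the Python sketch below. This is only an illustration under assumptions: the field names `url` and `domain` and the helper `domain_of` are hypothetical, and the sketch does not use the update processor mentioned above.

```python
from urllib.parse import urlparse


def domain_of(url):
    """Return the lower-cased host name of a URL, or None if it has none."""
    host = urlparse(url).hostname
    return host.lower() if host else None


def with_domain(doc):
    """Return a copy of doc with a 'domain' field derived from its 'url' field."""
    enriched = dict(doc)
    enriched["domain"] = domain_of(doc.get("url", ""))
    return enriched


print(with_domain({"url": "http://www.example.com/a/b.html"}))
```

Once such a field is indexed, grouping on it (for example with `group=true&group.field=domain`) should return one entry per domain instead of one per page.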

Re: Solr cloud date based paritioning

2013-07-02 Thread kowish.adamosh
Sure, I'll measure the results and come back if they are unsatisfactory. Thanks very much for the advice. Out of curiosity: is there any way to partition shards (logical and physical) by a specified value of a specified field? Kowish

Re: How to query Solr for empty field or specific value

2013-07-02 Thread Jack Krupansky
Better to define color.not_null as a boolean field and always initialize as either true or false. But, even without that you need to write a pure negative query or clause as (*:* -term) So: select?q=*:*&fq=((*:* -color:[* TO *]) OR color:blue) and select?q=*:*&fq=((*:* -color.not_null
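The first filter above can be assembled programmatically. The following is a small Python sketch, assuming a client that builds the query string itself; the helper name `empty_or_value_fq` is hypothetical. It only constructs the parameters and does not contact a Solr server.

```python
from urllib.parse import urlencode


def empty_or_value_fq(field, value):
    """Build a filter matching docs where field is empty OR equals value.

    The (*:* -field:[* TO *]) part is the pure negative clause: start from
    all documents and subtract those that have any value in the field.
    """
    return "((*:* -{0}:[* TO *]) OR {0}:{1})".format(field, value)


params = {"q": "*:*", "fq": empty_or_value_fq("color", "blue")}
query_string = urlencode(params)  # URL-encodes spaces, brackets and colons
print("select?" + query_string)
```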

Re: copyField and storage requirements

2013-07-02 Thread Ali, Saqib
Thanks Shawn. Here is the text_general type definition. We would like to bring the storage requirement down to a minimum for those 500KB content documents. We just need basic full-text search. Thanks!!! :)

How to query Solr for empty field or specific value

2013-07-02 Thread Van Tassell, Kristian
Hello, I'm using Solr 4.2 and am trying to get a specific value (blue) or null field (no color) returned by my filter query. My results should yield 3 documents (If I execute the two separate filters in different queries, I get 2 hits for one query and 1 for the other). I've tried this (blue o

RE: Newbie SolR - Need advice

2013-07-02 Thread fabio1605
So, you keep your mssql database, you just don't use it for searches - that'll relieve some of the load. Searches then all go through SOLR & its Lucene indexes. If your various tables need SQL joins, you specify those in the DataImportHandler (DIH) config. That way, when SOLR indexes everything, i

Access to Solr Wiki

2013-07-02 Thread Gora Mohanty
Hi, May I please be added to the list of editors to the Solr Wiki as I see that some earlier changes seem to have gone missing. My user name is GoraMohanty Thanks. Regards, Gora

Re: Solr cloud date based paritioning

2013-07-02 Thread Gora Mohanty
On 2 July 2013 22:35, kowish.adamosh wrote: > Thanks! > > I have very limited response time (max 100ms) therefore sharding is a must. Really? "Sharding is a must" without any measurements to validate that assertion? I am not sure what advice to give you if you seem determined to ignore any, but a

Re: Converting nested data model to solr schema

2013-07-02 Thread adfel70
My current solution is overriding the out-of-the-box shard routing, and forcing each document and its attachment to go into a specific shard. But this is so I can support the query time joins (because joins are only performed between documents in the same shard). I'm a bit concerned by this approa

Re: DIH: HTMLStripTransformer in sub-entities?

2013-07-02 Thread Gora Mohanty
On 2 July 2013 20:55, Andy Pickler wrote: > Thanks for the quick reply. Unfortunately, I don't believe my company > would want me sharing our exact production schema in a public forum, > although I realize it makes it harder to diagnose the problem. The > sub-entity is a multi-valued field that

Re: Replicating files containing external file fields

2013-07-02 Thread Arun Rangarajan
Jack and Erick, Thanks for your replies. I am able to replicate ext file fields by specifying the relative paths for each individual file. confFiles in solrconfig.xml is really long now with lot of "../" and I got 5 ext file field files. Would be really nice if wild-cards were supported here :-).

Re: Solr large boolean filter

2013-07-02 Thread Mikhail Khludnev
Roman, It's covered in http://wiki.apache.org/solr/ContentStream | For POST requests where the content-type is not "application/x-www-form-urlencoded", the raw POST body is passed as a stream. So, there is no need for encoding of binary data inside the body. Regarding encoding, I have a pos

Filter cache pollution during sharded edismax queries

2013-07-02 Thread Ken Krugler
Hi all, After upgrading from Solr 3.5 to 4.2.1, I noticed our filterCache hit ratio had dropped significantly. Previously it was at 95+%, but now it's < 50%. I enabled recording 100 entries for debugging, and in looking at them it seems that edismax (and faceting) is creating entries for me.

Re: Two instances of solr - the same datadir?

2013-07-02 Thread Roman Chyla
Interesting, we are running 4.0 - and Solr will refuse to start (or reload) the core. But from looking at the code I am not seeing that it is doing any writing - but I should dig more... Are you sure it needs to do writing? Because I am not calling commits, in fact I have deactivated *all* components

Re: Two instances of solr - the same datadir?

2013-07-02 Thread Peter Sturge
Hmmm, single lock sounds dangerous. It probably works ok because you've been [un]lucky. For example, even with a RO instance, you still need to do a commit in order to reload caches/changes from the other instance. What happens if this commit gets called in the middle of the other instance's commit

Re: Solr large boolean filter

2013-07-02 Thread Roman Chyla
Hello Mikhail, Yes, GET is limited, but POST is not - so I just wanted that it works in both the same way. But I am not sure if I am understanding your question completely. Could you elaborate on the parameters/body part? Is there no need for encoding of binary data inside the body? Or do you mean

Re: copyField and storage requirements

2013-07-02 Thread Shawn Heisey
On 7/2/2013 12:22 PM, Ali, Saqib wrote: > Newbie question: > > We have the following fields defined in the schema: > > > > > > the content is field is about 500KB data. > > My question is whether Solr stores the entire contents of the that 500KB > content field? > > We want to minimize the

Re: Two instances of solr - the same datadir?

2013-07-02 Thread Roman Chyla
As I discovered, it is not good to use the 'native' locktype in this scenario; actually there is a note in the solrconfig.xml which says the same. When a core is reloaded and Solr tries to grab the lock, it will fail, even if the instance is configured to be read-only, so I am using the 'single' lock for the

Re: Request to Edit Solr Wiki

2013-07-02 Thread Erick Erickson
Done, added VivekShivaprabhu to the Solr contributor's group. Let us know if you need the alias instead. And thanks for helping with the Wiki! Erick On Tue, Jul 2, 2013 at 1:42 PM, Vivek Shivaprabhu wrote: > Hi > > I'd like to contribute to some of the page in the Solr Wiki at > wiki.apache

Request to Edit Solr Wiki

2013-07-02 Thread Vivek Shivaprabhu
Hi I'd like to contribute to some of the page in the Solr Wiki at wiki.apache.org/solr My username is VivekShivaprabhu (alias: vivekrs) Please do the needful. Thanks in advance! -Vivek R S

copyField and storage requirements

2013-07-02 Thread Ali, Saqib
Newbie question: We have the following fields defined in the schema: the content field is about 500KB of data. My question is whether Solr stores the entire contents of that 500KB content field? We want to minimize the stored data in the Solr index, that is why we added the copyField te

Re: Converting nested data model to solr schema

2013-07-02 Thread Mikhail Khludnev
During indexing the whole block (doc and its attachment) goes into a particular shard, then it can be queried on every shard and the results are merged. btw, do you feel any problem with your current approach - query time joins and out-of-the-box shard routing? On Tue, Jul 2, 2013 at 5:19 PM, adfel70

Re: set-based and other less common approaches to search

2013-07-02 Thread Mikhail Khludnev
try to hit dismax query parser specifying mm and qf parameters. On Tue, Jul 2, 2013 at 9:31 PM, gilawem wrote: > Thanks. So following up on a) below, could I set up and query Solr, > without any customization of code, to match 10 of my given 20 terms, but > only if it finds those 10 terms in an
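As a rough sketch of that suggestion, the Python below assembles request parameters for the dismax parser with a minimum-should-match value. The field name `content` and the helper `dismax_params` are hypothetical; `defType`, `q`, `qf` and `mm` are the standard parameter names. Nothing here contacts Solr.

```python
from urllib.parse import urlencode


def dismax_params(terms, fields, min_match):
    """Build dismax parameters requiring at least min_match of the terms."""
    return {
        "defType": "dismax",
        "q": " ".join(terms),     # the candidate search terms
        "qf": " ".join(fields),   # fields to search in
        "mm": str(min_match),     # "min should match": a count or a percentage
    }


terms = ["term{0}".format(i) for i in range(1, 21)]   # 20 placeholder terms
params = dismax_params(terms, ["content"], 10)        # match any 10 of the 20
print("select?" + urlencode(params))
```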

Re: Solr large boolean filter

2013-07-02 Thread Mikhail Khludnev
Hello Roman, Have you considered passing the long id sequence as the body and accessing it internally in Solr as a content stream? That makes base64 compression unnecessary. AFAIK url length is limited somehow, anyway. On Tue, Jul 2, 2013 at 9:32 PM, Roman Chyla wrote: > Wrong link to the parser, should be:

Re: Solr cloud date based paritioning

2013-07-02 Thread kowish.adamosh
Thanks! I have a very limited response time (max 100ms), therefore sharding is a must. The data also tends to grow up to tens of gigs. Is there any way to create a new logical shard at runtime? I want to logically partition my data by date. I'm still wondering how the example from docum

How to show just the parent domains from results in Solr

2013-07-02 Thread A Geek
Hi All, I've indexed documents in my Solr 4.0 index, with fields like URL, page_content etc. Now when I run a search query against the page_content I get a lot of URLs. Say I have 15 URL domains in total, and under these 15 domains I have all the pages indexed in SOLR. Is there a way in whic

Re: Solr large boolean filter

2013-07-02 Thread Roman Chyla
Wrong link to the parser, should be: https://github.com/romanchyla/montysolr/blob/master/contrib/adsabs/src/java/org/apache/solr/search/BitSetQParserPlugin.java On Tue, Jul 2, 2013 at 1:25 PM, Roman Chyla wrote: > Hello @, > > This thread 'kicked' me into finishing som long-past task of > sendi

Re: set-based and other less common approaches to search

2013-07-02 Thread gilawem
Thanks. So following up on a) below, could I set up and query Solr, without any customization of code, to match 10 of my given 20 terms, but only if it finds those 10 terms in an xls document under a column that is named "MyID" or "My ID" or "My I.D."? If so, what would that query look like? On

Re: Concurrent Modification Exception

2013-07-02 Thread adityab
Anyone , any suggestion or pointers for this issue?

Re: Solr large boolean filter

2013-07-02 Thread Roman Chyla
Hello @, This thread 'kicked' me into finishing some long-past task of sending/receiving large boolean (bitset) filter. We have been using bitsets with solr before, but now I sat down and wrote it as a qparser. The use cases, as you have discussed are: - necessity to send long list of ids as a

RE: Newbie SolR - Need advice

2013-07-02 Thread David Quarterman
Don’t worry Fabio - nobody knows everything (apart from Hossman). Following on from Sandeep, to use SOLR, you extract the data from your MSSQL DB using the DataImportHandler and you can then query it, facet it, pivot it to your heart's content. And fast! You can use almost anything to build the

Re: Tomcat Solr Server startup fails with FileNotFoundException

2013-07-02 Thread Shawn Heisey
On 7/2/2013 9:39 AM, Murthy Perla wrote: >I am newbie to solr. I've accidentally deleted indexed > files(manually using rm -rf command) on server from solr index folder. Then > on when ever I start my server its failing to start with FNF exception. How > can this be fixed quickly? I believ

Re: Newbie SolR - Need advice

2013-07-02 Thread fabio1605
Arrfh I see... So SolR is the search engine for a datastore. Is that what mongo is.. A datastore bit. Original message From: "Jack Krupansky-2 [via Lucene]" Date: 02/07/2013 17:51 (GMT+00:00) To: fabio1605 Subject: Re: Newbie SolR - Need

Re: Newbie SolR - Need advice

2013-07-02 Thread Walter Underwood
Solr is not a database and it does not handle SQL queries. --wunder On Jul 2, 2013, at 9:09 AM, fabio1605 wrote: > Thanks guys > > So SolR is actually a database replacement for mssql... Am I right > > > We have a lot of perl scripts that contains lots of sql insert queries. > Etc >

Re: Newbie SolR - Need advice

2013-07-02 Thread Shawn Heisey
On 7/2/2013 10:09 AM, fabio1605 wrote: > Thanks guys > > So SolR is actually a database replacement for mssql... Am I right > > > We have a lot of perl scripts that contains lots of sql insert queries. > Etc > > > How do we query the SolR database from scripts I know I have a l

Re: Newbie SolR - Need advice

2013-07-02 Thread Jack Krupansky
Consider DataStax Enterprise - it combines Cassandra for NoSql data storage with Solr for indexing - fully integrated. http://www.datastax.com/ -- Jack Krupansky -Original Message- From: fabio1605 Sent: Tuesday, July 02, 2013 12:44 PM To: solr-user@lucene.apache.org Subject: Re: Newb

Re: Newbie SolR - Need advice

2013-07-02 Thread fabio1605
Hi Ok I'm even more confused now... Sorry for even more stupid questions. So if it's not a database replacement, where do we keep the database then? We have a website that is a documentation website that stores documents. It has over 130 million records in a table and 50 million

Tomcat Solr Server startup fails with FileNotFoundException

2013-07-02 Thread Murthy Perla
Hi All, I am a newbie to Solr. I've accidentally deleted indexed files (manually, using the rm -rf command) on the server from the Solr index folder. Since then, whenever I start my server it fails to start with an FNF exception. How can this be fixed quickly? I'd appreciate it if anyone can suggest a quick fix

Re: set-based and other less common approaches to search

2013-07-02 Thread Otis Gospodnetic
Hi, Solr can do all of these. There are phrase queries, queries where you specify a field, the "mm" param for "min should match", etc. Otis -- Solr & ElasticSearch Support -- http://sematext.com/ Performance Monitoring -- http://sematext.com/spm On Tue, Jul 2, 2013 at 12:36 PM, gilawem wrote

set-based and other less common approaches to search

2013-07-02 Thread gilawem
Let's say I wanted to ask solr to find me any document that contains at least 100 out of some 300 search terms I give it. Can Solr do this out of the box? If not, what kind of customization would it require? Now let's say I want to further have the option to request that those terms a) must sho

Re: Newbie SolR - Need advice

2013-07-02 Thread Sandeep Mestry
Hi Fabio, No, Solr isn't the database replacement for MS SQL. Solr is built on top of Lucene which is a search engine library for text searches. Solr in itself is not a replacement for any database as it does not support any relational db features, however as Jack and David mentioned its fully op

Re: How to re-index Solr & get term frequency within documents

2013-07-02 Thread Otis Gospodnetic
Hi Tony, There is, you can do it with that SolrEntityProcessor I pointed out, if you have all your fields stored in Solr. Otis -- Solr & ElasticSearch Support -- http://sematext.com/ Performance Monitoring -- http://sematext.com/spm On Tue, Jul 2, 2013 at 2:00 AM, Tony Mullins wrote: > I use

Re: Solr cloud date based paritioning

2013-07-02 Thread Otis Gospodnetic
Hi, There is nothing automatic that I know of that will create shards (or maybe you mean SolrCloud Collections?) every month. You can do that in your application, though, just create the Collection via the API. You can make use of aliases to have something like "last2months" alias point to your l
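The application-side approach described above could be sketched along these lines in Python. Everything here is an assumption for illustration: the `docs_YYYYMM` naming scheme and the helper names are hypothetical, and the sketch assumes the Collections API `CREATEALIAS` action is available in the Solr version in use. It only builds the parameters and sends nothing.

```python
from datetime import date
from urllib.parse import urlencode


def month_collection(day, prefix="docs"):
    """Name of the (hypothetical) collection holding the month of 'day'."""
    return "{0}_{1:04d}{2:02d}".format(prefix, day.year, day.month)


def previous_month(day):
    """First day of the month before the month of 'day'."""
    if day.month == 1:
        return date(day.year - 1, 12, 1)
    return date(day.year, day.month - 1, 1)


def last_two_months_alias(today, alias="last2months"):
    """Parameters that point the alias at the current and previous month."""
    names = [month_collection(today), month_collection(previous_month(today))]
    return {"action": "CREATEALIAS", "name": alias, "collections": ",".join(names)}


params = last_two_months_alias(date(2013, 7, 2))
print("admin/collections?" + urlencode(params))
```

The application would re-issue this request at the start of each month, after creating the new month's collection, so that queries against the alias always cover the two most recent months.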

RE: Newbie SolR - Need advice

2013-07-02 Thread fabio1605
Thanks guys. So SolR is actually a database replacement for mssql... Am I right? We have a lot of perl scripts that contain lots of sql insert queries, etc. How do we query the SolR database from scripts? I know I have a lot to learn still so excuse my ignorance. Also... What i

RE: Newbie SolR - Need advice

2013-07-02 Thread David Quarterman
Hi Fabio, Like Jack says, try the tutorial. But to answer your question, SOLR isn't a bolt on to SQLServer or any other DB. It's a fantastically fast indexing/searching tool. You'll need to use the DataImportHandler (see the tutorial) to import your data from the DB into the indices that SOLR u

Re: Unique key error while indexing pdf files

2013-07-02 Thread Shalin Shekhar Mangar
See http://wiki.apache.org/solr/DataImportHandler#FileListEntityProcessor "The implicit fields generated by the FileListEntityProcessor are fileDir, file, fileAbsolutePath, fileSize, fileLastModified and these are available for use within the entity" On Tue, Jul 2, 2013 at 2:47 PM, archit2112 wr

Re: OOM killer script woes

2013-07-02 Thread Mark Miller
Please file a JIRA issue so that we can address this. - Mark On Jul 2, 2013, at 6:20 AM, Daniel Collins wrote: > On looking at the code in SolrDispatchFilter, is this intentional or not? > I think I remember Mark Miller mentioning that in an OOM case, the best > course of action is basically to

Re: Newbie SolR - Need advice

2013-07-02 Thread Jack Krupansky
Start with the Solr Tutorial. http://lucene.apache.org/solr/tutorial.html -- Jack Krupansky -Original Message- From: fabio1605 Sent: Tuesday, July 02, 2013 11:16 AM To: solr-user@lucene.apache.org Subject: Newbie SolR - Need advice Hi we have a MSSQL Server which is just getting far

Newbie SolR - Need advice

2013-07-02 Thread fabio1605
Hi, we have an MSSQL Server which is just getting far too large now and performance is dying! The majority of our webservers are mainly doing search functions, so I thought it may be best to move to SolR. But I know very little about it! My questions are! Does SolR Run as a bolt on to MSSQL - as in th

Re: Solr indexer and Hadoop

2013-07-02 Thread Michael Della Bitta
Yes, I've read directly from NFS. Consider the case where your mapper takes as input a list of the file paths to operate on. Your mapper would load each file one by one by using standard java.io.* calls, build a SolrInputDocument out of each one, and submit it to a SolrServer implementation stored

Re: DIH: HTMLStripTransformer in sub-entities?

2013-07-02 Thread Andy Pickler
Thanks for the quick reply. Unfortunately, I don't believe my company would want me sharing our exact production schema in a public forum, although I realize it makes it harder to diagnose the problem. The sub-entity is a multi-valued field that indeed does have a relationship to the outer entity

Re: DIH: HTMLStripTransformer in sub-entities?

2013-07-02 Thread Gora Mohanty
On 2 July 2013 20:29, Andy Pickler wrote: > Solr 4.1.0 > > We've been using the DIH to pull data in from a MySQL database for quite > some time now. We're now wanting to strip all the HTML content out of many > fields using the HTMLStripTransformer ( > http://wiki.apache.org/solr/DataImportHandle

Re: Using per-segment FieldCache or DocValues in custom component?

2013-07-02 Thread Robert Muir
Where do you get the docid from? Usually it's best to just look at the whole algorithm, e.g. docids come from per-segment readers by default anyway so ideally you want to access any per-document things from that same segmentreader. As far as supporting docvalues, FieldCache API "passes thru" to doc

Re: Solr cloud date based paritioning

2013-07-02 Thread Gora Mohanty
On 2 July 2013 20:05, kowish.adamosh wrote: > Hi guys! > > I have simple use case to implement but I have problem with date based > partitioning... Here are some rules: > > 1. At the beginning I have to create huge index (10GB) based on one db > table. > 2. Every day I have to update this index. >

How to disable debug in Solrj

2013-07-02 Thread Jean-Pierre Lauris
Hi, I'm running the jetty start.jar and I'm indexing documents with Solrj's HttpSolrServer object : SolrServer server = new HttpSolrServer("http://HOST:8983/solr/"); server.add( docs ); server.commit(); This leads to TONS of debug information (i.e. logs at level DEBUG), on both server and client

Re: Solr - working with delta import and cache

2013-07-02 Thread Mysurf Mail
BTW: Just found out that a delta import is only supported by the SqlEntityProcessor . Does it matter that I defined processor="CachedSqlEntityProcessor"? On Tue, Jul 2, 2013 at 5:58 PM, Mysurf Mail wrote: > I have two entities in 1:n relation - PackageVersion and Tag. > I have configured DIH to

Solr cloud date based paritioning

2013-07-02 Thread kowish.adamosh
Hi guys! I have simple use case to implement but I have problem with date based partitioning... Here are some rules: 1. At the beginning I have to create huge index (10GB) based on one db table. 2. Every day I have to update this index. 3. 99,999% are queries based on date field (*data from last

Solr - working with delta import and cache

2013-07-02 Thread Mysurf Mail
I have two entities in 1:n relation - PackageVersion and Tag. I have configured DIH to use CachedSqlEntityProcessor and everything works as planned. First, Tag entity is selected using the query attribute. Then the main entity. Ultra Fast. Now I am adding the delta import. Everything runs and load

DIH: HTMLStripTransformer in sub-entities?

2013-07-02 Thread Andy Pickler
Solr 4.1.0 We've been using the DIH to pull data in from a MySQL database for quite some time now. We're now wanting to strip all the HTML content out of many fields using the HTMLStripTransformer ( http://wiki.apache.org/solr/DataImportHandler#HTMLStripTransformer). Unfortunately, while it seem

Re: Spell check in SOLR

2013-07-02 Thread Shalin Shekhar Mangar
See http://wiki.apache.org/solr/SpellCheckComponent On Tue, Jul 2, 2013 at 4:14 PM, Prathik Puthran wrote: > Hi, > > How can i configure SOLR to provide corrections for misspelled words. If > the query string is in dictionary SOLR should not return any suggestions. > But if the query string is no

Re: Converting nested data model to solr schema

2013-07-02 Thread adfel70
I'm not familiar with block join in lucene. I've read a bit, and I just want to make sure - do you think that when this ticket is released, it will solve the current problem of solr cloud joins? Also, can you elaborate a bit about your solution? Jack Krupansky-2 wrote > It sounds like 4.4 will h

Re: Converting nested data model to solr schema

2013-07-02 Thread Jack Krupansky
It sounds like 4.4 will have an RC next week, so the prospects for block join in 4.4 are kind of dim. I mean, such a significant feature should have more than a few days to bake before getting released. But... who knows what Yonik has planned! -- Jack Krupansky -Original Message- Fro

Re: documentCache not used in 4.3.1?

2013-07-02 Thread Daniel Collins
Cheers, it's certainly something we might end up exploring. On 2 July 2013 12:41, Erick Erickson wrote: > This takes some significant custom code, but... > > One strategy is to keep your commits relatively > lengthy (depends on the ingest rate) and keep > a "side car" index either a small core o

Re: need distance in miles not in kilometers

2013-07-02 Thread irshad siddiqui
Jack, Thanks for your response. In the case of frange we do not want to multiply separately for the conversion, so is there any way to convert it into miles? my Query: http://localhost:8983/solr/select?q=name:shop&fl=name,shopLocation,shopMaxDeliveryDistance,geodist%28shopLocation,0.0,0.0%29&s

Re: need distance in miles not in kilometers

2013-07-02 Thread Jack Krupansky
Simply multiply by the number of miles per kilometer, 0.621371: fl=_dist_:mul(geodist(),0.621371) -- Jack Krupansky -Original Message- From: irshad siddiqui Sent: Tuesday, July 02, 2013 5:19 AM To: solr-user@lucene.apache.org Subject: need distance in miles not in kilometers Hi, I
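The conversion is plain arithmetic, as this small Python sketch shows. The helper names are hypothetical; the factor 0.621371 and the `fl` expression are the ones given above.

```python
MILES_PER_KM = 0.621371


def km_to_miles(km):
    """Convert a distance in kilometers to miles."""
    return km * MILES_PER_KM


def miles_to_km(miles):
    """Convert a distance in miles to kilometers."""
    return miles / MILES_PER_KM


def miles_fl(alias="_dist_"):
    """Build the fl pseudo-field that returns geodist() in miles."""
    return "{0}:mul(geodist(),{1})".format(alias, MILES_PER_KM)


print(miles_fl())
print(km_to_miles(10))
```

For a range filter, one alternative to multiplying inside the query is to convert the miles threshold to kilometers on the client with `miles_to_km` and pass that value as the bound, since geodist() itself works in kilometers.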

Re: Solr 4.3 Pivot Performance Issue

2013-07-02 Thread Jack Krupansky
What is the nature of your degradation? -- Jack Krupansky -Original Message- From: solrUserJM Sent: Tuesday, July 02, 2013 4:22 AM To: solr-user@lucene.apache.org Subject: Solr 4.3 Pivot Performance Issue Hi There, I notice with the upgrade from solr 4.0 to solr 4.3 that we had a deg

Re: Converting nested data model to solr schema

2013-07-02 Thread adfel70
As you see it, does SOLR-3076 fix my problem? Is the SOLR-3076 fix getting into solr 4.4? Mikhail Khludnev wrote > On Mon, Jul 1, 2013 at 5:56 PM, adfel70 < > adfel70@ > > wrote: > >> This requires me to override the solr document distribution mechanism. >> I fear that with this solution I may

Re: documentCache not used in 4.3.1?

2013-07-02 Thread Erick Erickson
This takes some significant custom code, but... One strategy is to keep your commits relatively lengthy (depends on the ingest rate) and keep a "side car" index either a small core or a RAMDirectory. Then at search time you "somehow" combine the two results. The "somehow" is a bit tricky since the

Re: Stemming query in Solr

2013-07-02 Thread Erick Erickson
Somehow we're mis-communicating here. Forget expansion, it's all about base forms. bq: What I cannot figure out is how is this going to help me in instructing Solr to execute the query for the different grammatical variations of the input search term stem You don't. You search the stemmed input

Re: undefined field http:// while searchi query

2013-07-02 Thread Daniel Collins
Presuming that uses the standard lucene query parser syntax then you have asked to query for the field called http, searching for the value // www.google.co.in See http://wiki.apache.org/solr/SolrQuerySyntax for more details, but you probably want to escape the : at least, http\://www.google.co.in
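A client can apply that escaping before sending the query. The Python sketch below is one way to do it, under assumptions: the helper name `escape_query_text` is hypothetical, and the character list follows the Lucene query parser's documented special characters. It also escapes the forward slash, which is only special in newer versions; escaping it is expected to be harmless otherwise, but that is worth verifying against the Solr version in use.

```python
# Characters with special meaning to the Lucene query parser.
SPECIAL_CHARS = set('\\+-!():^[]"{}~*?|&/')


def escape_query_text(text):
    """Backslash-escape every query-parser special character in text."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in text)


# The colon no longer separates a field name from a value after escaping.
print(escape_query_text("http://www.google.co.in"))
```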

parent Import Query doent run

2013-07-02 Thread Mysurf Mail
I have a 1:n relation between my main entity (PackageVersion) and its tag in my DB. I add a new tag with this date to the db at the timestamp and I run the delta import command. The select retrieves the line but I don't see any other SQL. Here are my data-config.xml configurations:

RE: undefined field http:// while searchi query

2013-07-02 Thread Markus Jelsma
colons need to be escaped cheers -Original message- > From:aniljayanti > Sent: Tuesday 2nd July 2013 12:35 > To: solr-user@lucene.apache.org > Subject: undefined field http:// while searchi query > > Hi, > > I am using solr 3.3 version. After indexing I am querying below command. >

Spell check in SOLR

2013-07-02 Thread Prathik Puthran
Hi, How can I configure Solr to provide corrections for misspelled words? If the query string is in the dictionary, Solr should not return any suggestions. But if the query string is not in the dictionary, Solr should return all possible corrected words in the dictionary which most likely could be the query

Re: No date.gap on pivoted facets

2013-07-02 Thread Dotan Cohen
On Sun, Jun 30, 2013 at 5:33 PM, Jack Krupansky wrote: > Sorry, but Solr pivot faceting is based solely on "field" facets, not > "range" (or "date") facets. > Thank you. I tried adding that information to the SimpleFacetParameters wiki page, but that page seems to be defined as "Immutable Page".

Solr 4.3 Pivot Performance Issue

2013-07-02 Thread solrUserJM
Hi there, I noticed that after upgrading from Solr 4.0 to Solr 4.3 we had a performance degradation in queries that use pivot fields. Has anyone else noticed it too? Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-4-3-Pivot-Performance-Issue-tp4074617.html

undefined field http:// while searchi query

2013-07-02 Thread aniljayanti
Hi, I am using Solr 3.3. After indexing, I run the query below. http://localhost:8080/solr/select/?q=*(http://www.google.co.in)* and get the error below. SEVERE: org.apache.solr.common.SolrException: *undefined field http://* at org.apache.solr.schema.IndexSchema.getDynamicFiel

Aggregate TermFrequency on Result Grouping / Field Collapsing

2013-07-02 Thread Tony Mullins
Hi, Is it possible to perform an aggregated termfreq(field,term) on Result Grouping? I am trying to get the total count of a term's appearances in a document and then want to aggregate that count by grouping the documents on one of my fields. Like this http://localhost:8080/solr/collection1/select?q=iphon

Re: OOM killer script woes

2013-07-02 Thread Daniel Collins
On looking at the code in SolrDispatchFilter, is this intentional or not? I think I remember Mark Miller mentioning that in an OOM case, the best course of action is basically to kill the process, there is very little Solr can do once it has run out of memory. Yet it seems that Solr catches the O

need distance in miles not in kilometers

2013-07-02 Thread irshad siddiqui
Hi, I am using Solr 4.2 and my results are coming back correctly, but now I want the distance in miles and I am getting it in kilometers. Can anyone tell me how to get the distance in miles? Example query &q=*:*&fq={!geofilt}&sfield=latlng&pt=18.9322453,72.8264378001&d=60&fl=_dist_:geod
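
One way to handle this on the client side is sketched below, assuming (as the post reports) that the returned distances and the `d` filter radius are in kilometers. The helper names and the sample distances are hypothetical and not taken from the thread.

```python
KM_PER_MILE = 1.609344  # exact, by definition of the international mile


def km_to_miles(km):
    """Convert a distance in kilometers to miles."""
    return km / KM_PER_MILE


def miles_to_km(miles):
    """Convert a distance in miles to kilometers."""
    return miles * KM_PER_MILE


# A 60-mile search radius expressed in kilometers for the d parameter.
radius_km = miles_to_km(60)

# Hypothetical distances (in km) as returned in the _dist_ pseudo-field.
hits_km = [12.5, 48.0]
hits_miles = [round(km_to_miles(d), 2) for d in hits_km]
```

If function queries are accepted in `fl` on your Solr version, multiplying the distance by roughly 0.621371 on the server side may achieve the same result; that is an assumption to verify rather than something confirmed in this thread.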

Re: Unique key error while indexing pdf files

2013-07-02 Thread archit2112
Yes. The absolute path is unique. How do I implement it? Can you please explain? -- View this message in context: http://lucene.472066.n3.nabble.com/Unique-key-error-while-indexing-pdf-files-tp4074314p4074638.html

Re: Removal of unique key - Query Elevation Component

2013-07-02 Thread archit2112
Thanks! The author_s issue has been resolved. Why are the other fields not getting indexed? -- View this message in context: http://lucene.472066.n3.nabble.com/Removal-of-unique-key-Query-Elevation-Component-tp4074624p4074636.html

Re: Solr indexer and Hadoop

2013-07-02 Thread Anatoli Matuskova
If you can upload your data to HDFS, you can use this patch to build the Solr indexes: https://issues.apache.org/jira/browse/SOLR-1301 -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-indexer-and-Hadoop-tp4072951p4074635.html

Re: Removal of unique key - Query Elevation Component

2013-07-02 Thread Shalin Shekhar Mangar
My guess is that you have a copyField element which copies the author into an author_s field. On Tue, Jul 2, 2013 at 2:14 PM, archit2112 wrote: > > I want to index pdf files in solr 4.3.0 using the data import handler. > > I have done the following: > > My request handler - > > class="org.apache.solr.han
