Atomic updates behavior

2012-08-04 Thread as4j1th .
Hi,

We've been evaluating the atomic update feature in Solr4.0. It would be
great if anyone can shed some light on the following.

#) There seems to be more than one way to pass data to UpdateJSON for
add/update. What would be the recommended way?
https://gist.github.com/3256587 OR https://gist.github.com/3256602

#) For atomic updates, we see that copyField destination fields are getting
appended to.
For example, if we have the entry <copyField source="name" dest="text"/>
in schema.xml, every atomic update appends the value of the field
name to text again.
Is there any way to override this behavior?


Thanks
Sajith


Re: Atomic updates behavior

2012-08-04 Thread as4j1th .

 Hi,

 We've been evaluating the atomic update feature in Solr4.0. It would be
 great if anyone can shed some light on the following.

 #) There seems to be more than one way to pass data to UpdateJSON for
 add/update. What would be the recommended way?
 https://gist.github.com/3256587 OR https://gist.github.com/3256602

 #) For atomic updates, we see that copyField destination fields are getting
 appended to.
 For example, if we have the entry <copyField source="name" dest="text"/>
 in schema.xml, every atomic update appends the value of the field
 name to text again.
 Is there any way to override this behavior?

 Looks like resetting the destination field of the copyField to null will
solve the problem [ https://gist.github.com/3256716 ] . Not sure whether
this would be the recommended solution though.
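
The workaround can be sketched as follows. This is only a minimal
illustration, not the content of the gist: it builds a JSON atomic-update
payload using Solr's "set" modifier and sets the copyField destination to
null so that it is rebuilt rather than appended to. The unique key name
"id", the document id "book1" and the helper name atomic_update are
assumptions made for the example; "name" and "text" come from the schema
entry quoted above.

```python
import json


def atomic_update(doc_id, changes, reset_fields=()):
    """Build one atomic-update document (hypothetical helper).

    changes:      {field: new_value}, each applied with the "set" modifier.
    reset_fields: copyField destinations to clear by setting them to null.
    """
    doc = {"id": doc_id}  # assumes the schema's unique key is called "id"
    for field, value in changes.items():
        doc[field] = {"set": value}
    for field in reset_fields:
        # null (None in Python) clears the existing value of the field
        doc[field] = {"set": None}
    return doc


# One update that changes "name" and resets the copyField destination "text".
payload = json.dumps(
    [atomic_update("book1", {"name": "New name"}, reset_fields=["text"])]
)
print(payload)
```

The resulting string is what would be posted to the JSON update handler;
sending it is left out so the sketch stays independent of any server.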


Re: AW: AW: auto completion search with solr using NGrams in SOLR

2012-08-04 Thread Ahmet Arslan


--- On Sat, 8/4/12, aniljayanti anil.jaya...@gmail.com wrote:

 From: aniljayanti anil.jaya...@gmail.com
 Subject: Re: AW: AW: auto completion search with solr using NGrams in SOLR
 To: solr-user@lucene.apache.org
 Date: Saturday, August 4, 2012, 8:57 AM
 Hi,
 thanks.
 
 While searching, I will search by either empname or title only, and I am
 not using any asterisks in the query.
 For example, if I search for mic, the results should look like:
 
 michale jackson
 michale border
 michale smith
 
 I want the results to behave just like Google search.
 
 Can you suggest which configuration I need to add or change to get
 results like Google search?
 Which tokenizers do I need for this result?
 Can you tell me how to write a query for this?

The only suspicious thing is omitTermFreqAndPositions="true". Try changing
it to false.

Also see Chantal's message http://search-lucene.com/m/rPrhO1RIlfQ


Re: auto completion search with solr using NGrams in SOLR

2012-08-04 Thread Jan Høydahl
Have a look at my blog post 
http://www.cominvent.com/2012/01/25/super-flexible-autocomplete-with-solr/ for 
a walkthrough of how it could be done, as a separate Solr core.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

On 1. aug. 2012, at 12:04, aniljayanti wrote:

 I want to implement an auto completion search with solr using NGrams. If the
 user is searching for names of employees, then auto completion should be
 applied, i.e.:
 
 if the user types j, show the names starting with j
 if the user types ja, show the names starting with ja
 if the user types jac, show the names starting with jac
 if the user types jack, show the names starting with jack
 
 Below are my configuration settings in schema.xml. Please tell me if
 anything is wrong.
 
 <fieldType name="edgytext" class="solr.TextField" positionIncrementGap="100">
   <analyzer type="index">
     <tokenizer class="solr.KeywordTokenizerFactory"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="15"/>
   </analyzer>
   <analyzer type="query">
     <tokenizer class="solr.KeywordTokenizerFactory"/>
     <filter class="solr.LowerCaseFilterFactory"/>
   </analyzer>
 </fieldType>
 <field name="empname" type="edgytext" indexed="true" stored="true"/>
 <field name="autocomplete_text" type="edgytext" indexed="true" stored="true"
        omitNorms="true" omitTermFreqAndPositions="true"/>
 <copyField source="empname" dest="text"/>
 
 When I search for the name mado or madonna, I get employee names. But
 when I search for madon, I do not get any data.
 
 Please help me on this.
 
 
 Thanks in Advance,
 
 Anil.
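
What the edgytext analysis chain in the message above should produce can be
sketched in plain Python. This is a simplified emulation, not Solr code: it
assumes KeywordTokenizer keeps the whole value as one token, LowerCaseFilter
lowercases it, and EdgeNGramFilter emits the leading substrings between
minGramSize and maxGramSize. The function names are made up for the
illustration.

```python
def index_terms(value, min_gram=1, max_gram=15):
    """Emulate the index-time chain: keyword token, lowercase, edge n-grams."""
    token = value.lower()
    longest = min(len(token), max_gram)
    return {token[:n] for n in range(min_gram, longest + 1)}


def query_term(typed):
    """Emulate the query-time chain: keyword token, lowercase, no n-grams."""
    return typed.lower()


def matches(indexed_value, typed):
    """True if the typed text equals one of the indexed edge n-grams."""
    return query_term(typed) in index_terms(indexed_value)


print(sorted(index_terms("Jack"), key=len))
print(matches("Madonna", "madon"))
print(matches("Michale Jackson", "mic"))
print(matches("Michale Jackson", "jack"))
```

Under these assumptions madon is among the indexed terms for madonna, so a
miss for that input suggests checking which field the query actually runs
against. The last call also shows that, with a keyword tokenizer, only
prefixes of the whole value match, not prefixes of later words.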
 
 
 
 



Re: Adding new field before import- using post.jar

2012-08-04 Thread Rajani Maski
Thank you for the reply.

8. How about extending the XmlUpdateRequestHandler class? Is that possible,
and is it a good approach?


Regards
Rajani






On Fri, Aug 3, 2012 at 8:32 PM, Erik Hatcher erik.hatc...@gmail.com wrote:

 I hate to also add:

   6. Use DataImportHandler

 It can index Solr XML, and could add field values, either statically or by
 template glue if you need to combine multiple field values somehow.

 And in 4.0 you'll be able to use:

   7: scripting update processor

 Erik


 On Aug 3, 2012, at 10:51 , Jack Krupansky wrote:

  1. Google for XSLT tools.
  2. Write a script that loads the XML, adds the fields, and writes the
 updated XML.
  3. Same as #2, but using Java.
  4. If the fields are constants, set default values in the schema and
 then the documents will automatically get those values when added. Take the
 default value attributes out of the schema once you have input documents
 that actually have the new field values.
  5. Hire a consultant.
 
  -- Jack Krupansky
 
  -Original Message- From: Rajani Maski
  Sent: Friday, August 03, 2012 5:37 AM
  To: solr-user@lucene.apache.org
  Subject: Adding new field before import- using post.jar
 
  Hi all,
 
  I have xmls in a folder in the standard solr xml format. I was simply
 using
  SimplePostTool.java to import these xmls to solr. Now I have to add 3 new
  fields to each document in the xml before doing a post.
 
  What can be the effective way for doing this?
 
 
  Thanks & Regards
  Rajani




Re: Adding new field before import- using post.jar

2012-08-04 Thread Jack Krupansky

Where are the values of the three new fields coming from?

Are they constant/default values?
Computed from other fields in the XML?

From other XML files?
From a text file?
From a database?

Or where?

So, given a specific Solr XML input document, how will you be accessing the 
three field values to add?


This may guide the approach that you could/should take.

-- Jack Krupansky

-Original Message- 
From: Rajani Maski

Sent: Friday, August 03, 2012 5:37 AM
To: solr-user@lucene.apache.org
Subject: Adding new field before import- using post.jar

Hi all,

I have xmls in a folder in the standard solr xml format. I was simply using
SimplePostTool.java to import these xmls to solr. Now I have to add 3 new
fields to each document in the xml before doing a post.

What can be the effective way for doing this?


Thanks & Regards
Rajani 



Re: Sorting fields of text_general fieldType

2012-08-04 Thread Erick Erickson
Did you re-index everything after the change you made? Your old docs
will be sorted by null values in the title_sort field, so they'd all come out
first or last depending, then sub-sorted by internal Lucene doc ID.

If you have, can you just create an index with, say, 6 titles that sorts
improperly and give us the output from your app?

I find it very unlikely that this is really broken, lots and lots and lots
of people are using this all the time so my first guess is it's something
you're doing that _seems_ harmless. Don't get me wrong, there could
indeed be a bug here, it just seems unlikely.

To be really safe, I'd stop my Solr server and blow away the
solr_home/data/index directory. Remove the directory itself
not just the contents and start indexing over again.

Best
Erick

On Fri, Aug 3, 2012 at 4:30 AM, Anupam Bhattacharya anupam...@gmail.com wrote:
 A few titles are as follows:

 Embattled JPMorgan boss survives power challenge - Jakarta Globe

 Kitten Survives 6500-Mile Trip in China-US Container - Jakarta Globe

 Guard survives hail of bullets - Jakarta Post

 On Fri, Aug 3, 2012 at 1:09 PM, Lance Norskog goks...@gmail.com wrote:

 Give us some pairs of titles which sort the wrong way.

 On Thu, Aug 2, 2012 at 10:06 AM, Anupam Bhattacharya
 anupam...@gmail.com wrote:
  The approach used to work perfectly.
 
  But recently I realized that it does not work for more than 30 indexed
  records.
  I am using Solr 3.5.
 
  Is there another approach to sort a title field in proper alphabetical
  order irrespective of lower case and upper case?
 
  Regards
  Anupam
 
  On Thu, May 17, 2012 at 4:32 PM, Ahmet Arslan iori...@yahoo.com wrote:
 
   The title sort works in a strange manner because the Solr server
   treats title strings differently depending on upper or lower case.
   Thus if we sort in ascending order, the titles starting with digits
   show up first, then the titles starting with upper case in
   alphabetical order, and after that the titles starting with lower case.
  
   The title field is indexed as text_general fieldtype.
  
   <field name="title" type="text_general" indexed="true" stored="true"/>
 
  Please see Otis' response http://search-lucene.com/m/uDxTF1scW0d2
 
  Simply create an additional field named title_sortable with the
  following type:
 
  <!-- lowercases the entire field value, keeping it as a single token. -->
  <fieldType name="lowercase" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.TrimFilterFactory"/>
    </analyzer>
  </fieldType>
 
  Populate it via a copyField directive:
 
  <copyField source="title" dest="title_sortable" maxChars="N"/>
 
  then sort=title_sortable asc
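
The effect of sorting on such a lowercased single-token field can be
illustrated outside Solr. The sketch below is plain Python, not Solr code,
and the sample titles are invented; sort_key only mimics the keyword
tokenizer, lowercase filter and trim filter of the field type above.

```python
def sort_key(title):
    """Mimic the analysis chain: whole value as one token, lowercased, trimmed."""
    return title.lower().strip()


titles = ["banana split", "Apple pie", "Cherry tart", "42 cats"]

# Plain string order: digits, then upper case, then lower case.
raw_order = sorted(titles)

# Order on the normalized key: case no longer matters.
sortable_order = sorted(titles, key=sort_key)

print(raw_order)
print(sortable_order)
```

The first ordering reproduces the behaviour described in the original
question (digits, then upper case, then lower case); the second is what
sorting on title_sortable is meant to give.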
 
 
 



 --
 Lance Norskog
 goks...@gmail.com




 --
 Thanks & Regards
 Anupam Bhattacharya


Re: Special suggestions requirement

2012-08-04 Thread Erick Erickson
Would it work to use TermsComponent with wildcards?
Something like terms.regex=ABCD42??...

see: http://wiki.apache.org/solr/TermsComponent/
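
The idea can be tried out without a server. The sketch below assumes that
the indexed terms of the suggestion field are edge n-grams of the part
numbers (EdgeNGramFilterFactory is mentioned later in the thread) and
emulates a regex filter over the term list in plain Python. The part
numbers and the function names are invented for the illustration, and "."
is used as the single-character wildcard, as in a regular expression.

```python
import re


def edge_ngrams(term, min_size=1, max_size=15):
    """Leading substrings of a term, as an edge n-gram filter would emit."""
    longest = min(len(term), max_size)
    return [term[:n] for n in range(min_size, longest + 1)]


def suggest(part_numbers, typed, extra_chars=2, limit=100):
    """Distinct terms equal to the typed text plus exactly extra_chars characters."""
    terms = set()
    for part_number in part_numbers:
        terms.update(edge_ngrams(part_number))
    pattern = re.compile(re.escape(typed) + "." * extra_chars)
    # Sorting the set gives the alphabetical order, smallest value first.
    return sorted(t for t in terms if pattern.fullmatch(t))[:limit]


parts = ["ABCD110040", "ABCD155500", "ABCD420100", "ABCD420200", "ABCD990000"]
print(suggest(parts, "ABCD"))
print(suggest(parts, "ABCD42"))
```

Because the terms are collected in a set, every suggestion appears once no
matter how many part numbers share the prefix.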

Best
Erick


On Fri, Aug 3, 2012 at 9:07 AM, Michael Della Bitta
michael.della.bi...@appinions.com wrote:
 I could be crazy, but it sounds to me like you need a trie, not a
 search index: http://en.wikipedia.org/wiki/Trie

 But in any case, what you want to do should be achievable. It seems
 like you need to do EdgeNgrams and facet on the results, where
 facet.counts > 1 to exclude the actual part numbers, since each of
 those would be distinct.

 I'm on the train right now, so I can't test this. :\

 Michael Della Bitta

 
 Appinions | 18 East 41st St., Suite 1806 | New York, NY 10017
 www.appinions.com
 Where Influence Isn’t a Game


 On Thu, Aug 2, 2012 at 9:19 PM, Lochschmied, Alexander
 alexander.lochschm...@vishay.com wrote:
 Even with prefix query, I do not get ABCD02 or any ABCD02... back. BTW: 
 EdgeNGramFilterFactory is used on the field we are getting the 
 suggestions/spellchecks from.
 I think the problem is that there are a lot of different part numbers 
 starting with ABCD and every part number has the same length. I showed 
 only 4 in the example but there might be thousands.

 Here are some full part number examples that might be in the index:
 ABCD110040
 ABCD00
 ABCD99
 ABCD155500
 ...

 I'm looking for a way to make Solr return distinct list of fixed length 
 substrings of them, e.g. if ABCD is entered, I would need
 ABCD00
 ABCD01
 ABCD02
 ABCD03
 ...
 ABCD99

 Then if user chose ABCD42 from the suggestions, I would need
 ABCD4201
 ABCD4202
 ABCD4203
 ...
 ABCD4299

 and so on.

 I would be able to do some post processing if needed or adjust the schema 
 or indexing process. But the key functionality I need from Solr is returning 
 distinct set of those suggestions where only the last two characters change. 
 All of the available combinations of those last two characters must be 
 considered though. I need to show alpha-numerically sorted suggestions; the 
 smallest value first.

 Thanks,
 Alexander

 -Ursprüngliche Nachricht-
 Von: Michael Della Bitta [mailto:michael.della.bi...@appinions.com]
 Gesendet: Donnerstag, 2. August 2012 15:02
 An: solr-user@lucene.apache.org
 Betreff: Re: Special suggestions requirement

 In this case, we're storing the overall value length and sorting it on that, 
 then alphabetically.

 Also, how are your queries fashioned? If you're doing a prefix query, 
 everything that matches it should score the same. If you're only doing a 
 prefix query, you might need to add a term for exact matches as well to get 
 them to show up.

 Michael Della Bitta

 
 Appinions | 18 East 41st St., Suite 1806 | New York, NY 10017 
 www.appinions.com Where Influence Isn't a Game


 On Wed, Aug 1, 2012 at 9:58 PM, Lochschmied, Alexander 
 alexander.lochschm...@vishay.com wrote:
 Is there a way to offer distinct, alphabetically sorted, fixed length 
 options?

 I am trying to suggest part numbers and I'm currently trying to do it with 
 the spellchecker component.
 Let's say ABCD was entered and we have indexed part numbers like
 ABCD
 ABCD2000
 ABCD2100
 ABCD2200
 ...

 I would like to have 2 characters suggested always, so for ABCD, it
 should suggest
 ABCD00
 ABCD20
 ABCD21
 ABCD22
 ...

 No smart sorting is needed, just alphabetically sorting. The problem is 
 that for example 00 (or ABCD00) may not be suggested currently as it 
 doesn't score high enough. But we are really trying to get all distinct 
 values starting from the smallest (up to a certain number of suggestions).

 I was looking already at custom comparator class option. But this would 
 probably not work as I would need more information to implement it there 
 (like at least the currently entered search term, ABCD in the example).

 Thanks,
 Alexander


Re: Tuning caching of geofilt queries

2012-08-04 Thread Erick Erickson
I don't think rounding will affect cache hits in either case _unless_
the input point for different queries can be very close to each other.

Think of the filter cache as being composed of a map where the key
is the (raw) filter query and the value is the set of documents in your
corpus that satisfy it.

So the only time rounding would help, is if it's likely that two
users enter very similar points at query time, i.e.
89.1234 and 89.1236. If you're giving them a set of choices
that are pre-defined (city center, say), then the values should be
identical to all the decimal places so rounding doesn't do you much
good.

You say you can tolerate some slop, so using bounding box might
speed up your queries...
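
The map picture above can be made concrete with a toy model. This is a
plain-Python sketch, not Solr's cache implementation: the cache is a dict
keyed by the raw filter string, and the class and function names are
invented. The field name user.location_p and the 50 km distance are taken
from the question quoted below.

```python
def geofilt_fq(lat, lon, d_km=50.0, decimals=None):
    """Build the raw filter string, optionally rounding the point first."""
    if decimals is not None:
        lat, lon = round(lat, decimals), round(lon, decimals)
    return "{!geofilt sfield=user.location_p pt=%s,%s d=%s}" % (lat, lon, d_km)


class FilterCache:
    """Toy filter cache: raw filter string -> set of matching documents."""

    def __init__(self):
        self.entries = {}
        self.hits = 0
        self.misses = 0

    def lookup(self, fq):
        if fq in self.entries:
            self.hits += 1
        else:
            self.misses += 1
            self.entries[fq] = frozenset()  # stand-in for the real doc set
        return self.entries[fq]


def run(points, decimals=None):
    """Send one filter per point through a fresh cache; return (hits, misses)."""
    cache = FilterCache()
    for lat, lon in points:
        cache.lookup(geofilt_fq(lat, lon, decimals=decimals))
    return cache.hits, cache.misses


nearby = [(89.1234, 16.3943), (89.1236, 16.3943)]    # two almost identical inputs
predefined = [(48.2082, 16.3738), (48.2082, 16.3738)]  # same pre-defined choice twice

print(run(nearby), run(nearby, decimals=2))
print(run(predefined), run(predefined, decimals=2))
```

Rounding only turns a miss into a hit for the nearly identical points;
identical pre-defined points already share a key without it.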

Best
Erick

On Fri, Aug 3, 2012 at 4:56 AM, Thomas Heigl tho...@umschalt.com wrote:
 Hey all,

 Our production system is heavily optimized for caching and nearly all parts
 of queries are satisfied by filter caches. The only filter that varies a
 lot from user to user is the location and distance. Currently we use the
 default location field type and index lat/long coordinates as we get them
 from Geonames and GMaps with varying decimal precision.

 My question is: Does it make sense to round these coordinates (a) while
 indexing and/or (b) while querying to optimize cache hits? Our maximum
 required resolution for geo queries is 1km and we can tolerate minor errors
 so I could round to two decimal points for most of our queries.

 E.g. instead of querying like this

 fq=_query_:"{!geofilt sfield=user.location_p pt=48.19815,16.3943
 d=50.0}"&sfield=user.location_p&pt=48.1981,16.394


 we would round to

 fq=_query_:"{!geofilt sfield=user.location_p pt=48.19,16.39
 d=50.0}"&sfield=user.location_p&pt=48.19,16.39


 Any feedback would be greatly appreciated.

 Cheers,

 Thomas


Re: search hit on multivalued fields

2012-08-04 Thread Erick Erickson
What about just using highlighting and displaying both fields?
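
A rough sketch of that idea in Python: search the combined Text field,
request highlighting on F1 and F2, and read back which of the two produced
snippets. The parameters hl and hl.fl are standard Solr highlighting
parameters; the helper names, the unique key value doc1 and the sample
response are assumptions for the illustration, and it presumes F1 and F2
are stored fields.

```python
from urllib.parse import urlencode


def highlight_params(user_query, search_field="Text", source_fields=("F1", "F2")):
    """Request parameters: search one field, highlight the source fields."""
    return {
        "q": "%s:(%s)" % (search_field, user_query),
        "hl": "true",
        "hl.fl": ",".join(source_fields),
        "wt": "json",
    }


def fields_with_hits(highlighting, doc_id):
    """Names of the fields that produced at least one snippet for a document."""
    per_doc = highlighting.get(doc_id, {})
    return sorted(field for field, snippets in per_doc.items() if snippets)


print(urlencode(highlight_params("some value")))

# Hypothetical "highlighting" section of a response, keyed by unique key.
sample = {"doc1": {"F1": ["<em>some</em> <em>value</em>"], "F2": []}}
print(fields_with_hits(sample, "doc1"))
```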

Best
Erick

On Fri, Aug 3, 2012 at 5:51 AM, Mark , N nipen.m...@gmail.com wrote:
 I have a multivalued field Text which is indexed, for example:

 F1: some value
 F2: some value
 Text = (content of F1, F2)

 When a user searches, I check only the Text field, but I also need
 to show users which field (F1 or F2) produced the search hit.
 Is this possible in Solr?


 --
 Thanks,

 *Nipen Mark *


Re: Adding new field before import- using post.jar

2012-08-04 Thread Rajani Maski
They are coming from a text file.

The Solr XML input documents are XML files in a folder. (To import them,
I was using the simple post.jar.) Now, for each XML file, I need to add 3
new external fields whose values are read from a text file.


Regards
Rajani

On Sat, Aug 4, 2012 at 10:59 PM, Jack Krupansky j...@basetechnology.com wrote:

 Where are the values of the three new fields coming from?

 Are they constant/default values?
 Computed from other fields in the XML?
 From other XML files?
 From a text file?
 From a database?
 Or where?

 So, given a specific Solr XML input document, how will you be accessing
 the three field values to add?

 This may guide the approach that you could/should take.


 -- Jack Krupansky

 -Original Message- From: Rajani Maski
 Sent: Friday, August 03, 2012 5:37 AM
 To: solr-user@lucene.apache.org
 Subject: Adding new field before import- using post.jar

 Hi all,

 I have xmls in a folder in the standard solr xml format. I was simply using
 SimplePostTool.java to import these xmls to solr. Now I have to add 3 new
 fields to each document in the xml before doing a post.

 What can be the effective way for doing this?


 Thanks & Regards
 Rajani



Re: Adding new field before import- using post.jar

2012-08-04 Thread Jack Krupansky

Sounds like a perl script would be sufficient.
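
The same kind of script, sketched in Python rather than Perl to show the
shape of it. The layout of the text file is not described in the thread,
so the sketch assumes one tab-separated line per document (document id
followed by the three values); the field names color, size and year, the
unique key name id, and the function names are all invented for the
example. Strings stand in for the files so the sketch runs on its own.

```python
import xml.etree.ElementTree as ET


def parse_extra_fields(text, field_names):
    """Parse lines of 'doc_id<TAB>v1<TAB>v2<TAB>v3' into {doc_id: {name: value}}."""
    extra = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        doc_id, *values = line.split("\t")
        extra[doc_id] = dict(zip(field_names, values))
    return extra


def add_fields(solr_xml, extra, id_field="id"):
    """Append the extra fields to every <doc> whose id appears in the mapping."""
    root = ET.fromstring(solr_xml)
    for doc in root.iter("doc"):
        id_element = doc.find("field[@name='%s']" % id_field)
        if id_element is None:
            continue
        for name, value in extra.get(id_element.text, {}).items():
            new_field = ET.SubElement(doc, "field", {"name": name})
            new_field.text = value
    return ET.tostring(root, encoding="unicode")


solr_xml = (
    '<add><doc><field name="id">1</field>'
    '<field name="title">A</field></doc></add>'
)
extra = parse_extra_fields("1\tred\tlarge\t2012\n", ["color", "size", "year"])
updated = add_fields(solr_xml, extra)
print(updated)
```

The updated XML can then be written back to disk and posted with post.jar
as before.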

-- Jack Krupansky

-Original Message- 
From: Rajani Maski

Sent: Saturday, August 04, 2012 2:23 PM
To: solr-user@lucene.apache.org
Subject: Re: Adding new field before import- using post.jar

They are coming from text file.

SolrXML input documents are xmls in folder location. (To import these xmls,
I was using simple post.jar) Now, for each xml there is need to add 3
external new fields reading values from text file.


Regards
Rajani

On Sat, Aug 4, 2012 at 10:59 PM, Jack Krupansky 
j...@basetechnology.com wrote:



Where are the values of the three new fields coming from?

Are they constant/default values?
Computed from other fields in the XML?
From other XML files?
From a text file?
From a database?
Or where?

So, given a specific Solr XML input document, how will you be accessing
the three field values to add?

This may guide the approach that you could/should take.


-- Jack Krupansky

-Original Message- From: Rajani Maski
Sent: Friday, August 03, 2012 5:37 AM
To: solr-user@lucene.apache.org
Subject: Adding new field before import- using post.jar

Hi all,

I have xmls in a folder in the standard solr xml format. I was simply 
using

SimplePostTool.java to import these xmls to solr. Now I have to add 3 new
fields to each document in the xml before doing a post.

What can be the effective way for doing this?


Thanks & Regards
Rajani





Re: Adding new field before import- using post.jar

2012-08-04 Thread Lance Norskog
For a permanent solution, DataImportHandler and the scripting update
handler are the best choices: they are small files and live inside
Solr.

On Sat, Aug 4, 2012 at 12:02 PM, Jack Krupansky j...@basetechnology.com wrote:
 Sounds like a perl script would be sufficient.


 -- Jack Krupansky

 -Original Message- From: Rajani Maski
 Sent: Saturday, August 04, 2012 2:23 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Adding new field before import- using post.jar


 They are coming from text file.

 SolrXML input documents are xmls in folder location. (To import these xmls,
 I was using simple post.jar) Now, for each xml there is need to add 3
 external new fields reading values from text file.


 Regards
 Rajani

 On Sat, Aug 4, 2012 at 10:59 PM, Jack Krupansky
 j...@basetechnology.com wrote:

 Where are the values of the three new fields coming from?

 Are they constant/default values?
 Computed from other fields in the XML?
 From other XML files?
 From a text file?
 From a database?
 Or where?

 So, given a specific Solr XML input document, how will you be accessing
 the three field values to add?

 This may guide the approach that you could/should take.


 -- Jack Krupansky

 -Original Message- From: Rajani Maski
 Sent: Friday, August 03, 2012 5:37 AM
 To: solr-user@lucene.apache.org
 Subject: Adding new field before import- using post.jar

 Hi all,

 I have xmls in a folder in the standard solr xml format. I was simply
 using
 SimplePostTool.java to import these xmls to solr. Now I have to add 3 new
 fields to each document in the xml before doing a post.

 What can be the effective way for doing this?


 Thanks & Regards
 Rajani





-- 
Lance Norskog
goks...@gmail.com


Re: termFrequncy off and still use fastvector highlighter?

2012-08-04 Thread abhayd
Yes, you are correct. But the problem is that you cannot turn off just the
term frequency; you have to turn off term positions along with it. And once
I do that, phrase searches don't work.

I want termFrequency=false termPositions=true

How would I do that?



