Re: copyField from empty multivalue

2020-08-07 Thread matthew sporleder

Never mind, I think we found that this was caused by a bug in our (new) custom indexer.

On Thu, Aug 6, 2020 at 4:11 PM matthew sporleder  wrote:
>
> I have a copyField:
>  
>  
>
> But sometimes preview (<field … indexed="true" stored="true" multiValued="true" />) is not populated.
>
> It appears that the "catchall" field does not get created when preview
> has no content in it.  Can I use required=false or similar on a
> copyField?
>
> Thanks,
> Matt

Re: copyField - why source should contain * when dest contains *?

2019-10-24 Thread Paras Lehana

Hey Community,

I think I have found the answer to my query.

This statement about copyField:

> The copyField command can use a wildcard (*) character in the dest
> parameter only if the source parameter contains one as well. copyField uses
> the matching glob from the source field for the dest field name into which
> the source content is copied.

So, the glob here means the part of the field name matched by the * in a
pattern such as something_*. copyField doesn't support chaining and,
similarly, a single wildcard directive does NOT copy one source field into
multiple destinations. The whole point of supporting a wildcard in dest
when one is also present in source is to make a one-to-one mapping from
the matching glob. For example, consider having this config:




and these fields in the schema:

  
>   
>   


>   
>   
>   


And if you index some information into title_en, it will be copied into
text_en ONLY. Note the one-to-one mapping here:

    title_fr  to  text_fr
    title_tr  to  text_tr

Note that the content of title_en will NOT be copied into text_fr.
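
A minimal sketch of such a setup. Only the title_* to text_* mapping and the
language suffixes come from the explanation; the text_general type, the
attributes, and declaring both sides as dynamicFields are assumptions made
for illustration:

```xml
<!-- Assumed declarations: one pattern for the sources, one for the destinations -->
<dynamicField name="title_*" type="text_general" indexed="true" stored="true"/>
<dynamicField name="text_*"  type="text_general" indexed="true" stored="false"/>

<!-- Whatever * matches in source is substituted for * in dest:
     title_en copies to text_en, title_fr to text_fr, title_tr to text_tr -->
<copyField source="title_*" dest="text_*"/>
```

With this, a document that only has title_en should end up with text_en and
nothing in text_fr or text_tr.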

I guess this is what Erick and Chris were trying to make me understand. I
have tried my best to explain it for anyone else who is unsure. Thank you,
everyone! :)



-- 
-- 
Regards,

*Paras Lehana* [65871]
Development Engineer, Auto-Suggest,
IndiaMART Intermesh Ltd.

8th Floor, Tower A, Advant-Navis Business Park, Sector 142,
Noida, UP, IN - 201303

Mob.: +91-9560911996
Work: 01203916600 | Extn:  *8173*

-- 
IMPORTANT: 
NEVER share your IndiaMART OTP/ Password with anyone.

Re: copyField - why source should contain * when dest contains *?

2019-10-24 Thread Paras Lehana

Hey Chris,

Awesome explanation.

> ...then solr has no idea what full field name to use as the destination
> when it sees values in a field "foo" ... should it be "1_bar" ?
> "aaa_bar" ? ... "z_bar" ? all three?


But how does Solr know what full field name to use as the
destination when we provide a wildcard in the source as well? It seems I'm
missing something.


> but using a wildcard in the dest only works with a one-to-one mapping


So, I think, this restriction could be more related to the source code flow
instead of a logical reason. I'll try to understand the code about this.

I was actually curious if there's any logical restriction that I had been
missing.

Many thanks. :)


Re: copyField - why source should contain * when dest contains *?

2019-10-23 Thread Chris Hostetter



: Documentation says that we can copy multiple fields using wildcard to one
: or more than one fields.

correct ... the limitation is in the syntax and the ambiguity that would 
be unresolvable if you had a wildcard in the dest but not in the source.  

the wildcard is essentially a variable.  if you have...

   source="foo" dest="*_bar"

...then solr has no idea what full field name to use as the destination 
when it sees values in a field "foo" ... should it be "1_bar" ? 
"aaa_bar" ? ... "z_bar" ? all three?

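To make the ambiguity concrete, here is a sketch of the two cases side by
side. The foo/bar names are illustrative, and it assumes that matching
fields or dynamicFields are declared elsewhere in the schema; the second
directive is the one Solr is expected to reject:

```xml
<!-- Unambiguous: the text matched by * in source is substituted into dest,
     so foo_en copies to bar_en, foo_fr copies to bar_fr, and so on -->
<copyField source="foo_*" dest="bar_*"/>

<!-- Ambiguous: nothing in the name "foo" says what * should stand for in dest -->
<copyField source="foo" dest="*_bar"/>
```
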
: Yes, that's what hit me initially. But, "*_x" while indexing (in XMLs)
: doesn't mean anything, right? It's only used in dynamicFields while
: defining schema to let Solr know that we would have some undeclared fields

use of wildcards in copyField is not constrained to only 
using dynamicFields, this would be a perfectly valid copyField using 
wildcards, even if these are the only fields in the schema, and it had 
no dynamicFields at all...

  
  
  

  
  


  

: having names like this. Also, according to the documentation, we can have
: dest="*_x" when source="*_x" if I'm right. In this case, there's support
: for multiple destinations when there are multiple source.

correct.  there is support for copying from one field to another 
via a *MAPPING* -- so a single copyField declaration can go from multiple 
sources to multiple destinations, but using a wildcard in the dest
only works with a one-to-one mapping when the wildcard also exists in the 
source.

on the flip side however, you can have a many-to-one mapping by using a 
wildcard *only* in the source
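
A minimal sketch of that many-to-one case. The *_t pattern, the text
catch-all field and the text_general type are hypothetical names chosen for
illustration:

```xml
<!-- Any field whose name ends in _t is a candidate source -->
<dynamicField name="*_t" type="text_general" indexed="true" stored="true"/>

<!-- One explicit destination; multiValued because several sources feed it -->
<field name="text" type="text_general" indexed="true" stored="false" multiValued="true"/>

<!-- Wildcard only in source: title_t, body_t, ... are all copied into text -->
<copyField source="*_t" dest="text"/>
```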

  
  
  

  

  



-Hoss
http://www.lucidworks.com/

Re: copyField - why source should contain * when dest contains *?

2019-10-23 Thread Paras Lehana

Hey Erick,

Thanks for addressing.

> Copyfields are intended to copy exactly one field in the input into exactly
> one field in the destination, not multiple ones at the same time.

The documentation says that we can use a wildcard to copy multiple fields
into one or more fields.

> Remember that Solr is also dealing with dynamic fields. In this case, what
> does “*_x” mean?

Yes, that's what hit me initially. But "*_x" doesn't mean anything while
indexing (in XMLs), right? It's only used in dynamicFields while
defining the schema, to let Solr know that we will have some undeclared
fields with names like this. Also, according to the documentation, we can
have dest="*_x" when source="*_x", if I'm right. In this case, there's
support for multiple destinations when there are multiple sources.

> Or is this mostly curiosity?

I'm just curious what exactly prevents multiple destinations with a single
source.

> And what use-case do you want to solve?

Yes, it does not seem too practical. Maybe the impossibility of chaining
copyFields is the reason here. I'm just curious about the implementation -
there should be a catch.

Anyways, thanks for replying, Erick. :)


Re: copyField - why source should contain * when dest contains *?

2019-10-23 Thread Erick Erickson

So how would that work? Copyfields are intended to copy exactly one field in 
the input into exactly one field in the destination, not multiple ones at the 
same time. If you need to do that, define multiple copyField directives.

I don’t even see how that would work with dest=“*_x”. 
Remember that Solr is also dealing with dynamic fields. In this case, what does 
“*_x” mean? Create N new fields?

And what use-case do you want to solve? Or is this mostly curiosity?

Best,
Erick

> On Oct 23, 2019, at 7:55 AM, Paras Lehana  wrote:
> 
> Can't we have one source field
> information that is copied into different fields

Re: copyfield not working

2019-01-14 Thread Jay Potharaju

 thanks for the info Andrea!
Thanks
Jay




Re: copyfield not working

2019-01-13 Thread Jay Potharaju

copyfield syntax from my schema file...

Thanks
Jay




Re: copyfield not working

2019-01-13 Thread Andrea Gazzarini

Hi Jay, the text analysis always operates on the indexed content. The 
stored content of a field is left untouched unless you do something 
before it gets indexed (e.g. on the client side or with an 
UpdateRequestProcessor).
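
As a rough sketch of the UpdateRequestProcessor route (in solrconfig.xml):
clone the source into the target field, then rewrite the cloned value before
it is stored. The chain name, the text_clean field and the shortened
character class are hypothetical; check the processor parameters against the
Javadoc of your Solr version:

```xml
<updateRequestProcessorChain name="clean-copy">
  <!-- Copy text_en into the (hypothetical) text_clean field -->
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">text_en</str>
    <str name="dest">text_clean</str>
  </processor>
  <!-- Replace each listed special character in the copy with a space;
       this changes the stored value, unlike an analyzer charFilter -->
  <processor class="solr.RegexReplaceProcessorFactory">
    <str name="fieldName">text_clean</str>
    <str name="pattern">[!#$%^*]</str>
    <str name="replacement"> </str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

The chain only runs when it is selected, for example with the update.chain
request parameter or by marking it as the default chain.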


Cheers,
Andrea

On 14/01/2019 08:46, Jay Potharaju wrote:

> Hi,
> I have a copyField that copies the contents of a text_en field to
> another custom field.
> After indexing, I was expecting the special characters in the
> paragraph to be removed, but that does not appear to be happening. The
> copied content is the same as what is in the source. I ran analysis,
> and it looks like the pattern matching works as expected and the special
> characters are removed.
>
> Any suggestions?
>
> <fieldType …>
>   …
>   <charFilter class="solr.PatternReplaceCharFilterFactory" pattern=
>   "['!#\$%'\(\)\*+,-\./:;=?@\[\]\^_`{|}~!@#$%^*]" />
>   <tokenizer class="solr.StandardTokenizerFactory"/>
>   <filter class="solr.SuggestStopFilterFactory" ignoreCase="true" words=
>   "lang/stopwords_en.txt" />
>   …
>   <filter class="solr.EnglishPossessiveFilterFactory"/>
>   <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
>   …
> </fieldType>
>
> Thanks
> Jay

Re: copyField match, but how?

2017-03-03 Thread nbosecker

You're on the money, Chris. Thank you so much, I didn't even realize
"body" wasn't stored. Of course that is the reason!!





--
View this message in context: 
http://lucene.472066.n3.nabble.com/copyField-match-but-how-tp4323327p4323335.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: copyField match, but how?

2017-03-03 Thread Chris Hostetter


: In my schema.xml, I have these copyFields:

you haven't shown us the field/fieldType definitions for any of those 
fields, so it's possible "simplex" was included in a field that is 
indexed=true but stored=false -- which is why you might be able to 
search on it, but not see it in the fields returned in a search. 

Wild guess...

: 

...perhaps your "body" field is stored=false, but contains "simplex" and 
was copied into alltext.
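
A sketch of a schema that would behave this way. Only the body and alltext
names come from the thread; the text_general type and all attributes are
assumptions:

```xml
<!-- Searchable but never returned in results: indexed, not stored -->
<field name="body" type="text_general" indexed="true" stored="false"/>

<!-- Catch-all search field fed by copyField -->
<field name="alltext" type="text_general" indexed="true" stored="false" multiValued="true"/>

<!-- Every term in body, "simplex" included, becomes searchable via alltext -->
<copyField source="body" dest="alltext"/>
```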



-Hoss
http://www.lucidworks.com/

Re: copyField match, but how?

2017-03-03 Thread Alexandre Rafalovitch

I think you are not using default field, but rather eDismax field
definitions. Still you seem to be matching on alltext anyway.

What's the field definition? Did you check the index content with Maple or
with Admin Schema field content?

Regards,
   Alex


On 3 Mar 2017 5:07 PM, "nbosecker"  wrote:

I've got a confusing situation related to copyFields and search.

In my schema.xml, I have these copyFields:












and a defaultSearchField to the 'alltext' copyField:
<defaultSearchField>alltext</defaultSearchField>


In my index there is this document with all these mapped fields - nothing to
note except that the word "simplex" is *NOT IN ANY OF THESE*:
"path": "Components/Analysis and Statistics/R Statistics/Experimental Design/Design Mixture Experiment",
"folder": "Components/Analysis and Statistics/R Statistics/Experimental Design",
"name": "Design Mixture Experiment",
"hostapplication": "Pro Client",
"purpose": "Designs a mixture (formulation) experiment using Pipeline Pilot or R (DOE)",
"parameters": [
  "Design Type",
  "Ingredient Sum",
  "Number of Levels",
  "Centroid Dimension",
  "Ignore Properties",
  "Ingredient 1 Min Level",
  "Ingredient 1 Max Level",
  "Ingredient 2 Min Level",
  "Ingredient 2 Max Level",
  "Ingredient 3 Min Level",
  "Ingredient 3 Max Level",
  "Ingredient 4 Min Level",
  "Ingredient 4 Max Level",
  "Ingredient 5 Min Level",
  "Ingredient 5 Max Level",
  "Ingredient 6 Min Level",
  "Ingredient 6 Max Level",
  "Factors",
  "Ingredient 1",
  "Ingredient 2",
  "Ingredient 3",
  "Ingredient 4",
  "Ingredient 5",
  "Ingredient 6",
  "Constraints",
  "Constraint 1",
  "Constraint 2",
  "Filter Points",
  "Fill",
  "Factors from Input Data"
],
"components": [
  "Custom Filter (PilotScript)",
  "Custom Manipulator (PilotScript)",
  "Custom Manipulator (PilotScript)",
  "Custom Manipulator (PilotScript)",
  "Unmerge Data",
  "Custom Filter (PilotScript)",
  "Custom Filter (PilotScript)",
  "Custom Filter (PilotScript)",
  "Custom Manipulator (PilotScript)",
  "R Custom Script",
  "Custom Manipulator (PilotScript)"
],
"domain": [
  "Statistics"
],
"author": "BIOVIA"
  }

When I perform a debug search on "*simplex*" in the Solr Admin, it finds
this document as a match to the alltext copyField. *BUT HOW?*:
"debug": {
"rawquerystring": "simplex",
"querystring": "simplex",
"parsedquery": "(+DisjunctionMaxQuery((name:simplex^10.0 |
folder:simplex^5.0 | purpose:simplex^3.0 | alltext:simplex)~0.5)
())/no_coord",
"parsedquery_toString": "+(name:simplex^10.0 | folder:simplex^5.0 |
purpose:simplex^3.0 | alltext:simplex)~0.5 ()",
"explain": {
  "Components/Analysis and Statistics/R Statistics/Experimental
Design/Design Mixture Experiment": "\n0.039487615 = (MATCH) sum of:\n
0.039487615 = (MATCH) max plus 0.5 times others of:\n0.039487615 =
(MATCH) weight(alltext:simplex in 2191) [DefaultSimilarity], result of:\n
0.039487615 = score(doc=2191,freq=24.0 = termFreq=24.0\n), product of:\n
0.07139119 = queryWeight, product of:\n  7.225878 = idf(docFreq=11,
maxDocs=6068)\n  0.009879933 = queryNorm\n0.5531161 =
fieldWeight in 2191, product of:\n  4.8989797 = tf(freq=24.0), with
freq of:\n24.0 = termFreq=24.0\n  7.225878 =
idf(docFreq=11, maxDocs=6068)\n  0.015625 = fieldNorm(doc=2191)\n"
},
"QParser": "ExtendedDismaxQParser",



I'm probably missing something obvious, help!




Re: copyField

2015-11-04 Thread Chris Hostetter


: 1) Give my need, am I losing anything by writing my own copy-field in my
: Java code vs. using Solr's copyField in the schema?

nope.

: 2) How do I prevent a case where when I copy data from field A and B where
: A has "Fable of the Throbbing" and B has "Genius of a Tank Town" which get
: copied into group-X as "Fable of the Throbbing Genius of a Tank Town".
: When this happens, a phrase search for "Throbbing Genius" will get me a hit
: (when in reality, it shouldn't).  If I was using copyField, wouldn't this
: problem still exists?

Each discrete field value is kept discrete at all levels -- so adding two 
String values "Foo Bar" and "Yik Yak" (either via solrj, or via 
copyField, or via something like the CloneFieldUpdateProcessor) does *not* 
just result in one String value of "Foo Bar Yik Yak" -- instead it results 
in a (multivalued) field containing two string values (in the order 
specified)

If/when you search on a multivalued *text* field, phrase queries can in 
fact result in matches across multiple values, so a search for "Bar Yik" 
(or "Throbbing Genius") may result in a match  -- depending on what the 
*positions* are of the terms in those field values, and what "slop" factor 
is specified in your search.

By default, the tokens resulting from the analysis of field values have 
sequential positionIncrements -- so "Throbbing" would have position 3, and 
"Genius" would have position 4 -- but the "positionIncrementGap" option 
can be specified on your fieldType to indicate how much of a "gap" you 
want to place in the positionIncrement for the first token produced by 
each subsequent value in a multivalued field.

So, if you had a positionIncrementGap="100" for a fieldtype, you would 
need a slop value on your phrase query of at least 100 to get any matches 
from tokens that originated in multiple source values.
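
A minimal sketch of that setup, reusing the A, B and group-X names from the
question (written here as group_x). The text_general type name and the
analyzer chain are assumptions:

```xml
<!-- The gap of 100 is inserted between successive values of a multivalued field -->
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="A" type="text_general" indexed="true" stored="true"/>
<field name="B" type="text_general" indexed="true" stored="true"/>
<field name="group_x" type="text_general" indexed="true" stored="false" multiValued="true"/>

<!-- Each source value stays a separate value inside group_x -->
<copyField source="A" dest="group_x"/>
<copyField source="B" dest="group_x"/>
```

With this gap, a query such as group_x:"Throbbing Genius" should not match
across the two values, while a sloppy phrase such as
group_x:"Throbbing Genius"~100 still could.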

https://cwiki.apache.org/confluence/display/solr/Field+Type+Definitions+and+Properties

https://cwiki.apache.org/confluence/display/solr/The+Standard+Query+Parser
(scroll to the description of "Proximity Searches")


-Hoss
http://www.lucidworks.com/

Re: copyField based on value of another field

2015-06-23 Thread Alessandro Benedetti

You should work at the UpdateProcessor level :

https://wiki.apache.org/solr/UpdateRequestProcessor#Implementing_a_conditional_copyField

This should give you some hint.

Cheers

2015-06-23 13:45 GMT+01:00 Alistair Young alistair.yo...@uhi.ac.uk:

 Hi folks,

 is it possible to copyField only if another field has a certain value? e.g.

 copyField 'dc.subject' to 'image_suggestions' only if rdf
 http://www.nsdl.org/ontologies/relationships#isInImageBank is true

 thanks,

 Alistair

 --
 mov eax,1
 mov ebx,0
 int 80h





Re: CopyField exclude patterns

2015-02-03 Thread danny teichthal

Alexandre and Jack, thanks for the replies.
Looking at both, I think that the CloneFieldUpdateProcessor can do what I
need without my having to implement a custom one.
By the way, is there a performance penalty for using an update processor
compared to copyField?




Re: CopyField exclude patterns

2015-02-02 Thread Jack Krupansky

Sorry, that feature is not available in Solr at this time. You could
implement an update processor which copied only the desired input field
values. This can be done in JavaScript using the script update processor.

-- Jack Krupansky


Re: CopyField exclude patterns

2015-02-02 Thread Alexandre Rafalovitch

Not with copyField.

You can use an UpdateRequestProcessor instead (
http://www.solr-start.com/javadoc/solr-lucene/org/apache/solr/update/processor/CloneFieldUpdateProcessorFactory.html
).

This lets you specify both inclusion and exclusion patterns.
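
A sketch of what that could look like in solrconfig.xml for the prefix_ case
from the question. The chain name is hypothetical, and note that this
processor takes regular expressions rather than copyField-style globs:

```xml
<updateRequestProcessorChain name="copy-with-exclude">
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <lst name="source">
      <!-- Include every field starting with prefix_ ... -->
      <str name="fieldRegex">prefix_.*</str>
      <!-- ... except those starting with prefix_abc_ -->
      <lst name="exclude">
        <str name="fieldRegex">prefix_abc_.*</str>
      </lst>
    </lst>
    <str name="dest">destination</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

The chain has to be selected for updates, for example with the update.chain
request parameter.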

Regards,
   Alex.

Sign up for my Solr resources newsletter at http://www.solr-start.com/


On 2 February 2015 at 02:53, danny teichthal dannyt...@gmail.com wrote:
> Hi,
> Is there a way to exclude some patterns from the source of a
> copyField?
>
> We are using globs to copy all our text fields to some target field.
> It looks something like this:
> <copyField source="prefix_*" dest="destination" />
>
> I would like a subset of the fields starting with prefix_ to be excluded
> and not copied to destination (e.g. all fields matching prefix_abc_*).
> Is there a way to do it in Solr?
>
> I couldn't find anything saying that it exists.
>
> Thanks

RE: CopyField from text to multi value

2014-10-20 Thread Tomer Levi

Thanks Walter!

-Original Message-
From: Walter Underwood [mailto:wun...@wunderwood.org] 
Sent: Monday, October 20, 2014 12:09 AM
To: solr-user@lucene.apache.org
Subject: Re: CopyField from text to multi value

I think that info is available with termvectors. That should give a list of the 
query terms that matched each document, if I understand it correctly.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/

On Oct 19, 2014, at 7:37 AM, Tomer Levi tomer.l...@nice.com wrote:

 Thanks again for the help.

 The use case is this.

 In my UI I would like to indicate which words led to each document in the 
 response.

 It actually seems like a simple highlighting case, but instead of getting the 
 highlight result as "this is a <br>long</br> string <br>with</br> text",

 our UI team wants a list of words, i.e. [long, with].

 So, I assumed that I could just tokenize the original text, copy the tokens 
 into a new multi-value field, and ask Solr to highlight the multi-value field.

 That is my use case.

 Thanks again

 Tomer

 -Original Message-
 From: Erick Erickson [mailto:erickerick...@gmail.com]
 Sent: Sunday, October 19, 2014 5:18 PM
 To: solr-user@lucene.apache.org
 Subject: Re: CopyField from text to multi value

 This really feels like an  XY problem, which I think Jack is alluding to.

 bq:  I understand that the analysis chain is applied after the raw input was 
 copied.

 I need to store the output of the analysis chain as a new multi-value field

 This statement is really confusing. You can't have the output of the analysis 
 chain used as input to a copyField, it just doesn't work that way which is 
 what you seem to want to do with the second sentence. Then you bring shingles 
 into the picture...

 So let's take Jack's suggestion and  back up and tell us what the use-case 
 you're trying to support is rather than leaving us to guess what problem 
 you're trying to solve..

 Best,

 Erick

 On Sun, Oct 19, 2014 at 9:43 AM, Jack Krupansky 
 j...@basetechnology.com wrote:

 As always, you need to first examine how you intend to query the fields 
 before you dive into data modeling. In this case, is there any particular 
 reason that you need the individual terms as separate values, as opposed to 
 simply using a tokenized text field?

 -- Jack Krupansky

 From: Tomer Levi

 Sent: Sunday, October 19, 2014 9:07 AM

 To: solr-user@lucene.apache.org

 Subject: CopyField from text to multi value

 Hi,

 I would like to copy the content of a text field into a multi-valued field.

 For example,

 Let's say my field text contains: I am a solr user

 I would like to have a multi-value copyFields with the following

 content: [I, am, a, solr, user]

 Thanks,

  Tomer Levi

  Software Engineer

  Big Data Group

  Product  Technology Unit

  (T) +972 (9) 775-2693

  tomer.l...@nice.com

  www.nice.com

Re: CopyField from text to multi value

2014-10-19 Thread Erick Erickson

Not quite sure what you're asking here. If you do a copyField, the raw
input is, well, copied to the destination field and _then_ the analysis
chain is applied. Which seems to be what you want, the destination field
would be a text-based field, perhaps text_general or some such from the
distro.

And perhaps there's some confusion about what multiValued means here. It
does _not_ mean tokenized, i.e. broken up into words. non-multiValued
fields can be tokenized.

multiValued means that more than one entry for the field can be in a doc.
I.e. (using the XML form of an input doc as an example)

<add>
  <doc>
    <field name="multi">some text</field>
    <field name="multi">and now for something completely different</field>
  </doc>
</add>

will succeed with a field defined as multiValued="true", but fail with
something with multiValued="false".

In either case, though, whether the input was broken up into multiple,
independently-searchable tokens (words) is orthogonal to whether it's
multiValued or not, and is entirely dependent on the analysis chain in the
fieldType for the field in question.
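
To illustrate that the two properties are independent, here is a small
schema sketch. The field names are hypothetical, and text_general and string
are assumed to be the usual types from the example schema:

```xml
<!-- Single-valued but tokenized: one entry per doc, searchable word by word -->
<field name="summary" type="text_general" indexed="true" stored="true" multiValued="false"/>

<!-- Multi-valued but not tokenized: many entries per doc, each matched as a whole -->
<field name="tags" type="string" indexed="true" stored="true" multiValued="true"/>

<!-- Multi-valued and tokenized: many entries, each broken into words -->
<field name="multi" type="text_general" indexed="true" stored="true" multiValued="true"/>
```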

Best,
Erick


RE: CopyField from text to multi value

2014-10-19 Thread Tomer Levi


Hi Erick,
Thanks for the explanation, I understand that the analysis chain is applied 
after the raw input was copied.
I need to store the output of the analysis chain as a new multi-value field, 
and I think that ShingleFilterFactory might do that, isn’t it?

Tomer

-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Sunday, October 19, 2014 4:31 PM
To: solr-user@lucene.apache.org
Subject: Re: CopyField from text to multi value

Not quite sure what you're asking here. If you do a copyField, the raw input 
is, well, copied to the destination field and _then_ the analysis chain is 
applied. Which seems to be what you want, the destination field would be a 
text-based field, perhaps text_general or some such from the distro.

And perhaps there's some confusion about what multiValued means here. It does 
_not_ mean tokenized, i.e. broken up into words. Non-multiValued fields can 
be tokenized.

multiValued means that more than one entry for the field can be in a doc.
I.e. (using the XML form of an input doc as an example)

<add>
  <doc>
    <field name="multi">some text</field>
    <field name="multi">and now for something completely different</field>
  </doc>
</add>

will succeed with a field defined as multiValued=true, but fail with 
something with multiValued=false.

In either case, though, whether the input was broken up into multiple, 
independently-searchable tokens (words) is orthogonal to whether it's 
multiValued or not, and is entirely dependent on the analysis chain in the 
fieldType for the field in question.

Best,
Erick


Re: CopyField from text to multi value

2014-10-19 Thread Jack Krupansky

As always, you need to first examine how you intend to query the fields before 
you dive into data modeling. In this case, is there any particular reason that 
you need the individual terms as separate values, as opposed to simply using a 
tokenized text field?

-- Jack Krupansky


Re: CopyField from text to multi value

2014-10-19 Thread Erick Erickson

This really feels like an XY problem, which I think Jack is alluding to.

bq: I understand that the analysis chain is applied after the raw
input was copied.
I need to store the output of the analysis chain as a new multi-value field

This statement is really confusing. You can't use the output of the analysis
chain as input to a copyField; it just doesn't work that way, which is what
you seem to want to do with the second sentence. Then you bring shingles
into the picture...

So let's take Jack's suggestion and back up: tell us what use case
you're trying to support rather than leaving us to guess what problem
you're trying to solve.

Best,
Erick



RE: CopyField from text to multi value

2014-10-19 Thread Tomer Levi

Thanks again for the help.



The use case is this.

In my UI I would like to indicate which words led to each document being in 
the response.

It actually seems like a simple highlighting case, but instead of getting the 
highlight result as "this is a <br>long</br> string <br>with</br> text", 
our UI team wants a list of words, i.e. [long, with].

So I assumed that I could just tokenize the original text -> copy the tokens 
into a new multi-value field -> ask Solr to highlight the multi-value field.



That is my use case.

Thanks again

Tomer
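
One possible alternative, sketched here under assumptions rather than taken from the thread, is to keep the ordinary highlighter and pull the marked words out of the returned snippets on the client side. The function name `highlighted_terms` is hypothetical, and the markers are assumed to be `<em>` and `</em>`; pass whatever `hl.simple.pre` and `hl.simple.post` are actually set to.

```python
import re

def highlighted_terms(snippets, pre="<em>", post="</em>"):
    """Return the distinct highlighted words, in order of first appearance."""
    # Non-greedy match between the (escaped) pre and post markers.
    pattern = re.compile(re.escape(pre) + r"(.*?)" + re.escape(post), re.DOTALL)
    seen = []
    for snippet in snippets:
        for term in pattern.findall(snippet):
            if term not in seen:
                seen.append(term)
    return seen

# Assumed shape: a list of snippet strings for one document and one field.
snippets = ["this is a <em>long</em> string <em>with</em> text"]
print(highlighted_terms(snippets))
```

This keeps the schema unchanged: no extra multi-value field is needed just to produce the word list.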






Re: CopyField from text to multi value

2014-10-19 Thread Walter Underwood

I think that info is available with termvectors. That should give a list of the 
query terms that matched each document, if I understand it correctly.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/



Re: copyfield with wildcard-source?

2014-09-22 Thread Alexandre Rafalovitch

On 22 September 2014 01:04, Clemens Wyss DEV clemens...@mysign.ch wrote:
 All I have at hand is Solr in Action which doesn't (didn't) mention the 
 copyField-wildcards...

Well, unless your implementation is also fully theoretical, you also
have all the various examples in the Solr distribution. They
demonstrate many of the features.

Regards,
Alex

Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853

Re: copyfield with wildcard-source?

2014-09-21 Thread Alexandre Rafalovitch

It is copyField, not copyfield: Solr is case sensitive, yet it does not
complain when it sees wrong or misspelt directives (this is being fixed slowly).

https://cwiki.apache.org/confluence/display/solr/Copying+Fields

Regards,
   Alex.

Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853


On 21 September 2014 05:47, Clemens Wyss DEV clemens...@mysign.ch wrote:
 is there a way to use copyfield with a wildcard-source?
 For example to copy all fields of a certain dynamic field type:
 <dynamicField name="*ss" .../>
 <dynamicField name="*ss_suggest" .../>
  ...
 <copyfield source="*_ss_suggest" dest="suggest">
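
As a rough illustration of which field names a wildcard source such as `*_ss_suggest` would pick up, here is a small Python sketch. It uses `fnmatch`, which is only an approximation of Solr's glob handling (it also accepts `?` and `[...]`, which are not assumed to work in Solr), and the field names are made up for the example.

```python
from fnmatch import fnmatchcase

def matching_sources(pattern, field_names):
    """Return the field names that a glob-style source pattern would match (case sensitive)."""
    return [name for name in field_names if fnmatchcase(name, pattern)]

# Hypothetical field names, for illustration only.
fields = ["title_ss", "title_ss_suggest", "author_ss_suggest", "suggest"]
print(matching_sources("*_ss_suggest", fields))
```

Matching is case sensitive here, in the same spirit as the directive name itself.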

Re: copyfield with wildcard-source?

2014-09-21 Thread Erick Erickson

What have you tried? Because it works just fine for me.

It's _really_ helpful to tell us what you've tried and what
you think isn't operating correctly, otherwise
there's not much to go on except guesswork.

Best,
Erick


Re: CopyField Wildcard Exception possible?

2014-08-30 Thread O. Olson

Thank you Ahmet. I am not familiar with using the ScriptUpdateProcessor, but
I will look into it. I am also not sure how much this would hurt import
performance.
O. O.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/CopyField-Wildcard-Exception-possible-tp4155686p4156001.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: CopyField Wildcard Exception possible?

2014-08-29 Thread O. Olson

Thank you Joe. I am not familiar with creating a JIRA ticket. I was however
hoping that there might be a solution to this. If there is none, then I
would consider explicitly specifying the fields.
O. O.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/CopyField-Wildcard-Exception-possible-tp4155686p4155838.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: CopyField Wildcard Exception possible?

2014-08-29 Thread Ahmet Arslan

How about using ScriptUpdateProcessor?




Re: CopyField Wildcard Exception possible?

2014-08-28 Thread Joe Gresock

We would enjoy this feature as well, if you'd like to create a JIRA ticket.


On Thu, Aug 28, 2014 at 4:21 PM, O. Olson olson_...@yahoo.it wrote:

 I have hundreds of fields of the form in my schema.xml:

  <field name="F10434" type="string" indexed="true" stored="true"
 multiValued="true"/>
  <field name="B20215" type="string" indexed="true" stored="true"
 multiValued="true"/>
   .

 I also have a field 'text' that is set as the Default Search Field

 <field name="text" type="text" indexed="true" stored="false"
 multiValued="true"/>

 I populate this 'text' field using CopyField as:

 <copyField source="*" dest="text"/>

 This '*' worked so far. However, I now want to exclude some of the fields
 from this, i.e. I would like 'text' to contain everything (hundreds of
 fields) except a few. Is there any way to do this?

 One way would be to list the fields explicitly instead of using '*', e.g.

 <copyField source="F10434" dest="text"/>
 <copyField source="B20215" dest="text"/>

 and leave out of this list the ones I do not want. Is there an
 alternative to this? (I would like an alternative because writing out all
 these copyFields would be long and tedious.)


 Thank you
 O. O.
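
If the explicit list turns out to be the only option, it does not have to be typed by hand. The following Python sketch generates the copyField lines from a list of field names and an exclusion set. It is only an illustration: `copy_field_lines` is a hypothetical helper, and `F99999` is an invented field name standing in for one to be excluded.

```python
from xml.sax.saxutils import quoteattr

def copy_field_lines(field_names, excluded, dest="text"):
    """Build one copyField line per field, skipping excluded fields and the destination itself."""
    lines = []
    for name in field_names:
        if name in excluded or name == dest:
            continue
        lines.append("<copyField source=%s dest=%s/>" % (quoteattr(name), quoteattr(dest)))
    return lines

# F10434 and B20215 are from the message above; F99999 is a made-up excluded field.
fields = ["F10434", "B20215", "F99999", "text"]
for line in copy_field_lines(fields, excluded={"F99999"}):
    print(line)
```

The generated lines can then be pasted into schema.xml in place of the single wildcard directive.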




 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/CopyField-Wildcard-Exception-possible-tp4155686.html
 Sent from the Solr - User mailing list archive at Nabble.com.




-- 
I know what it is to be in need, and I know what it is to have plenty.  I
have learned the secret of being content in any and every situation,
whether well fed or hungry, whether living in plenty or in want.  I can do
all this through him who gives me strength.*-Philippians 4:12-13*

Re: CopyField can't copy analyzers and Filters

2014-07-01 Thread benjelloun

Hello,

Here is my configuration, which doesn't work:

schema:
<field name="AllChamp" type="text_general" multiValued="true"
indexed="true"
required="false" stored="false"/>

<dynamicField name="*_en" type="text_en" indexed="true" stored="true"
required="false" multiValued="true"/>

<dynamicField name="*_fr" type="text_fr" indexed="true" stored="true"
 required="false" multiValued="true"/>

<dynamicField name="*_ar" type="text_ar" indexed="true" stored="true"
 required="false" multiValued="true"/>

config:

<requestHandler name="/browse" class="solr.SearchHandler">
 <lst name="defaults">
   <str name="echoParams">explicit</str>

   <!-- VelocityResponseWriter settings -->
   <str name="wt">velocity</str>
   <str name="v.template">browse</str>
   <str name="v.layout">layout</str>
   <str name="title">Solritas</str>

   <!-- Query settings -->
   <str name="defType">edismax</str>
   <str name="qf">
  *_ar^2 *_fr^3 *_en^2.2
   </str>
   <str name="df">AllChamp</str>
   <str name="mm">100%</str>
   <str name="q.alt">*:*</str>
   <str name="rows">10</str>
   <str name="fl">*,score</str>

   <str name="mlt.qf">
 text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
 title^10.0 description^5.0 keywords^5.0 author^2.0 resourcename^1.0
   </str>
   <str name="mlt.fl">text,features,name,sku,id,manu,cat,title,description,keywords,author,resourcename</str>
   <int name="mlt.count">3</int>

   <!-- Faceting defaults -->
   <str name="facet">on</str>
   <str name="facet.field">cat</str>
   <str name="facet.field">manu_exact</str>
   <str name="facet.field">content_type</str>
   <str name="facet.field">author_s</str>
   <str name="facet.query">ipod</str>
   <str name="facet.query">GB</str>
   <str name="facet.mincount">1</str>
   <str name="facet.pivot">cat,inStock</str>
   <str name="facet.range.other">after</str>
   <str name="facet.range">price</str>
   <int name="f.price.facet.range.start">0</int>
   <int name="f.price.facet.range.end">600</int>
   <int name="f.price.facet.range.gap">50</int>
   <str name="facet.range">popularity</str>
   <int name="f.popularity.facet.range.start">0</int>
   <int name="f.popularity.facet.range.end">10</int>
   <int name="f.popularity.facet.range.gap">3</int>
   <str name="facet.range">manufacturedate_dt</str>
   <str name="f.manufacturedate_dt.facet.range.start">NOW/YEAR-10YEARS</str>
   <str name="f.manufacturedate_dt.facet.range.end">NOW</str>
   <str name="f.manufacturedate_dt.facet.range.gap">+1YEAR</str>
   <str name="f.manufacturedate_dt.facet.range.other">before</str>
   <str name="f.manufacturedate_dt.facet.range.other">after</str>

   <!-- Highlighting defaults -->
   <str name="hl">on</str>
   <str name="hl.fl">content features title name</str>
   <str name="hl.encoder">html</str>
   <str name="hl.simple.pre">&lt;b&gt;</str>
   <str name="hl.simple.post">&lt;/b&gt;</str>
   <str name="f.title.hl.fragsize">0</str>
   <str name="f.title.hl.alternateField">title</str>
   <str name="f.name.hl.fragsize">0</str>
   <str name="f.name.hl.alternateField">name</str>
   <str name="f.content.hl.snippets">3</str>
   <str name="f.content.hl.fragsize">200</str>
   <str name="f.content.hl.alternateField">content</str>
   <str name="f.content.hl.maxAlternateFieldLength">750</str>

   <!-- Spell checking defaults -->
   <str name="spellcheck">on</str>
   <str name="spellcheck.extendedResults">false</str>
   <str name="spellcheck.count">5</str>
   <str name="spellcheck.alternativeTermCount">2</str>
   <str name="spellcheck.maxResultsForSuggest">5</str>
   <str name="spellcheck.collate">true</str>
   <str name="spellcheck.collateExtendedResults">true</str>
   <str name="spellcheck.maxCollationTries">5</str>
   <str name="spellcheck.maxCollations">3</str>
 </lst>

 <!-- append spellchecking to our list of components -->
 <arr name="last-components">
   <str>spellcheck</str>
 </arr>
  </requestHandler>






2014-07-01 2:24 GMT+02:00 Steve McKay-4 [via Lucene] 
ml-node+s472066n4144897...@n3.nabble.com:

 Three fields: AllChamp_ar, AllChamp_fr, AllChamp_en. Then query them with
 dismax.

 On Jun 30, 2014, at 11:53 AM, benjelloun [hidden email] wrote:

  here is my schema:

  <field name="AllChamp" type="text_general" multiValued="true"
  indexed="true" required="false" stored="false"/>
  <dynamicField name="*_en" type="text_en" indexed="true" stored="true"
  required="false" multiValued="true"/>

  <dynamicField name="*_fr" type="text_fr" indexed="true" stored="true"
  required="false" multiValued="true"/>

  <dynamicField name="*_ar" type="text_ar" indexed="true" stored="true"
  required="false" multiValued="true"/>

  <copyField source="*_ar" dest="AllChamp"/>
  <copyField source="*_fr" dest="AllChamp"/>
  <copyField source="*_en" dest="AllChamp"/>

  When I index documents and then search on the AllChamp field, the
  language-specific analyzers and filters are not applied.
  I know that copyField can't copy analyzers and filters, so how can I keep
  the analyzers and filters on the AllChamp field?

  Example:

  I search for: AllChamp:presenton  --> num result=0
                AllChamp:présenton  --> num result=1
 
  thanks for help,
  best regards,
  Anass BENJELLOUN
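
The presenton / présenton example above can be read as an accent-folding difference between analysis chains. The Python sketch below shows the idea of accent folding with the standard library only. It is an approximation, and it rests on an assumption the thread does not confirm: that the language-specific chain folds accents while the chain on AllChamp does not.

```python
import unicodedata

def fold_accents(text):
    """Strip combining marks after canonical decomposition, e.g. 'é' becomes 'e'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

indexed_term = "présenton"
query_term = "presenton"

# Without folding the two terms differ, so the query finds nothing.
print(indexed_term == query_term)
# With folding applied on both sides they become the same term.
print(fold_accents(indexed_term) == fold_accents(query_term))
```

Whatever chain is configured on the destination field is the only one that runs on the copied text, which is why the unaccented query can miss there.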

Re: CopyField can't copy analyzers and Filters

2014-07-01 Thread Alexandre Rafalovitch

I believe you have already been answered.

If you want to have text parsed/analyzed in different ways, you need
to have them in separate fields with separate analyzer stacks. Then
use disMax/eDisMax to search across those fields.

copyField copies the original content and therefore when you search
the target field only one analyzer chain applies.
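
As a sketch of the "separate fields plus eDisMax" approach, the Python snippet below builds the request parameters for a query across three per-language fields. The helper name `edismax_params` is hypothetical, the field names ContenuDocument_ar / _fr / _en are the ones mentioned elsewhere in this thread, and the boosts simply reuse the values from the posted /browse configuration.

```python
from urllib.parse import urlencode

def edismax_params(user_query, boosts):
    """Build Solr query parameters that search several fields, each with its own boost."""
    qf = " ".join("%s^%s" % (field, boost) for field, boost in boosts.items())
    return {"q": user_query, "defType": "edismax", "qf": qf}

# One field per language, so each keeps its own analyzer chain.
boosts = {"ContenuDocument_ar": 2, "ContenuDocument_fr": 3, "ContenuDocument_en": 2.2}
params = edismax_params("présenton", boosts)

# URL-encoded form, ready to append to a select request.
print(urlencode(params))
```

Each field named in `qf` analyzes the query with its own chain, which is the point of keeping the languages apart.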

Regards,
   Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency


On Tue, Jul 1, 2014 at 4:20 PM, benjelloun anass@gmail.com wrote:
 Hello,

 here is my configuration which don't work:

Re: CopyField can't copy analyzers and Filters

2014-07-01 Thread benjelloun

Hello,

I have 300 fields that are copied into AllChamp.
If I want separate fields, then I need to create 300 * the number of
languages I have, which does not make sense to me.
Is there any other solution?

Best regards
Anass BENJELLOUN







--
View this message in context: 
http://lucene.472066.n3.nabble.com/CopyField-can-t-copy-analyzers-and-Filters-tp4144803p4144943.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: CopyField can't copy analyzers and Filters

2014-07-01 Thread Alexandre Rafalovitch

But aren't you already creating those 300 fields anyway:
<dynamicField name="*_fr" type="text_fr" indexed="true" stored="true"
required="false" multiValued="true"/>

If you mean you have issues specifying them in eDisMax, I believe the 'qf'
parameter allows specifying a wildcard.

Alternatively, you can look at the example used in Solr In Action
book: 
https://github.com/treygrainger/solr-in-action/tree/master/src/main/java/sia/ch14
 They use a multiplexing approach.

Regards,
   Alex.



Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency

Re: CopyField can't copy analyzers and Filters

2014-07-01 Thread benjelloun

I have documents in three languages (ar, en, fr).
I need to index them while keeping the analyzers and filters for each language.
Here are all the fields in the schema, to help explain my problem:

<fields>
<field name="IdDocument" type="string" multiValued="false" indexed="true" required="true" stored="true"/>
<field name="NomDocument" type="string" multiValued="false" indexed="true" required="false" stored="true"/>
<field name="AVersion" type="boolean" multiValued="false" indexed="false" required="false" stored="true"/>
<field name="Acl" type="string" multiValued="false" indexed="false" required="false" stored="false"/>
<field name="AllChamp" type="text_general" multiValued="true" indexed="true" required="false" stored="false"/>
<field name="Chemin" type="string" multiValued="false" indexed="false" required="true" stored="true"/>
<field name="ContenuDocument" type="text_general" multiValued="false" indexed="true" required="false" stored="true"/>
<field name="DateCreation" type="date" multiValued="false" indexed="true" required="true" stored="true"/>
<field name="DateModification" type="date" multiValued="false" indexed="true" required="true" stored="true"/>
<field name="EstDansProcessus" type="boolean" multiValued="false" indexed="false" required="true" stored="true"/>
<field name="ExtensionDocument" type="string" multiValued="false" indexed="true" required="true" stored="true"/>
<field name="IdModele" type="long" multiValued="false" indexed="true" required="true" stored="true"/>
<field name="IdRepertoire" type="long" multiValued="false" indexed="true" required="true" stored="true"/>
<field name="IdUtilisateur" type="long" multiValued="false" indexed="true" required="true" stored="true"/>
<field name="IdUtilisateurDerniereVersion" type="long" multiValued="false" indexed="false" required="false" stored="true"/>
<field name="IdUtilisateurModifiePar" type="long" multiValued="false" indexed="true" required="true" stored="true"/>
<field name="Postit" type="text_general" multiValued="True" indexed="true" required="false" stored="false"/>

<field name="_version_" type="long" indexed="true" stored="true"/>
<field name="language_s" type="string" multiValued="true" indexed="false" required="false" stored="true"/>
<field name="C6_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
<field name="C15_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
<field name="C17_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
<field name="C18_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
<field name="C19_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
<field name="C22_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
<field name="C24_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
<field name="C26_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
<field name="C27_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
<field name="C29_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
<field name="C30_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
<field name="C31_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
<field name="C34_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
<field name="C35_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
<field name="C36_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
<field name="C37_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
<field name="C38_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
<field name="C49_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
<field name="C50_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
<field name="C64_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
<field name="C65_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
<field name="C66_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
<field name="C68_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
<field name="C70_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
<field name="C74_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
<field name="C75_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
<field name="C80_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
<field name="C0_val" type="text_general" multiValued="false" indexed="true" required="false" stored="true"/>
<field name="C1_val" type="text_general" multiValued="false" indexed="true" required="false" stored="true"/>
<field name="C2_val" type="text_general" multiValued="false" indexed="true" required="false" stored="true"/>
<field name="C3_val" type="text_general" multiValued="false" indexed="true" required="false" stored="true"/>
<field name="C4_val" type="text_general" multiValued="false" indexed="true" required="false" stored="true"/>
<field name="C5_val" type="text_general" multiValued="false" indexed="true" required="false" stored="true"/>
<field name="C6_val" type="text_general" multiValued="true" indexed="true" required="false" stored="true"/>

Re: CopyField can't copy analyzers and Filters

2014-07-01 Thread benjelloun

And I use dynamic fields for NomDocument, ContenuDocument and Postit,
for example: ContenuDocument_fr, ContenuDocument_en, ContenuDocument_ar

<processor
class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
   <lst name="defaults">
     <str name="langid.fl">NomDocument,ContenuDocument,Postit</str>
     <str name="langid.langField">language_s</str>
     <str name="langid.fallback">fr</str>
     <str name="langid.whitelist">en,fr,ar</str>
     <bool name="langid.map">true</bool>
   </lst>
</processor>
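
To show what the mapping in this configuration amounts to, here is a simplified Python model of it. It is a sketch under assumptions, not the processor's actual logic: language detection itself is not modelled (the detected code is passed in), `map_language_fields` is a hypothetical name, and only the settings shown above (field list, whitelist, fallback, language field) are represented.

```python
def map_language_fields(doc, detected, fields, whitelist=("en", "fr", "ar"), fallback="fr"):
    """Rename each listed field to <name>_<lang> and record the language in language_s."""
    # Languages outside the whitelist are replaced by the fallback.
    lang = detected if detected in whitelist else fallback
    mapped = {}
    for name, value in doc.items():
        if name in fields:
            mapped["%s_%s" % (name, lang)] = value
        else:
            mapped[name] = value
    mapped["language_s"] = lang
    return mapped

fields = {"NomDocument", "ContenuDocument", "Postit"}
doc = {"IdDocument": "42", "ContenuDocument": "Bonjour tout le monde"}
print(map_language_fields(doc, "fr", fields))
```

The renamed fields then fall under the *_fr, *_en and *_ar dynamic fields, so each language gets its own analyzer chain.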

Is there any other solution that avoids separate fields?

Best regards
Anass BENJELLOUN


2014-07-01 12:05 GMT+02:00 anass benjelloun anass@gmail.com:

 i have documents (ar, en , fr)
 i need to index them and keeping analyzer and filter for each languages.
 here is all fields on schema to enderstand my probleme:

 <fields>
   <field name="IdDocument" type="string" multiValued="false" indexed="true" required="true" stored="true"/>
   <field name="NomDocument" type="string" multiValued="false" indexed="true" required="false" stored="true"/>
   <field name="AVersion" type="boolean" multiValued="false" indexed="false" required="false" stored="true"/>
   <field name="Acl" type="string" multiValued="false" indexed="false" required="false" stored="false"/>
   <field name="AllChamp" type="text_general" multiValued="true" indexed="true" required="false" stored="false"/>
   <field name="Chemin" type="string" multiValued="false" indexed="false" required="true" stored="true"/>
   <field name="ContenuDocument" type="text_general" multiValued="false" indexed="true" required="false" stored="true"/>
   <field name="DateCreation" type="date" multiValued="false" indexed="true" required="true" stored="true"/>
   <field name="DateModification" type="date" multiValued="false" indexed="true" required="true" stored="true"/>
   <field name="EstDansProcessus" type="boolean" multiValued="false" indexed="false" required="true" stored="true"/>
   <field name="ExtensionDocument" type="string" multiValued="false" indexed="true" required="true" stored="true"/>
   <field name="IdModele" type="long" multiValued="false" indexed="true" required="true" stored="true"/>
   <field name="IdRepertoire" type="long" multiValued="false" indexed="true" required="true" stored="true"/>
   <field name="IdUtilisateur" type="long" multiValued="false" indexed="true" required="true" stored="true"/>
   <field name="IdUtilisateurDerniereVersion" type="long" multiValued="false" indexed="false" required="false" stored="true"/>
   <field name="IdUtilisateurModifiePar" type="long" multiValued="false" indexed="true" required="true" stored="true"/>
   <field name="Postit" type="text_general" multiValued="True" indexed="true" required="false" stored="false"/>

   <field name="_version_" type="long" indexed="true" stored="true"/>
   <field name="language_s" type="string" multiValued="true" indexed="false" required="false" stored="true"/>
   <field name="C6_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
   <field name="C15_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
   <field name="C17_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
   <field name="C18_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
   <field name="C19_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
   <field name="C22_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
   <field name="C24_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
   <field name="C26_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
   <field name="C27_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
   <field name="C29_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
   <field name="C30_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
   <field name="C31_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
   <field name="C34_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
   <field name="C35_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
   <field name="C36_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
   <field name="C37_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
   <field name="C38_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
   <field name="C49_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
   <field name="C50_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
   <field name="C64_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
   <field name="C65_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
   <field name="C66_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
   <field name="C68_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
   <field name="C70_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
   <field name="C74_id" type="long" multiValued="true" indexed="true" required="false" stored="true"/>
   <field name="C75_id" type="long" multiValued="true" indexed="true" required="false"

Re: CopyField can't copy analyzers and Filters

2014-07-01 Thread Erick Erickson

OK, back up a bit and consider alternative indexing schemes. For instance,
do you really need all those fields? Could you get away with one field
where you indexed the field _name_ + associated value? (you'd have
to be very careful with your analysis chain, but...) Something like:
C67_val_value1

and put them all in a single field? You can then search the single field.
Or you could make your uber-field multiValued and use phrase searching,
i.e. "C67 value1". With a positionIncrementGap of, say, 100 (the value the
example schemas use), you can also do proximity searches like
"c67 value1"~99 and never match across multiple entries.

You can have all your labels C67, C45 etc. as keywords and prevent
things like WordDelimiterFilterFactory from breaking them up.
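
A minimal sketch of the "field name + value in one field" idea, in plain Java with no Solr dependency. The class and method names are hypothetical, and the underscore separator simply follows the C67_val_value1 example above; the resulting tokens would still need an analysis chain that keeps them intact.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: flattens per-label values into tokens for one catch-all field. */
class LabelledTokens {

    /** Returns one "label_value" token per (label, value) pair, in insertion order. */
    static List<String> flatten(Map<String, List<String>> valuesByLabel) {
        List<String> tokens = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : valuesByLabel.entrySet()) {
            for (String value : entry.getValue()) {
                tokens.add(entry.getKey() + "_" + value);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        Map<String, List<String>> values = new LinkedHashMap<>();
        values.put("C67_val", Arrays.asList("value1", "value2"));
        values.put("C45_val", Arrays.asList("other"));
        // Each token would be added as one value of a single multiValued field.
        System.out.println(flatten(values));
    }
}
```

Searching for a specific label/value pair then becomes a single-term lookup on the one field instead of a query against one of several hundred fields.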

My point is that whenever I see comments like "I have 300 fields and it's
getting too complicated", I recommend backing up a step and considering
changing the indexing scheme to simplify things.

This really is starting to feel like an XY problem. You're asking how to
do X when a better approach is "I want to do Y, what approaches can
people think of?".

Best,
Erick

On Tue, Jul 1, 2014 at 5:08 AM, benjelloun anass@gmail.com wrote:
 And I use dynamic fields for NomDocument, ContenuDocument and Postit,
 for example: ContenuDocument_fr, ContenuDocument_en, ContenuDocument_ar

 <processor class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
   <lst name="defaults">
     <str name="langid.fl">NomDocument,ContenuDocument,Postit</str>
     <str name="langid.langField">language_s</str>
     <str name="langid.fallback">fr</str>
     <str name="langid.whitelist">en,fr,ar</str>
     <bool name="langid.map">true</bool>
   </lst>
 </processor>

 Is there any other solution that avoids separating the fields?

 Best regards
 Anass BENJELLOUN



Re: CopyField can't copy analyzers and Filters

2014-07-01 Thread benjelloun

Hello Erick,

Unfortunately I can't modify the schema. My team and I analyzed the problem
carefully, and all the fields you are seeing are required in the schema.

I have just tested using separate fields; maybe it could work if I knew the
edismax syntax:

<field name="AllChamp_ar" type="text_ar" multiValued="true" indexed="true" required="false" stored="false"/>
<field name="AllChamp_fr" type="text_fr" multiValued="true" indexed="true" required="false" stored="false"/>
<field name="AllChamp_en" type="text_en" multiValued="true" indexed="true" required="false" stored="false"/>

<dynamicField name="*_en" type="text_en" indexed="true" stored="true" required="false" multiValued="true"/>
<dynamicField name="*_fr" type="text_fr" indexed="true" stored="true" required="false" multiValued="true"/>
<dynamicField name="*_ar" type="text_ar" indexed="true" stored="true" required="false" multiValued="true"/>

<copyField source="*_ar" dest="AllChamp_ar"/>
<copyField source="*_fr" dest="AllChamp_fr"/>
<copyField source="*_en" dest="AllChamp_en"/>


And this is the SearchHandler in the config, but I don't get any results:

<requestHandler name="/browse" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>

    <str name="wt">velocity</str>
    <str name="v.template">browse</str>
    <str name="v.layout">layout</str>
    <str name="title">Solritas</str>

    <str name="defType">edismax</str>
    <str name="qf">
      AllChamp^2.0 AllChamp_ar^2.0 AllChamp_en^2.0 AllChamp_fr^5.0
    </str>
    <str name="df">AllChamp_fr</str>
    <str name="mm">100%</str>
    <str name="q.alt">*:*</str>
    <str name="rows">10</str>
    <str name="fl">*,score</str>

    <str name="mlt.qf">
      AllChamp^2.0 AllChamp_ar^2.0 AllChamp_en^2.0 AllChamp_fr^5.0
    </str>
    <str name="mlt.fl">AllChamp,AllChamp_fr,AllChamp_ar,AllChamp_en</str>
    <int name="mlt.count">3</int>

    <str name="facet">on</str>
    <str name="facet.field">cat</str>
    <str name="facet.field">manu_exact</str>
    <str name="facet.field">content_type</str>
    <str name="facet.field">author_s</str>
    <str name="facet.query">ipod</str>
    <str name="facet.query">GB</str>
    <str name="facet.mincount">1</str>
    <str name="facet.pivot">cat,inStock</str>
    <str name="facet.range.other">after</str>
    <str name="facet.range">price</str>
    <int name="f.price.facet.range.start">0</int>
    <int name="f.price.facet.range.end">600</int>
    <int name="f.price.facet.range.gap">50</int>
    <str name="facet.range">popularity</str>
    <int name="f.popularity.facet.range.start">0</int>
    <int name="f.popularity.facet.range.end">10</int>
    <int name="f.popularity.facet.range.gap">3</int>
    <str name="facet.range">manufacturedate_dt</str>
    <str name="f.manufacturedate_dt.facet.range.start">NOW/YEAR-10YEARS</str>
    <str name="f.manufacturedate_dt.facet.range.end">NOW</str>
    <str name="f.manufacturedate_dt.facet.range.gap">+1YEAR</str>
    <str name="f.manufacturedate_dt.facet.range.other">before</str>
    <str name="f.manufacturedate_dt.facet.range.other">after</str>

    <str name="hl">on</str>
    <str name="hl.fl">content features title name</str>
    <str name="hl.encoder">html</str>
    <str name="hl.simple.pre">&lt;b&gt;</str>
    <str name="hl.simple.post">&lt;/b&gt;</str>
    <str name="f.title.hl.fragsize">0</str>
    <str name="f.title.hl.alternateField">title</str>
    <str name="f.name.hl.fragsize">0</str>
    <str name="f.name.hl.alternateField">name</str>
    <str name="f.content.hl.snippets">3</str>
    <str name="f.content.hl.fragsize">200</str>
    <str name="f.content.hl.alternateField">content</str>
    <str name="f.content.hl.maxAlternateFieldLength">750</str>

    <str name="spellcheck">on</str>
    <str name="spellcheck.extendedResults">false</str>
    <str name="spellcheck.count">5</str>
    <str name="spellcheck.alternativeTermCount">2</str>
    <str name="spellcheck.maxResultsForSuggest">5</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.collateExtendedResults">true</str>
    <str name="spellcheck.maxCollationTries">5</str>
    <str name="spellcheck.maxCollations">3</str>
  </lst>

  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>


thanks,
best regards








--
View this message in context: 
http://lucene.472066.n3.nabble.com/CopyField-can-t-copy-analyzers-and-Filters-tp4144803p4145018.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: CopyField can't copy analyzers and Filters

2014-07-01 Thread Daniel Collins

Ok, firstly to say you need to fix your problem but you can't modify the
schema, doesn't really help.  If the schema is setup badly, then no amount
of help at search time will ever get you the results you want...

Secondly, from what I can see in the schema, there is no AllChamp_fr,
AllChamp_en, etc?  There is only AllChamp, which you create by copying from
other places.  And that in itself seems odd to me, you are copying Cx_id
(which are longs) and Cx_val (which are text) into a single text_general
field, so lord knows what that's going to index like (really inefficiently
I would guess), and it won't be very accurate on the number values if you
ever want to do anything like range queries on those...

Back to Erick's response, take a step back and try to explain what the real
problem is, what fields you index, and what you want to achieve.

We have a similar situation with Languages, we have 3 fields per document
that are language specific, so we index them into language-specific fields.
 We then copyField them into a text_general (as well) so we have a
generically stemmed version if we want to do a more general query.  If we
need to explicitly search accurately for language-specific terms, then we
need to OR all the language fields.  That has a cost in creating the query,
but it is more efficient.
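
A rough sketch of what "OR all the language fields" can look like when the query string is built on the client side. This is plain Java with no Solr dependency; the class name, method name and the title_* field names are hypothetical, not taken from the setup described above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper: builds one OR clause per language-specific field. */
class LanguageQuery {

    /** Returns e.g. title_fr:"term" OR title_en:"term" for the given fields. */
    static String orAcross(List<String> fields, String term) {
        // Quote the term so embedded whitespace stays inside one clause;
        // backslashes and double quotes inside the term are escaped first.
        String quoted = "\"" + term.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
        return fields.stream()
                .map(field -> field + ":" + quoted)
                .collect(Collectors.joining(" OR "));
    }

    public static void main(String[] args) {
        List<String> fields = Arrays.asList("title_fr", "title_en", "title_ar");
        System.out.println(orAcross(fields, "presentation"));
    }
}
```

Each clause is analyzed with the analyzer of its own field, which is the point of keeping the language-specific fields separate.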




Re: CopyField can't copy analyzers and Filters

2014-07-01 Thread benjelloun

Hello,


For Cx_val, some of the fields are multivalued :)
As for AllChamp_fr, AllChamp_en, etc., I just added them to the schema to test
whether edismax works.






Re: CopyField can't copy analyzers and Filters

2014-06-30 Thread Steve McKay

Three fields: AllChamp_ar, AllChamp_fr, AllChamp_en. Then query them with 
dismax.
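
As a sketch of the request side of this suggestion: the parameters below (defType, qf, q) are standard Solr request parameters, while the class name, the method name and the choice to leave all three fields unboosted are assumptions made for illustration. No Solr client is used; the map would be handed to whatever HTTP or SolrJ code sends the request.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: request parameters for a dismax query over three language fields. */
class DismaxParams {

    static Map<String, String> build(String userQuery) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("defType", "dismax");
        // qf lists the fields to search; each is analyzed with its own field type.
        params.put("qf", "AllChamp_ar AllChamp_fr AllChamp_en");
        params.put("q", userQuery);
        return params;
    }

    public static void main(String[] args) {
        System.out.println(build("presentation"));
    }
}
```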

On Jun 30, 2014, at 11:53 AM, benjelloun anass@gmail.com wrote:

 Here is my schema:
 
 <field name="AllChamp" type="text_general" multiValued="true" indexed="true" required="false" stored="false"/>
 <dynamicField name="*_en" type="text_en" indexed="true" stored="true" required="false" multiValued="true"/>
 
 <dynamicField name="*_fr" type="text_fr" indexed="true" stored="true" required="false" multiValued="true"/>
 
 <dynamicField name="*_ar" type="text_ar" indexed="true" stored="true" required="false" multiValued="true"/>
 
 <copyField source="*_ar" dest="AllChamp"/>
 <copyField source="*_fr" dest="AllChamp"/>
 <copyField source="*_en" dest="AllChamp"/>
 
 When I index documents and then search on the AllChamp field, the
 language-specific analyzers and filters are not applied.
 I know that copyField can't copy analyzers and filters, so how can I keep
 the analyzers and filters on the AllChamp field?
 
 Example: 
 
 I search for : AllChamp:presenton  -- num result=0 
   AllChamp:présenton  -- num result=1 
 
 thanks for help, 
 best regards, 
 Anass BENJELLOUN 
 
 
 
 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/CopyField-can-t-copy-analyzers-and-Filters-tp4144803.html
 Sent from the Solr - User mailing list archive at Nabble.com.

Re: copyField and storage requirements

2013-07-02 Thread Shawn Heisey

On 7/2/2013 12:22 PM, Ali, Saqib wrote:
 Newbie question:
 
 We have the following fields defined in the schema:
 
 <field name="content" type="text_general" indexed="true" stored="false"/>
 <field name="teaser" type="text_general" indexed="false" stored="true"/>
 <copyField source="content" dest="teaser" maxChars="80"/>
 
 The content field holds about 500KB of data.
 
 My question is whether Solr stores the entire contents of that 500KB
 content field.
 
 We want to minimize the stored data in the Solr index, that is why we added
 the copyField teaser.

With that config, the entire 500KB will not be _stored_ .. but it will
affect the index size because you are indexing it.  Exactly what degree
that will be depends on the definition of the text_general type.
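
To make the effect of maxChars on the stored teaser concrete, here is a tiny model of it in plain Java. It only mirrors the documented intent (copy at most that many characters from the source); it is not Solr's implementation, and the class and method names are hypothetical.

```java
/** Hypothetical model of a copyField with maxChars: keep at most maxChars characters. */
class TeaserCopy {

    static String copyWithMaxChars(String source, int maxChars) {
        // Shorter values are copied whole; longer ones are cut at maxChars.
        return source.length() <= maxChars ? source : source.substring(0, maxChars);
    }

    public static void main(String[] args) {
        StringBuilder content = new StringBuilder();
        for (int i = 0; i < 500; i++) {
            content.append('x');
        }
        // Only the truncated copy would end up in the stored teaser field.
        System.out.println(copyWithMaxChars(content.toString(), 80).length());
    }
}
```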

Thanks,
Shawn

Re: copyField and storage requirements

2013-07-02 Thread Ali, Saqib

Thanks Shawn.

Here is the text_general type definition. We would like to bring the
storage requirement down to a minimum for those 500KB content documents. We
just need basic full-text search.

Thanks!!! :)




<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>




Re: copyField and storage requirements

2013-07-02 Thread Shawn Heisey




Unless you have a huge number of synonyms or the synonyms that you have 
defined are used a LOT in your index, that should not result in a whole 
lot of term expansion.  I have no way to know how much actual space 
things will take, but from what I have seen, a 500KB input field will 
probably take a little bit less than 500KB of disk space, unless it is 
almost entirely composed of unique terms.


Thanks,
Shawn

Re: copyField generates multiple values encountered for non multiValued field

2013-06-06 Thread Robert Krüger

On Wed, Jun 5, 2013 at 9:12 PM, Jack Krupansky j...@basetechnology.com wrote:
 Look in the Solr log - the error message should tell you what the multiple
 values are. For example,

 95484 [qtp2998209-11] ERROR org.apache.solr.core.SolrCore  –
 org.apache.solr.common.SolrException: ERROR: [doc=doc-1] multiple values
 encountered for non multiValued field content_s: [def, abc]

 One of the values should be the value of the field that is the source of the
 copyField. Maybe the other value will give you a clue as to where it came
 from.

 Check your SolrJ code - maybe you actually do try to initialize a value in
 the field that is the copyField target.

I see the values in the stack trace:

org.apache.solr.common.SolrException: ERROR:
[doc=8f60d040-3462-4b28-998f-fd05a64f1cd8:/] multiple values
encountered for non multiValued field name2: [rename, rename]

It is just twice the value of source-field and I am not referencing
that field in my java code.

Re: copyField generates multiple values encountered for non multiValued field

2013-06-06 Thread Robert Krüger

I don't know what I have to do to use the atomic update feature but I
am not aware of using it. But the way you describe it, it means that
the copyField directive does not overwrite the existing field content
and that's an easy explanation to what is happening in my case. Then
the second update (which I do manually, i.e. read current state,
manipulate fields and then add the document with the same id) will
lead to this. That was not so obvious to me from the docs.

Thanks,

Robert

On Thu, Jun 6, 2013 at 12:18 AM, Chris Hostetter
hossman_luc...@fucit.org wrote:

 : I updated the Index using SolrJ and got the exact same error message

 there aren't a lot of specifics provided in this thread, so this may not
 be applicable, but if you mean you are actually using the atomic updates
 feature to update an existing document, then the problem is that you still
 have the existing value in your name2 field, as well as another copy of
 the name field evaluated by copyField after the updates are applied...

 http://wiki.apache.org/solr/Atomic_Updates#Stored_Values


 -Hoss

Re: copyField generates multiple values encountered for non multiValued field

2013-06-06 Thread Jack Krupansky


1. Try a simple curl command to add the document.

2. Check to see if maybe there is a duplicate copyField directive in your 
schema. How many copyField directives do you have?


At least we know that it is exactly the same value duplicated and not some 
other value.


-- Jack Krupansky


Re: copyField generates multiple values encountered for non multiValued field

2013-06-06 Thread Jack Krupansky

read current state, manipulate fields and then add the document with the 
same id)


Ahh... then you have an IMPLICIT reference to the field in your Java code - 
you explicitly told Solr that you wanted to start with all existing field 
values. Just because a field is the target of a copyField doesn't make it 
any different from any other field when reading. Although, it does beg the 
question of whether or not this field should be stored or not - that's a 
data modeling question that only you can resolve. Do queries need to 
retrieve this field?


Be sure to null out any values for any fields that are sourced by copy 
fields. Otherwise, yes, duplicated values would be exactly what you should 
expect.
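
A minimal sketch of that clean-up step for the read-modify-re-add approach, using plain Java maps as a stand-in for the document's field values. The class name, method name and the id/name/name2 field names in the example are assumptions for illustration; with SolrJ the same idea would be applied to the input document before it is sent.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: drops copyField destination fields before a document is re-added. */
class CopyTargetCleaner {

    /** Returns a copy of the stored field values without the copyField destinations. */
    static Map<String, Object> withoutCopyTargets(Map<String, Object> stored, Set<String> copyTargets) {
        Map<String, Object> cleaned = new LinkedHashMap<>(stored);
        // Solr fills these fields again from their sources at index time,
        // so carrying the old values along would duplicate them.
        cleaned.keySet().removeAll(copyTargets);
        return cleaned;
    }

    public static void main(String[] args) {
        Map<String, Object> stored = new LinkedHashMap<>();
        stored.put("id", "doc-1");
        stored.put("name", "rename");
        stored.put("name2", "rename"); // destination of a copyField from name
        System.out.println(withoutCopyTargets(stored, Collections.singleton("name2")));
    }
}
```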


Is there any reason that you can't simply use atomic update - create a new 
document with the same document id but with only set values for the fields 
to be changed? There is also add for multivalued fields.


There isn't great doc for this. Basically, the value for every non-ID field 
would be a Map object (HashMap) with a set key whose value is the new 
field value.


Here's a code fragment for setting one field:

   SolrInputDocument doc2 = new SolrInputDocument();
   // One Map per field: the "set" key replaces the field's current value.
   Map<String, String> fpValue2 = new HashMap<String, String>();
   fpValue2.put("set", "fp2");
   doc2.setField("FACTURES_PRODUIT", fpValue2);

You need a separate Map object for each field to be set or added for 
appending to a multivalued field. And you need a simple (non-Map) value for 
your ID field.
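
The shape described above (plain value for the ID, one single-entry Map per changed field) can be sketched without any Solr classes. Everything named here is hypothetical: the class, the method, and the assumption that the unique key field is called id. With SolrJ, each entry would be passed to SolrInputDocument.setField as in the fragment above.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: builds the field/value structure of an atomic update. */
class AtomicUpdatePayload {

    static Map<String, Object> build(String id, Map<String, Object> toSet, Map<String, Object> toAdd) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", id); // the ID field carries a simple (non-Map) value
        for (Map.Entry<String, Object> e : toSet.entrySet()) {
            // "set" replaces the current value of the field
            doc.put(e.getKey(), Collections.singletonMap("set", e.getValue()));
        }
        for (Map.Entry<String, Object> e : toAdd.entrySet()) {
            // "add" appends a value to a multivalued field
            doc.put(e.getKey(), Collections.singletonMap("add", e.getValue()));
        }
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Object> toSet = Collections.<String, Object>singletonMap("name", "rename");
        Map<String, Object> toAdd = Collections.<String, Object>singletonMap("tags", "extra");
        System.out.println(build("doc-1", toSet, toAdd));
    }
}
```

Because copyField destinations never appear in such a payload, their old values are not resubmitted by the client.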


-- Jack Krupansky

-Original Message- 
From: Robert Krüger

Sent: Thursday, June 06, 2013 7:25 AM
To: solr-user@lucene.apache.org
Subject: Re: copyField generates multiple values encountered for non 
multiValued field


I don't know what I have to do to use the atomic update feature but I
am not aware of using it. But the way you describe it, it means that
the copyField directive does not overwrite the existing field content
and that's an easy explanation to what is happening in my case. Then
the second update (which I do manually, i.e. read current state,
manipulate fields and then add the document with the same id) will
lead to this. That was not so obvious to me from the docs.

Thanks,

Robert

On Thu, Jun 6, 2013 at 12:18 AM, Chris Hostetter
hossman_luc...@fucit.org wrote:


: I updated the Index using SolrJ and got the exact same error message

there aren't a lot of specifics provided in this thread, so this may not
be applicable, but if you mean you are actually using the atomic updates
feature to update an existing document then the problem is that you still
have the existing value in your name2 field, as well as another copy of
the name field evaluated by copyField after the updates are applied...

http://wiki.apache.org/solr/Atomic_Updates#Stored_Values


-Hoss

Re: copyField generates multiple values encountered for non multiValued field

2013-06-06 Thread Robert Krüger

On Thu, Jun 6, 2013 at 1:52 PM, Jack Krupansky j...@basetechnology.com wrote:
 read current state, manipulate fields and then add the document with the
 same id)

 Ahh... then you have an IMPLICIT reference to the field in your Java code -
 you explicitly told Solr that you wanted to start with all existing field
 values. Just because a field is the target of a copyField doesn't make it
 any different from any other field when reading. Although, it does beg the
 question of whether or not this field should be stored or not - that's a
 data modeling question that only you can resolve. Do queries need to
 retrieve this field?
you're right. in my concrete use case it does not need to be stored.



 Be sure to null out any values for any fields that are sourced by copy
 fields. Otherwise, yes, duplicated values would be exactly what you should
 expect.
yes, I will do that.


 Is there any reason that you can't simply use atomic update - create a new
 document with the same document id but with only set values for the fields
 to be changed? There is also add for multivalued fields.

 There isn't great doc for this. Basically, the value for every non-ID field
 would be a Map object (HashMap) with a set key whose value is the new
 field value.

 Here's a code fragment for setting one field:

SolrInputDocument doc2 = new SolrInputDocument();
Map<String, String> fpValue2 = new HashMap<String, String>();  // one Map per field
fpValue2.put("set", fp2);  // "set" is the operation; fp2 is the new value, defined elsewhere
doc2.setField("FACTURES_PRODUIT", fpValue2);

 You need a separate Map object for each field to be set or added for
 appending to a multivalued field. And you need a simple (non-Map) value for
 your ID field.

thanks for the info! the code is a lot older than solr 4.0, so that
option was not available at the time of its writing. I will check if
it makes sense to use that feature. most likely yes.

Robert

Re: copyField generates multiple values encountered for non multiValued field

2013-06-05 Thread Alexandre Rafalovitch

I think the suggestion I have seen is that copyField should be
index-only and - therefore - will not be returned. It is primarily
there to make searching easier by aggregating fields or to provide
alternative analyzer pipeline.

Can you make your copyField destination not stored?

Regards,
   Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)


On Wed, Jun 5, 2013 at 10:37 AM, Robert Krüger krue...@lesspain.de wrote:
 I have the exact same problem as the guy here:

 http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201105.mbox/%3C3A2B3E42FCAA4BF496AE625426C5C6E4@Wurstsemmel%3E

 AFAICS he did not get an answer. Is this a known issue? What can I do
 other than doing what copyField should do in my application?

 I am using solr 4.0.0.

 Thanks,

 Robert

Re: copyField generates multiple values encountered for non multiValued field

2013-06-05 Thread Jack Krupansky

Try describing your own symptom in your own words - because his issue 
related to Solr 1.4. I mean, where exactly are you setting 
allowDuplicates=false?? And why do you think it has anything to do with 
adding documents to Solr? Solr 1.4 did not have atomic update, so sending 
the exact same document twice would not result in a change in the index 
(unless you had a date field with a value of NOW.) Copy field only uses 
values from the current document.


-- Jack Krupansky

-Original Message- 
From: Robert Krüger

Sent: Wednesday, June 05, 2013 10:37 AM
To: solr-user@lucene.apache.org
Subject: copyField generates multiple values encountered for non 
multiValued field


I have the exact same problem as the guy here:

http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201105.mbox/%3C3A2B3E42FCAA4BF496AE625426C5C6E4@Wurstsemmel%3E

AFAICS he did not get an answer. Is this a known issue? What can I do
other than doing what copyField should do in my application?

I am using solr 4.0.0.

Thanks,

Robert

Re: copyField generates multiple values encountered for non multiValued field

2013-06-05 Thread Robert Krüger

OK, I have two fields defined as follows:

<field name="name" type="string" indexed="true" stored="true"
multiValued="false" />
<field name="name2" type="string_ci" indexed="true"
stored="true" multiValued="false" />

and this copyField directive

<copyField source="name" dest="name2"/>

I updated the Index using SolrJ and got the exact same error message
that is in the subject. However, while waiting for feedback I built a
workaround at the application level and now reconstructing the
original state, to be able to answer you, I have different behaviour.
What happens now is that the field name2 is populated with multiple
values although it is not defined as multiValued (see above).

Although this is strange, it is consistent with the earlier problem in
that copyField does not seem to overwrite the existing field values. I
may be using it incorrectly (it's the first time I am using copyField)
but the docs in the wiki did not say anything about an overwrite
option.

Cheers,

Robert

On Wed, Jun 5, 2013 at 5:16 PM, Jack Krupansky j...@basetechnology.com wrote:
Try describing your own symptom in your own words - because his issue
related to Solr 1.4. I mean, where exactly are you setting
allowDuplicates=false?? And why do you think it has anything to do with
adding documents to Solr? Solr 1.4 did not have atomic update, so sending
the exact same document twice would not result in a change in the index
(unless you had a date field with a value of NOW.) Copy field only uses
values from the current document.

-- Jack Krupansky

-Original Message- From: Robert Krüger
Sent: Wednesday, June 05, 2013 10:37 AM
To: solr-user@lucene.apache.org
Subject: copyField generates multiple values encountered for non
multiValued field

I have the exact same problem as the guy here:

http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201105.mbox/%3C3A2B3E42FCAA4BF496AE625426C5C6E4@Wurstsemmel%3E

AFAICS he did not get an answer. Is this a known issue? What can I do
other than doing what copyField should do in my application?

I am using solr 4.0.0.

Thanks,

Robert

Re: copyField generates multiple values encountered for non multiValued field

2013-06-05 Thread Jack Krupansky

Look in the Solr log - the error message should tell you what the multiple 
values are. For example,


95484 [qtp2998209-11] ERROR org.apache.solr.core.SolrCore  – 
org.apache.solr.common.SolrException: ERROR: [doc=doc-1] multiple values 
encountered for non multiValued field content_s: [def, abc]


One of the values should be the value of the field that is the source of the 
copyField. Maybe the other value will give you a clue as to where it came 
from.


Check your SolrJ code - maybe you actually do try to initialize a value in 
the field that is the copyField target.


-- Jack Krupansky

-Original Message- 
From: Robert Krüger

Sent: Wednesday, June 05, 2013 1:17 PM
To: solr-user@lucene.apache.org
Subject: Re: copyField generates multiple values encountered for non 
multiValued field


OK, I have two fields defined as follows:

<field name="name" type="string" indexed="true" stored="true"
multiValued="false" />
<field name="name2" type="string_ci" indexed="true"
stored="true" multiValued="false" />

and this copyField directive

<copyField source="name" dest="name2"/>

I updated the Index using SolrJ and got the exact same error message
that is in the subject. However, while waiting for feedback I built a
workaround at the application level and now reconstructing the
original state, to be able to answer you, I have different behaviour.
What happens now is that the field name2 is populated with multiple
values although it is not defined as multiValued (see above).

Although this is strange, it is consistent with the earlier problem in
that copyField does not seem to overwrite the existing field values. I
may be using it incorrectly (it's the first time I am using copyField)
but the docs in the wiki did not say anything about an overwrite
option.

Cheers,

Robert


On Wed, Jun 5, 2013 at 5:16 PM, Jack Krupansky j...@basetechnology.com 
wrote:

Try describing your own symptom in your own words - because his issue
related to Solr 1.4. I mean, where exactly are you setting
allowDuplicates=false?? And why do you think it has anything to do with
adding documents to Solr? Solr 1.4 did not have atomic update, so sending
the exact same document twice would not result in a change in the index
(unless you had a date field with a value of NOW.) Copy field only uses
values from the current document.

-- Jack Krupansky

-Original Message- From: Robert Krüger
Sent: Wednesday, June 05, 2013 10:37 AM
To: solr-user@lucene.apache.org
Subject: copyField generates multiple values encountered for non
multiValued field


I have the exact same problem as the guy here:

http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201105.mbox/%3C3A2B3E42FCAA4BF496AE625426C5C6E4@Wurstsemmel%3E

AFAICS he did not get an answer. Is this a known issue? What can I do
other than doing what copyField should do in my application?

I am using solr 4.0.0.

Thanks,

Robert

Re: copyField generates multiple values encountered for non multiValued field

2013-06-05 Thread Chris Hostetter


: I updated the Index using SolrJ and got the exact same error message

there aren't a lot of specifics provided in this thread, so this may not 
be applicable, but if you mean you are actually using the atomic updates 
feature to update an existing document then the problem is that you still 
have the existing value in your name2 field, as well as another copy of 
the name field evaluated by copyField after the updates are applied...

http://wiki.apache.org/solr/Atomic_Updates#Stored_Values


-Hoss

Re: copyField with * stops working with 4.2 (related to SOLR-3798 ?)

2013-03-13 Thread Steve Rowe

I committed a fix under SOLR-4567.

On Mar 13, 2013, at 12:50 AM, Steve Rowe sar...@gmail.com wrote:

 Yes, this is a regression, definitely my fault.  Sorry Alex!  
 
 The table on SOLR-3798 is missing this case: a glob matching one or more 
 explicit fields (as opposed to dynamic fields).
 
 I've filed a JIRA: https://issues.apache.org/jira/browse/SOLR-4567
 
 On Mar 13, 2013, at 12:20 AM, Jack Krupansky j...@basetechnology.com 
 wrote:
 And, the wiki does not note the decommissioning of a useful feature of 
 copyField. Although, the wiki is woefully incomplete when it comes to glob 
 patterns for fields.
 
 I agree - I wrote what I thought would be a good addition to the wiki just 
 above the copyField combinations table on SOLR-3798.  But it needs additional 
 verbiage to cover Alex's case.
 
 Reading the table in SOLR-3798 as carefully as I can, it seems to indicate 
 that your use case is supposed to be supported as case #9, leading me to 
 conclude that it may simply be a bug that your use case is failing in 4.2.
 
 9 | subset pattern | field name | <copyField source="*_src_sub_i" dest="title"/> | Yes | Yes
 
 Alex's case is different from what I call subset patterns on SOLR-3798, 
 since that's shorthand for subset of the language accepted by the pattern 
 for a referenced dynamic field.
 
 I'll make a copy of that table and add a case where the source value type can 
 be a glob matching one or more explicit fields. 
 
 Steve

Re: copyField with * stops working with 4.2 (related to SOLR-3798 ?)

2013-03-13 Thread Jack Krupansky

Thanks for the clarification! Although, maybe we need to come up with some 
simpler, more clear terminology.


-- Jack Krupansky

-Original Message- 
From: Steve Rowe

Sent: Wednesday, March 13, 2013 12:50 AM
To: solr-user@lucene.apache.org
Subject: Re: copyField with * stops working with 4.2 (related to SOLR-3798 
?)


Yes, this is a regression, definitely my fault.  Sorry Alex!

The table on SOLR-3798 is missing this case: a glob matching one or more 
explicit fields (as opposed to dynamic fields).


I've filed a JIRA: https://issues.apache.org/jira/browse/SOLR-4567

On Mar 13, 2013, at 12:20 AM, Jack Krupansky j...@basetechnology.com 
wrote:
And, the wiki does not note the decommissioning of a useful feature of 
copyField. Although, the wiki is woefully incomplete when it comes to glob 
patterns for fields.


I agree - I wrote what I thought would be a good addition to the wiki just 
above the copyField combinations table on SOLR-3798.  But it needs 
additional verbiage to cover Alex's case.


Reading the table in SOLR-3798 as carefully as I can, it seems to indicate 
that your use case is supposed to be supported as case #9, leading me to 
conclude that it may simply be a bug that your use case is failing in 4.2.


9 | subset pattern | field name | <copyField source="*_src_sub_i" dest="title"/> | Yes | Yes


Alex's case is different from what I call subset patterns on SOLR-3798, 
since that's shorthand for subset of the language accepted by the pattern 
for a referenced dynamic field.


I'll make a copy of that table and add a case where the source value type 
can be a glob matching one or more explicit fields.


Steve

Re: copyField with * stops working with 4.2 (related to SOLR-3798 ?)

2013-03-12 Thread Jack Krupansky

SOLR-4503 made the changes to copyField semantics. Indeed, it is not clear 
whether SOLR-4503 (or even SOLR-3798) was really intended to decommission 
existing functionality. I mean, the normal procedure is to deprecate a 
feature long before removing it.


And, the wiki does not note the decommissioning of a useful feature of 
copyField. Although, the wiki is woefully incomplete when it comes to glob 
patterns for fields.


Reading the table in SOLR-3798 as carefully as I can, it seems to indicate 
that your use case is supposed to be supported as case #9, leading me to 
conclude that it may simply be a bug that your use case is failing in 4.2.


9 | subset pattern | field name | <copyField source="*_src_sub_i" dest="title"/> | Yes | Yes


So, I'd go ahead and file this as a bug.

Steve?

https://issues.apache.org/jira/browse/SOLR-3798
https://issues.apache.org/jira/browse/SOLR-4503

The revision that made the change:

http://svn.apache.org/viewvc?view=revision&revision=1453162

-- Jack Krupansky

-Original Message- 
From: Alexandre Rafalovitch

Sent: Tuesday, March 12, 2013 8:32 PM
To: solr-user@lucene.apache.org
Subject: copyField with * stops working with 4.2 (related to SOLR-3798 ?)

Hello,

I have an example schema which worked in 4.1 but is failing to load in 4.2
with: copyField source :'addr_*' is not an explicit field and doesn't
match a dynamicField.

I think this must be due to SOLR-3798, but I don't understand why even
after reading it through several times.

My schema (excerpt) is:
   <field name="addr_from" type="email" indexed="true" stored="true"
required="true" />
   <field name="addr_to" type="email" multiValued="true" indexed="true"
stored="true" required="true" />
  <copyField source="addr_*" dest="text" />

I thought this would have been a valid use case. Can someone with deeper
understanding of this aspect explain what I am missing.

Regards,
   Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)

Re: copyField with * stops working with 4.2 (related to SOLR-3798 ?)

2013-03-12 Thread Steve Rowe

Yes, this is a regression, definitely my fault.  Sorry Alex!  

The table on SOLR-3798 is missing this case: a glob matching one or more 
explicit fields (as opposed to dynamic fields).
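For readers following along, the missing case can be pictured with a small Java sketch. This is only an illustration of "a glob matching one or more explicit fields", not Solr's schema-loading code: CopyFieldGlob, globToPattern and matchingFields are hypothetical names, and the sketch accepts an asterisk anywhere in the pattern, which may be more permissive than what Solr itself allows. The field names come from Alex's schema excerpt.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class CopyFieldGlob {

    // Translates a glob in which only '*' is special into an anchored regex.
    static Pattern globToPattern(String glob) {
        String[] parts = glob.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                regex.append(".*");                // each '*' matches any run of characters
            }
            regex.append(Pattern.quote(parts[i])); // everything else is matched literally
        }
        return Pattern.compile(regex.toString());
    }

    // Returns the explicitly declared field names that the source glob matches.
    static List<String> matchingFields(String sourceGlob, List<String> explicitFields) {
        Pattern pattern = globToPattern(sourceGlob);
        List<String> matches = new ArrayList<>();
        for (String field : explicitFields) {
            if (pattern.matcher(field).matches()) {
                matches.add(field);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> explicit = List.of("addr_from", "addr_to", "text");
        // The glob matches two explicit fields even though no dynamicField is declared.
        System.out.println(matchingFields("addr_*", explicit));
    }
}
```

Under this reading, source="addr_*" is valid as soon as the list of matches is non-empty, which is the behaviour 4.1 had and 4.2 lost.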

I've filed a JIRA: https://issues.apache.org/jira/browse/SOLR-4567

On Mar 13, 2013, at 12:20 AM, Jack Krupansky j...@basetechnology.com wrote:
 And, the wiki does not note the decommissioning of a useful feature of 
 copyField. Although, the wiki is woefully incomplete when it comes to glob 
 patterns for fields.

I agree - I wrote what I thought would be a good addition to the wiki just 
above the copyField combinations table on SOLR-3798.  But it needs additional 
verbiage to cover Alex's case.

 Reading the table in SOLR-3798 as carefully as I can, it seems to indicate 
 that your use case is supposed to be supported as case #9, leading me to 
 conclude that it may simply be a bug that your use case is failing in 4.2.
 
 9 | subset pattern | field name | <copyField source="*_src_sub_i" dest="title"/> | Yes | Yes

Alex's case is different from what I call subset patterns on SOLR-3798, since 
that's shorthand for subset of the language accepted by the pattern for a 
referenced dynamic field.

I'll make a copy of that table and add a case where the source value type can 
be a glob matching one or more explicit fields. 

Steve

Re: copyField vs single field

2013-02-06 Thread Otis Gospodnetic

The latter,  I believe,  but you lose the ability to give different weights
to matches on different fields.

Otis
Solr & ElasticSearch Support
http://sematext.com/
On Feb 6, 2013 2:34 PM, adm1n evgeni.evg...@gmail.com wrote:

 Hi,

 Let's assume I have to search for a string (textField) in 6-7 different
 fields (username, firstname, lastname, etc). Which one will have better
 performance:
 username:test OR firstname:test OR lastname:test
 or defining some copyField and searching within it like somecopyfield:test


 thanks.



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/copyField-vs-single-field-tp4038832.html
 Sent from the Solr - User mailing list archive at Nabble.com.

Re: copyField vs single field

2013-02-06 Thread Jack Krupansky

It is difficult to say for sure - unless somebody actually does a lot of 
benchmarking tests with various distributions of data in the fields and 
various field types (e.g., some are strings and some are text, and the 
cardinality of the string values.) I would suspect that the two would be 
roughly equivalent. I mean, if you search each field separately, that field 
has only its subset of the data, and the copy field has essentially the sum 
of the per-field subsets.


I would say that you should go with edismax/dismax search (qf = list of 
fields and boosts) unless you have a clear reason to go the other way.
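To compare the two approaches side by side, here is a small Java sketch that only builds the request parameters for each; it does not talk to Solr. The class and method names are hypothetical, the boost of 2 on username is an arbitrary example value, and somecopyfield is the catch-all field name from the question.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class SearchParams {

    // edismax: one user query, searched across several fields with per-field boosts.
    static Map<String, String> edismaxParams(String userQuery) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", userQuery);
        params.put("defType", "edismax");
        params.put("qf", "username^2 firstname lastname"); // boost value is an example
        return params;
    }

    // copyField: a single catch-all field, so no per-field weighting is possible.
    static Map<String, String> copyFieldParams(String userQuery) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "somecopyfield:" + userQuery);
        return params;
    }

    // Joins the parameters into a URL query string, encoding each value.
    static String toQueryString(Map<String, String> params) {
        StringJoiner joiner = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            joiner.add(e.getKey() + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        System.out.println(toQueryString(edismaxParams("test")));
        System.out.println(toQueryString(copyFieldParams("test")));
    }
}
```

The qf list is where the per-field weighting that Otis mentions lives; the copyField variant has nothing equivalent.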


-- Jack Krupansky

-Original Message- 
From: Otis Gospodnetic

Sent: Wednesday, February 06, 2013 8:04 PM
To: solr-user@lucene.apache.org
Subject: Re: copyField vs single field

The latter,  I believe,  but you lose the ability to give different weights
to matches on different fields.

Otis
Solr & ElasticSearch Support
http://sematext.com/
On Feb 6, 2013 2:34 PM, adm1n evgeni.evg...@gmail.com wrote:


Hi,

Let's assume I have to search for a string (textField) in 6-7 different
fields (username, firstname, lastname, etc). Which one will have better
performance:
username:test OR firstname:test OR lastname:test
or defining some copyField and searching within it like somecopyfield:test


thanks.



--
View this message in context:
http://lucene.472066.n3.nabble.com/copyField-vs-single-field-tp4038832.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: CopyField issue on Solr4.1

2013-02-03 Thread Erick Erickson

Major changes (i.e. 3 -> 4) have some such differences, but usually it's
for the better. But it is sometimes disconcerting! Also, sometimes there
are different defaults and you can get the old behavior back...

Although you haven't quite said what was different. I know the stock
typedefs have changed, so if it's just the schema.xml differences, you
should take control of those anyway <G>...

Best
Erick


On Thu, Jan 31, 2013 at 8:10 AM, anarchos78
rigasathanasio...@hotmail.com wrote:

 Yes, you are right! Is that normal? Many thanks



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/CopyField-issue-on-Solr4-1-tp4037373p4037685.html
 Sent from the Solr - User mailing list archive at Nabble.com.

Re: CopyField issue on Solr4.1

2013-01-30 Thread Upayavira

Stored fields are now compressed in 4.1. There are other efficiencies too
in 4.0 that will also result in smaller indexes, but compressed
stored fields are the most significant.

Upayavira

On Wed, Jan 30, 2013, at 01:59 PM, anarchos78 wrote:
 Hello,
 
 I am using Solr 3.6.1 and I am very satisfied. Now I want to move to
 Solr 4.1. So I took “schema.xml” and “solrconfig.xml” (with minor changes)
 and placed them under my new Solr 4.1 configuration. The indexing was
 successful (DIH). But I have noticed an issue. In “schema.xml” I have
 “copyField” directives in order to index the same fields using different
 “types”. When I index using the same configuration on Solr 4.1, the
 index is half the size of the index on Solr 3.6.1 (and when I query I
 get different results). Has anything changed in Solr 4.1? I need a little
 help on this.
 
 *The schema.xml:*
 <?xml version="1.0" encoding="UTF-8" ?>

 <config>

   <abortOnConfigurationError>${solr.abortOnConfigurationError:true}</abortOnConfigurationError>

   <luceneMatchVersion>LUCENE_41</luceneMatchVersion>

   <dataDir>${solr.data.dir:}</dataDir>

   <directoryFactory name="DirectoryFactory"
                     class="${solr.directoryFactory:solr.StandardDirectoryFactory}"/>

   <indexConfig>
   </indexConfig>

   <jmx />

   <updateHandler class="solr.DirectUpdateHandler2">
   </updateHandler>

   <query>

     <maxBooleanClauses>2048</maxBooleanClauses>

     <filterCache class="solr.FastLRUCache" size="2048" initialSize="1024"
                  autowarmCount="512" cleanupThread="true" />

     <queryResultCache class="solr.FastLRUCache" size="2048" initialSize="1024"
                       autowarmCount="512" cleanupThread="true" />

     <documentCache class="solr.FastLRUCache" size="2048" initialSize="2048"
                    autowarmCount="512" />

     <fieldValueCache class="solr.FastLRUCache" size="2048" initialSize="512"
                      autowarmCount="512" cleanupThread="true" />

     <enableLazyFieldLoading>true</enableLazyFieldLoading>

     <queryResultWindowSize>150</queryResultWindowSize>

     <queryResultMaxDocsCached>200</queryResultMaxDocsCached>

     <listener event="newSearcher" class="solr.QuerySenderListener">
       <arr name="queries">
         <lst>
           <str name="q">χρησικτησια νομη</str>
           <str name="fq">apofasi_taxonomy:ΠΟΛΙΤΙΚΕΣ</str>
           <str name="sort">apofasi_date asc,ida desc,apofasi_tmima desc</str>
           <str name="start">0</str>
           <str name="rows">150</str>
         </lst>
         <lst>
           <str name="q">νομη</str>
           <str name="fq">apofasi_taxonomy:ΠΟΛΙΤΙΚΕΣ</str>
           <str name="sort">apofasi_date asc,ida desc,apofasi_tmima desc</str>
           <str name="start">0</str>
           <str name="rows">150</str>
         </lst>
         <lst>
           <str name="q">χρησικτησια νομη</str>
           <str name="fq">apofasi_taxonomy:ΠΟΙΝΙΚΕΣ</str>
           <str name="sort">apofasi_date asc,ida desc,apofasi_tmima desc</str>
           <str name="start">0</str>
           <str name="rows">150</str>
         </lst>
       </arr>
     </listener>

     <listener event="firstSearcher" class="solr.QuerySenderListener">
       <arr name="queries">
         <lst>
           <str name="q">χρησικτησια νομη</str>
           <str name="fq">apofasi_taxonomy:ΠΟΛΙΤΙΚΕΣ</str>
           <str name="sort">apofasi_date asc,ida desc,apofasi_tmima desc</str>
           <str name="start">0</str>
           <str name="rows">150</str>
         </lst>
         <lst>
           <str name="q">νομη</str>
           <str name="fq">apofasi_taxonomy:ΠΟΛΙΤΙΚΕΣ</str>
           <str name="sort">apofasi_date asc,ida desc,apofasi_tmima desc</str>
           <str name="start">0</str>
           <str name="rows">150</str>
         </lst>
         <lst>
           <str name="q">χρησικτησια νομη</str>
           <str name="fq">apofasi_taxonomy:ΠΟΙΝΙΚΕΣ</str>
           <str name="sort">apofasi_date asc,ida desc,apofasi_tmima desc</str>
           <str name="start">0</str>
           <str name="rows">150</str>
         </lst>
       </arr>
     </listener>

     <useColdSearcher>false</useColdSearcher>

     <maxWarmingSearchers>2</maxWarmingSearchers>

   </query>

   <requestDispatcher>
     <requestParsers enableRemoteStreaming="true"
                     multipartUploadLimitInKB="2048000" />
     <httpCaching never304="true" />
   </requestDispatcher>

   <requestHandler name="/dataimport"
                   class="org.apache.solr.handler.dataimport.DataImportHandler">
     <lst name="defaults">

Re: CopyField issue on Solr4.1

2013-01-30 Thread Jack Krupansky

There are probably any number of changes between 3.x and 4.x to account for 
query differences. This includes bug fixes and in some cases new bugs, in 
areas such as the query parsers and various filters. The first step is to 
isolate a couple of examples of both false positive queries and false 
negative queries. Then look at the field types involved. Then use the Solr 
Admin Analysis UI to see how an index or query term analyzes differently. 
Post the details here and we can figure out what change is causing your 
query discrepancies.


-- Jack Krupansky

-Original Message- 
From: anarchos78

Sent: Wednesday, January 30, 2013 8:59 AM
To: solr-user@lucene.apache.org
Subject: CopyField issue on Solr4.1

Hello,

I am using Solr 3.6.1 and I am very satisfied. Now I want to move to
Solr 4.1. So I took “schema.xml” and “solrconfig.xml” (with minor changes)
and placed them under my new Solr 4.1 configuration. The indexing was
successful (DIH). But I have noticed an issue. In “schema.xml” I have
“copyField” directives in order to index the same fields using different
“types”. When I index using the same configuration on Solr 4.1, the
index is half the size of the index on Solr 3.6.1 (and when I query I
get different results). Has anything changed in Solr 4.1? I need a little
help on this.

*The schema.xml:*
<?xml version="1.0" encoding="UTF-8" ?>

<config>

  <abortOnConfigurationError>${solr.abortOnConfigurationError:true}</abortOnConfigurationError>

  <luceneMatchVersion>LUCENE_41</luceneMatchVersion>

  <dataDir>${solr.data.dir:}</dataDir>

  <directoryFactory name="DirectoryFactory"
                    class="${solr.directoryFactory:solr.StandardDirectoryFactory}"/>

  <indexConfig>
  </indexConfig>

  <jmx />

  <updateHandler class="solr.DirectUpdateHandler2">
  </updateHandler>

  <query>

    <maxBooleanClauses>2048</maxBooleanClauses>

    <filterCache class="solr.FastLRUCache" size="2048" initialSize="1024"
                 autowarmCount="512" cleanupThread="true" />

    <queryResultCache class="solr.FastLRUCache" size="2048" initialSize="1024"
                      autowarmCount="512" cleanupThread="true" />

    <documentCache class="solr.FastLRUCache" size="2048" initialSize="2048"
                   autowarmCount="512" />

    <fieldValueCache class="solr.FastLRUCache" size="2048" initialSize="512"
                     autowarmCount="512" cleanupThread="true" />

    <enableLazyFieldLoading>true</enableLazyFieldLoading>

    <queryResultWindowSize>150</queryResultWindowSize>

    <queryResultMaxDocsCached>200</queryResultMaxDocsCached>

    <listener event="newSearcher" class="solr.QuerySenderListener">
      <arr name="queries">
        <lst>
          <str name="q">χρησικτησια νομη</str>
          <str name="fq">apofasi_taxonomy:ΠΟΛΙΤΙΚΕΣ</str>
          <str name="sort">apofasi_date asc,ida desc,apofasi_tmima desc</str>
          <str name="start">0</str>
          <str name="rows">150</str>
        </lst>
        <lst>
          <str name="q">νομη</str>
          <str name="fq">apofasi_taxonomy:ΠΟΛΙΤΙΚΕΣ</str>
          <str name="sort">apofasi_date asc,ida desc,apofasi_tmima desc</str>
          <str name="start">0</str>
          <str name="rows">150</str>
        </lst>
        <lst>
          <str name="q">χρησικτησια νομη</str>
          <str name="fq">apofasi_taxonomy:ΠΟΙΝΙΚΕΣ</str>
          <str name="sort">apofasi_date asc,ida desc,apofasi_tmima desc</str>
          <str name="start">0</str>
          <str name="rows">150</str>
        </lst>
      </arr>
    </listener>

    <listener event="firstSearcher" class="solr.QuerySenderListener">
      <arr name="queries">
        <lst>
          <str name="q">χρησικτησια νομη</str>
          <str name="fq">apofasi_taxonomy:ΠΟΛΙΤΙΚΕΣ</str>
          <str name="sort">apofasi_date asc,ida desc,apofasi_tmima desc</str>
          <str name="start">0</str>
          <str name="rows">150</str>
        </lst>
        <lst>
          <str name="q">νομη</str>
          <str name="fq">apofasi_taxonomy:ΠΟΛΙΤΙΚΕΣ</str>
          <str name="sort">apofasi_date asc,ida desc,apofasi_tmima desc</str>
          <str name="start">0</str>
          <str name="rows">150</str>
        </lst>
        <lst>
          <str name="q">χρησικτησια νομη</str>
          <str name="fq">apofasi_taxonomy:ΠΟΙΝΙΚΕΣ</str>
          <str name="sort">apofasi_date asc,ida desc,apofasi_tmima desc</str>
          <str name="start">0</str>
          <str name="rows">150</str>
        </lst>
      </arr>
    </listener>

    <useColdSearcher>false</useColdSearcher>

    <maxWarmingSearchers>2</maxWarmingSearchers>

  </query>

  <requestDispatcher>
    <requestParsers enableRemoteStreaming="true"
                    multipartUploadLimitInKB="2048000" />
    <httpCaching never304="true" />
  </requestDispatcher>

  <requestHandler name="/dataimport"
                  class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
      <str name="config">data-config.xml</str>
    </lst>
  </requestHandler>

  <requestHandler name="/select" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="defType">edismax</str>
      <str name="qf">content contentS^10</str>
      <str name="pf">content^10 contentS^100</str>
      <str name="ps">100</str>
      <str name="echoParams">explicit</str>
      <int name="rows">150</int>
      <str name="sort">score desc</str>
      <str name="defType">edismax</str>
      <str name="qf">content contentS^10</str>
      <str name="pf">content^10 contentS^100</str>
      <str name="ps">100</str>
      <str name="wt">json</str>
      <str name="hl">true</str>
      <str

Re: CopyField issue on Solr4.1

2013-01-30 Thread Shawn Heisey


On 1/30/2013 7:28 AM, Upayavira wrote:

Stored fields are now compressed in 4.1. There are other efficiencies too
in 4.0 that will also result in smaller indexes, but compressed
stored fields are the most significant.


The compressed stored fields explains your smaller index.  As to why you 
get different results, are you doing the default relevancy ranking, or 
are you sorting by one of your fields?  If you are doing the default 
relevancy ranking, you may be getting different results because of 
scoring bugs that have been fixed since 3.6.1.  Try sorting your results 
by a field - add sort=fieldname asc or sort=fieldname desc to the 
URL and see if the results are what you expect.


If you are already sorting, that's another situation.  If you search for 
all documents on both indexes, is the numFound value about where it 
should be?


Thanks,
Shawn

Re: copyField - copy only specific words

2013-01-25 Thread Tomás Fernández Löbbe

I think the best way will be to pre-process the document (or use a custom
UpdateRequestProcessor). Another option, if you'll only use the "cities"
field for faceting/sorting/searching (you don't need the stored content),
would be to use a regular copyField and a KeepWordFilter on the
"cities" field. However, with this approach it will be difficult to handle
multi-word cities like "New York" or "Buenos Aires".
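The multi-word limitation is easy to see in a toy Java sketch. This is not Solr's KeepWordFilter; keepWords is a hypothetical helper that lowercases, splits on whitespace, and keeps only whitelisted tokens, which is roughly what a whitespace tokenizer followed by a keep-word list would do. The sample text and city list are made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class KeepWordsSketch {

    // Keeps only the tokens that appear in the whitelist. Tokens are produced
    // by splitting on whitespace, so each token is a single word.
    static List<String> keepWords(String text, Set<String> whitelist) {
        List<String> kept = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (whitelist.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<String> cities = Set.of("berlin", "new york");
        // "new" and "york" arrive as separate tokens, so the entry "new york" never matches.
        System.out.println(keepWords("cheap hotel Berlin New York", cities));
        // prints [berlin]
    }
}
```

Pre-processing the document before it reaches Solr avoids the problem because the whole keyword string can be checked against the city list before any tokenization happens.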

Tomás


On Fri, Jan 25, 2013 at 7:33 AM, b.riez...@pixel-ink.de 
b.riez...@pixel-ink.de wrote:

 Hi,

 i'd like to copy specific words from the keywords field to another field.
 Cause the data i get is all in one field i'd like to extract the cities
 (they are fixed, so i'll know them in advance) and put them in a separate
 field.

 Can i generate a whitelist file and tell the copy field to check this file
 and only copy matching words to a new field?

 Thanks for your help
 Ben

RE: copyField - copy only specific words

2013-01-25 Thread Markus Jelsma

Hi

Use the KeepWordFilter on the destination field:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.KeepWordFilterFactory

Cheers
 
 
-Original message-
 From:b.riez...@pixel-ink.de b.riez...@pixel-ink.de
 Sent: Fri 25-Jan-2013 11:41
 To: solr-user@lucene.apache.org
 Subject: copyField - copy only specific words
 
 Hi,
 
 i'd like to copy specific words from the keywords field to another field.
 Cause the data i get is all in one field i'd like to extract the cities (they 
 are fixed, so i'll know them in advance) and put them in a separate field.
 
 Can i generate a whitelist file and tell the copy field to check this file 
 and only copy matching words to a new field?
 
 Thanks for your help
 Ben

Re: copyField - copy only specific words

2013-01-25 Thread Alexandre Rafalovitch

Possibly with Shingles before the KeepWord filter to deal with multi-word
situations (though I am not sure if KeepWord allows space-separated tokens
in the file): http://stackoverflow.com/questions/14479473/

Regards,
   Alex.
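
The shingle-then-keep idea can be modelled outside Solr as follows. This is
only a sketch of the logic, under the assumption that the keep list may hold
multi-word entries; as noted above, whether Solr's KeepWord file accepts
space-separated tokens is not confirmed here. The function names are
hypothetical.

```python
def shingles(tokens, max_size=2):
    """Yield every run of 1..max_size adjacent tokens, joined by a space."""
    for size in range(1, max_size + 1):
        for i in range(len(tokens) - size + 1):
            yield " ".join(tokens[i:i + size])

def keep_words(tokens, whitelist, max_size=2):
    """Lowercase the tokens, build shingles, keep only whitelisted ones."""
    keep = {w.lower() for w in whitelist}
    lowered = [t.lower() for t in tokens]
    return [s for s in shingles(lowered, max_size) if s in keep]

tokens = "cheap flights New York Berlin".split()
kept = keep_words(tokens, ["New York", "Berlin"])
print(kept)
```

Single tokens are emitted before two-word shingles, so the output order
follows shingle size rather than position in the text.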




Re: copyField multiValued duplicates

2012-11-23 Thread Erick Erickson

Unless you stored all the original fields, I think you're stuck with
re-indexing all your docs.

Best
Erick
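
If the original fields were stored, one possible client-side clean-up while
re-indexing is sketched below. It assumes each document is fetched as a
plain dict of stored values; prepare_for_reindex and dedupe are hypothetical
helpers, and the field names are taken from the question quoted below.

```python
def dedupe(values):
    """Drop repeated values while keeping first-seen order."""
    seen = set()
    unique = []
    for value in values:
        if value not in seen:
            seen.add(value)
            unique.append(value)
    return unique

def prepare_for_reindex(doc, copy_dests):
    """Remove copyField destinations (Solr refills them) and de-duplicate
    the remaining multi-valued fields."""
    clean = {}
    for name, value in doc.items():
        if name in copy_dests:
            continue  # re-sending a destination would duplicate its values
        clean[name] = dedupe(value) if isinstance(value, list) else value
    return clean

stored = {"id": "42", "city": ["No city"], "cityLower": ["No city"] * 10}
print(prepare_for_reindex(stored, {"cityLower"}))
```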


On Mon, Nov 19, 2012 at 12:21 PM, Ravi Solr <ravis...@gmail.com> wrote:

> Hello,
>   I have a couple of questions. I need an easy way to clean up a gaffe
> with copyFields (close to a million docs). Is there any way we could remove
> duplicates emitted via copyField while re-indexing ? Also is there a way to
> query multiValued fields to give only docs that have duplicated value ??
>
> The fields having issue are declared as follows
>
> <fieldType name="keywordText" class="solr.TextField"
>            sortMissingLast="true" omitNorms="true" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.TrimFilterFactory" />
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="stopwords.txt" enablePositionIncrements="true"/>
>     <filter class="solr.SynonymFilterFactory"
>             tokenizerFactory="solr.KeywordTokenizerFactory"
>             synonyms="person-synonyms.txt,organization-synonyms.txt,location-synonyms.txt,subject-synonyms.txt"
>             ignoreCase="true" expand="false" />
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.TrimFilterFactory" />
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="stopwords.txt" enablePositionIncrements="true" />
>     <filter class="solr.SynonymFilterFactory"
>             tokenizerFactory="solr.KeywordTokenizerFactory"
>             synonyms="person-synonyms.txt,organization-synonyms.txt,location-synonyms.txt,subject-synonyms.txt"
>             ignoreCase="true" expand="false" />
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> <fieldType name="text" class="solr.TextField" sortMissingLast="true"
>            omitNorms="true" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.TrimFilterFactory" />
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.SynonymFilterFactory"
>             synonyms="person-synonyms.txt,organization-synonyms.txt,location-synonyms.txt,subject-synonyms.txt"
>             ignoreCase="true" expand="true"/>
>     <!-- Case insensitive stop word removal.
>          enablePositionIncrements=true ensures that a 'gap' is left to allow for
>          accurate phrase queries. -->
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="stopwords.txt" enablePositionIncrements="true" />
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="0" generateNumberParts="0" catenateWords="0"
>             catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"
>             protected="protwords.txt"/>
>     <filter class="solr.EnglishPorterFilterFactory"
>             protected="protwords.txt"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.TrimFilterFactory" />
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.SynonymFilterFactory"
>             synonyms="person-synonyms.txt,organization-synonyms.txt,location-synonyms.txt,subject-synonyms.txt"
>             ignoreCase="true" expand="true"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="stopwords.txt" enablePositionIncrements="true" />
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="0" generateNumberParts="0" catenateWords="0"
>             catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"
>             protected="protwords.txt"/>
>     <filter class="solr.EnglishPorterFilterFactory"
>             protected="protwords.txt"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> <field name="city" type="keywordText" indexed="true" stored="true"
>        multiValued="true" termVectors="true"/>
>
> <field name="cityLower" type="text" indexed="true" stored="true"
>        multiValued="true" termVectors="false"/>
>
> <copyField source="city" dest="cityLower"/>
>
> Query results look as follows
>
> <arr name="city">
>   <str>No city</str>
> </arr>
> <arr name="cityLower">
>   <str>No city</str>
>   <str>No city</str>
>   <str>No city</str>
>   <str>No city</str>
>   <str>No city</str>
>   <str>No city</str>
>   <str>No city</str>
>   <str>No city</str>
>   <str>No city</str>
>   <str>No city</str>
> </arr>
>
> Thanks,
>
> Ravi Kiran Bhaskar

Re: Copyfield query

2012-09-25 Thread Rafał Kuć

Hello!

As you can see at http://wiki.apache.org/solr/SchemaXml#Copy_Fields
the actual copying is done before analysis and indexing, so it doesn't
matter whether you store the fields you use as sources for your copy fields.

-- 
Regards,
 Rafał Kuć
 Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch

> Dear All,
>
> I have a question on the CopyField directive in schema.xml.
>
> *Background:*
> Each record will have various types of titles say product_title,
> actual_title, user_title, working_title.
>
> <field name="product_title" type="text" indexed="true" stored="true"
>        multiValued="false" />
> <field name="actual_title" type="text" indexed="true" stored="true"
>        multiValued="false" />
> <field name="user_title" type="text" indexed="true" stored="true"
>        multiValued="false" />
> <field name="working_title" type="text" indexed="true" stored="true"
>        multiValued="false" />
>
> All the above fields are searchable as well as need to be returned as part
> of the search response hence indexed=true and stored=true is implemented.
>
> There is an additional requirement to search against user and working title
> and hence there is a composite field created called 'group_title'
>
> <field name="group_title" type="text" indexed="true" stored="false"
>        multiValued="true" />
>
> <copyField source="user_title" dest="group_title" />
> <copyField source="working_title" dest="group_title" />
>
> The group title need not be part of the search response, hence stored=false
> is implemented.
>
> Further, there is a requirement to search against all these titles for
> which a composite field that copies from all these fields is defined as
> 'titles'
>
> <field name="titles" type="text" indexed="true" stored="false"
>        multiValued="true" />
>
> <copyField source="product_title" dest="titles" />
> <copyField source="actual_title" dest="titles" />
> *<copyField source="group_title" dest="titles" />*
>
> If you see above, instead of copying the individual fields to the titles,
> the group_title created above is copied to titles.
> My question is, because group_title is designed to be stored=false, will
> the above underlined code copy the content of 'group_title' to 'titles' or
> not?
>
> Many Thanks in advance,
> Sandeep
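
One caveat on the schema above: according to Chris Hostetter's reply in the
"CopyField into another CopyField?" thread in this archive, copyFields are
not applied recursively. The sketch below is a simplified model of that
behaviour, not Solr code; apply_copy_fields is a hypothetical helper that
reads only the fields the client actually sent.

```python
def apply_copy_fields(doc, rules):
    """Apply (source, dest) rules in one pass over the submitted fields.
    Values written to a destination are never used as a source."""
    out = {name: [value] for name, value in doc.items()}
    for source, dest in rules:
        if source in doc:  # only client-supplied fields count as sources
            out.setdefault(dest, []).append(doc[source])
    return out

rules = [
    ("user_title", "group_title"),
    ("working_title", "group_title"),
    ("product_title", "titles"),
    ("actual_title", "titles"),
    ("group_title", "titles"),  # the chained rule from the question
]
doc = {"product_title": "P", "actual_title": "A",
       "user_title": "U", "working_title": "W"}
result = apply_copy_fields(doc, rules)
print(result["group_title"], result["titles"])
```

Under this model the chained rule contributes nothing, whatever the stored
setting of group_title; copying user_title and working_title straight into
titles would be the likely workaround.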

Re: copyField

2012-05-18 Thread Jack Krupansky

Did you also delete all existing documents from the index? Maybe your crawl 
did not re-index documents that were already in the index or that hadn't 
changed since the last crawl, leaving the old index data as it was before 
the change.


-- Jack Krupansky

-Original Message- 
From: Tolga

Sent: Friday, May 18, 2012 9:54 AM
To: solr-user@lucene.apache.org
Subject: copyField

Hi,

I've put the line copyField=* dest=text stored=true
indexed=true/ in my schema.xml and restarted Solr, crawled my
website, and indexed (I've also committed but do I really have to
commit?). But I still have to search with content:mykeyword at the admin
interface. What do I have to do so that I can search only with mykeyword?

Regards,

Re: copyField

2012-05-18 Thread Tolga

I'll make sure to do that. Thanks


Re: copyField

2012-05-18 Thread Yury Kats

On 5/18/2012 9:54 AM, Tolga wrote:
> Hi,
>
> I've put the line copyField=* dest=text stored=true
> indexed=true/ in my schema.xml and restarted Solr, crawled my
> website, and indexed (I've also committed but do I really have to
> commit?). But I still have to search with content:mykeyword at the admin
> interface. What do I have to do so that I can search only with mykeyword?

Do you have the default field defined?

Re: copyField

2012-05-18 Thread Tolga

Default field? I'm not sure but I think I do. Will have to look. 


Re: copyField

2012-05-18 Thread Yury Kats

On 5/18/2012 4:02 PM, Tolga wrote:
> Default field? I'm not sure but I think I do. Will have to look.

http://wiki.apache.org/solr/SchemaXml#The_Default_Search_Field

Re: copyField

2012-05-18 Thread Tolga

Oh this one. Yes I have it. 


Re: copyField after analyzer

2012-04-10 Thread Rafał Kuć

Hello!

It's not possible with copy fields right now. As you wrote - copy
fields are copied before analysis is done. 

-- 
Regards,
 Rafał Kuć
 Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch

> Hi,
>
> I want to copy/append different fields to one field, while applying a
> different analyzer for each field.
>
> <copyField source="cat" dest="text"/>
> <copyField source="name" dest="text"/>
> <copyField source="manu" dest="text"/>
> <copyField source="features" dest="text"/>
> <copyField source="includes" dest="text"/>
>
> Let's assume I have specified different analyzers/filters for each of the
> above source fields cat, name, manu, features and includes. I read online
> that copyField copies before the analysis is done. Is it possible to
> copyField after applying the analyzer to all the source strings? Just to
> give you one use case, I would like to remove duplicates in some of the
> source fields like manu, includes, etc., and I don't want to do it for all
> source fields.
>
> Thanks,
> Srini

Re: copyField question

2012-03-22 Thread Tomás Fernández Löbbe

I meant, how many values in total? A single document may have 20, but are
those 20 shared with other documents (even if they have different scores), or
will each document have 10-20 completely different values? I think Solr
could handle a couple hundred fields, but I don't know how it would
behave with thousands (really, I don't know; you should test it).

You should be using a dynamic field for creating those fields dynamically,
and make sure you have the omitNorms attribute set to true.

What do you need to use those fields for? searching? displaying?
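
A rough sketch of the "value as field name, score as field value" idea,
assuming the schema declares a dynamic field pattern such as *_score with a
numeric type and omitNorms set to true. The _score suffix and both helper
names are illustrative assumptions, not something from the thread.

```python
import re

def term_field(term, suffix="_score"):
    """Turn a free-text term into a name matching the assumed *_score pattern."""
    safe = re.sub(r"[^0-9A-Za-z]+", "_", term.strip().lower()).strip("_")
    return safe + suffix

def to_solr_doc(doc_id, scored_terms):
    """Build a flat document with one dynamic field per scored term."""
    doc = {"id": doc_id}
    for term, score in scored_terms.items():
        doc[term_field(term)] = float(score)
    return doc

doc = to_solr_doc("doc-1", {"Machine Learning": 0.92, "Oil & Gas": 0.4})
print(doc)
```

With fields shaped like this, a threshold becomes an ordinary range query
on the field, for example machine_learning_score:[0.8 TO *].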



Re: copyField question

2012-03-22 Thread ramdev.wudali

Hi Tomas:

These fields are for searching only.

Currently we have around 1.8M docs indexed. Assume each doc has about
20 of these additional fields to be created as dynamic fields (worst-case
scenario), and there are about 6K of these different values (i.e., if
we were to create static field definitions, there would be 6K fields).

I did create dynamic fields as you suggested, but only on a subset of docs
(10K). I have not done extensive performance analysis on it or anything (it's
a rather simple schema/index structure).


Thanks

Ramdev



Re: copyField question

2012-03-21 Thread Tomás Fernández Löbbe

> However, If the multivalued complex data field is not possible. Is it
> possible to use copyField directive to copy fields if a certain score is
> higher than a threshold ?

I don't think that's possible out of the box, but you could use a custom
UpdateRequestProcessor for that.

How many different values do you have? tens? hundreds? thousands?...
millions? If those are not too many, you could use dynamic fields, using
the value as field name and the score as field value. Unless I'm
oversimplifying your problem.

Tomás


On Wed, Mar 21, 2012 at 5:16 PM, <ramdev.wud...@thomsonreuters.com> wrote:

> Hi:
>   Is it possible to store a value and a corresponding score in Solr as
> part of a single field definition? And can this field be a multivalued
> field?
> I have several terms that are scored. I would like to store them as part of
> a single field definition rather than having to create two different fields
> (one storing the score and the other the value).
>
> However, if the multivalued complex data field is not possible, is it
> possible to use the copyField directive to copy fields if a certain score is
> higher than a threshold?
>
>
> Thanks
>
> Ramdev

Re: copyField question

2012-03-21 Thread ramdev.wudali

Hi Tomás:
   I think there is simplicity in your solution ;)  A document would have
tens of different values (at most 20).

So if I were to follow your suggestion of naming a dynamic field with the
value as the name of the field and the corresponding score as the value,
how would I go about changing the schema?

Thanks

Ramdev



Re: copyField: multivalued field to joined singlevalue field

2012-02-16 Thread Yonik Seeley

On Thu, Feb 16, 2012 at 11:35 AM, flyingeagle-de
flyingeagle...@yahoo.de wrote:
 Hello,

 I want to copy all data from a multivalued field joined together in a single
 valued field.

 Is there any opportunity to do this by using solr-standards?

There is not currently, but it certainly makes sense.

Anyone know of an open issue for this yet?  If not, we should create one!

-Yonik
lucidimagination.com

Re: copyField: multivalued field to joined singlevalue field

2012-02-16 Thread Chris Hostetter


:  I want to copy all data from a multivalued field joined together in a single
:  valued field.
: 
:  Is there any opportunity to do this by using solr-standards?
: 
: There is not currently, but it certainly makes sense.

Part of it has just recently been committed to trunk actually...

https://issues.apache.org/jira/browse/SOLR-2802

https://builds.apache.org/job/Solr-trunk/javadoc/org/apache/solr/update/processor/ConcatFieldUpdateProcessorFactory.html

...with that, it's easy to say "anytime multiple values are found for a 
single valued string field, join them together with a comma".

The only piece that's missing is to copy from a source field in an 
(earlier) UpdateProcessor.  

There's a patch for this in SOLR-2599 but I haven't had a chance to look at 
it yet.




-Hoss
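
Until both pieces are available server-side, the same two steps (copy the
source field, then join its values) can be approximated in the indexing
client. This is a plain-Python sketch under that assumption; clone_field,
concat_field, the field names, and the ", " delimiter are illustrative
choices and not Solr APIs.

```python
def clone_field(doc, source, dest):
    """Step 1: copy every value of `source` into `dest`."""
    out = dict(doc)
    if source in out:
        out[dest] = list(out[source])
    return out

def concat_field(doc, field, delimiter=", "):
    """Step 2: collapse a multi-valued field into one joined string."""
    out = dict(doc)
    if isinstance(out.get(field), list):
        out[field] = delimiter.join(out[field])
    return out

doc = {"id": "1", "tags": ["red", "green", "blue"]}
result = concat_field(clone_field(doc, "tags", "tags_joined"), "tags_joined")
print(result["tags_joined"])
```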

Re: CopyField copying to self

2011-10-05 Thread Gora Mohanty

On Thu, Oct 6, 2011 at 1:49 AM, Jamie Johnson <jej2...@gmail.com> wrote:
> I have a field named test_txt which I am populating in some cases, and
> not in others.  I also have a copy field directive to copy data from
> _txt to text_txt.  Things seem to work except I believe the field is
> also copying to itself.  Is there any way to avoid this behavior?

Sorry, what do you mean by the field is also copying to itself?
What are you seeing that is leading you to think so?

Regards,
Gora

Re: copyField for big indexes

2011-08-22 Thread Erick Erickson

copyField should only be used if there's a good reason, that is
you need to tokenize/analyze stuff differently, for instance
faceting.

It's not so much a matter of the index size, as whether the
copyFields are necessary to get your needed functionality.

You're right that you can construct queries across several fields
as in your example. Another strategy would be to just use
(e)dismax.

Best
Erick
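
For comparison, the two query styles can be written as request parameters.
The sketch below only builds query strings and sends nothing; q, defType and
qf are standard Solr request parameters, while the field names title and
post come from the question quoted below.

```python
from urllib.parse import urlencode, parse_qs

# Explicit per-field query, as in the original question.
explicit = urlencode({"q": "title:word OR post:word"})

# edismax: the user types a bare term and qf lists the fields to search.
edismax = urlencode({"q": "word", "defType": "edismax", "qf": "title post"})

print(parse_qs(explicit))
print(parse_qs(edismax))
```

Either string would be appended to the select handler URL of the core being
queried; neither form needs a copyField.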

On Mon, Aug 22, 2011 at 1:14 PM, Tom <springmeth...@gmail.com> wrote:
> Is it a good rule of thumb that, when dealing with large indexes, copyField
> should not be used?  It seems to duplicate the indexing of data.
>
> You don't need copyField to be able to search on multiple fields.  For
> example, if I have two fields, title and post, and I want to search on both,
> I could just query
> title:word OR post:word
>
> So it seems to me that if you have lots of data and large indexes, copyField
> should be avoided.
>
> Any thoughts?

Re: copyField for big indexes

2011-08-22 Thread Tom

Thanks Erick


Re: copyField for big indexes

2011-08-22 Thread Bill Bell

It depends.

copyField may be good if you want to copy into a Soundex field, and then
boost the Soundex field lower than the tokenized field.

What are you trying to do ?


Re: copyField for big indexes

2011-08-22 Thread Tom

Bill,

  I was using it as a simple default search field.  I realise now that's not
a good reason to use copyField.  As I see it now, it should be used if you
want to search in a way that is different: use different analyzers, etc; not
for just searching on multiple fields in a single query.

Thanks


Re: CopyField into another CopyField?

2011-07-05 Thread Chris Hostetter


: In solr, is it possible to 'chain' copyfields so that you can copy the value
: of one into another?
...
: <copyField source="name" dest="autocomplete" />
: <copyField source="autocomplete" dest="ac_spellcheck" />
: 
: Point being, every time I add a new field to the autocomplete, I want it to
: automatically also be added to ac_spellcheck without having to do it twice.

Sorry no, the IndexSchema won't recursively apply copyFields.  As I 
recall it was implemented this way partly for simplicity, and largely to 
protect people against the risk of infinite loops.  We could probably 
make a more sophisticated impl that detects infinite loops, but that check 
would slow things down and all Solr could really do with it is throw an 
error.


-Hoss

Re: copyField generates multiple values encountered for non multiValued field

2011-06-21 Thread Chris Hostetter


: This is for debugging purposes, so I am sending the exact same data that are
: already stored in Solr's index.
...
: ERROR: [288400] multiple values encountered for non multiValued field
: field2 [fieldvalue, fieldvalue]
: 
: The scenario:
: - field1 is implicitly single value, type text, indexed and stored
: - field2 is generated via a copyField directive in schema.xml, implicitly
: single value, type string, indexed and stored
: 
: What appears to happen:
: - On the first add (SolrClient::addDocuments(array(SolrInputDocument
: theDocument))), regular fields like field1 get overwritten as intended
: - field2, defined with a copyField, but still single value, gets
: _appended_ instead
: - When I retrieve the updated document in a query and try to add it again,
: it won't let me because of the inconsistent multi-value state
...
: But: Solr appears to be generating the corrupted state itsself via
: copyField?
: What's going wrong? I'm pretty confused...

I think you are misunderstanding the error you are seeing.  Solr isn't 
creating any inconsistent state; the multiValued check does in fact happen 
after the copyFields.


Based on your description, this is what it sounds to me like you are 
doing and why you are getting your error...

Initially sending solr a doc that looks like this...

id=1; field1=fieldvalue

...which when copyFields are evaluated winds up looking like this...

id=1; field1=fieldvalue; field2=fieldvalue

...that document goes in the index, and you then execute a query that 
matches it, and fetch the stored values of that document from solr -- 
getting all three fields back (i.e., id, field1, field2).

You then attempt to index that document again, sending all 3 fields...

id=1; field1=fieldvalue; field2=fieldvalue

...which when copyFields are evaluated winds up looking like this...

id=1; field1=fieldvalue; field2=fieldvalue; field2=fieldvalue

..and that's why you get the error you are seeing.

If I'm misunderstanding your "retrieve the updated document in a query 
and try to add it again" process, can you please provide some example 
configs and the exact steps to reproduce (using the post.jar, or curl, or 
something simple that doesn't require PECL)?

-Hoss
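
The round trip described above can be reproduced with a small model. This
is a simplified simulation and not Solr code: index() is a hypothetical
stand-in that expands copyField rules and then applies the single-value
check, in the order described above.

```python
def index(doc, copy_rules, single_valued):
    """Expand copyField rules, then enforce the single-value constraint."""
    expanded = {name: [value] for name, value in doc.items()}
    for source, dest in copy_rules:
        if source in doc:
            expanded.setdefault(dest, []).append(doc[source])
    for name in single_valued:
        values = expanded.get(name, [])
        if len(values) > 1:
            raise ValueError(
                f"multiple values encountered for non multiValued field {name}: {values}"
            )
    return expanded

RULES = [("field1", "field2")]
SINGLE = {"field1", "field2"}

# First add: the client sends only id and field1.
first = index({"id": "1", "field1": "fieldvalue"}, RULES, SINGLE)

# Fetch the stored values and send all of them back unchanged.
fetched = {name: values[0] for name, values in first.items()}
try:
    index(fetched, RULES, SINGLE)
    error = None
except ValueError as exc:
    error = str(exc)

# Dropping the copyField destination before re-sending avoids the clash.
resend = {name: value for name, value in fetched.items() if name != "field2"}
second = index(resend, RULES, SINGLE)
print(error)
```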

Re: copyField generates multiple values encountered for non multiValued field

2011-05-31 Thread Alexander Kanarsky

Alexander,

I saw the same behavior in 1.4.x with non-multivalued fields when
updating the document in the index (i.e., obtaining the doc from the
index, modifying some fields and then adding the document with the same
id back). I do not know what causes this, but it looks like the
copyField logic completely bypasses the multivalueness check and just
adds the value in addition to whatever is already there (instead of
replacing the value). So yes, Solr renders itself into an incorrect state
then (note that the index is still correct from Lucene's
standpoint). 

-Alexander

 


On Wed, 2011-05-25 at 16:50 +0200, Alexander Golubowitsch wrote:
 Dear list,
  
 hope somebody can help me understand/avoid this.
  
 I am sending an add request with allowDuplicates=false to a Solr 1.4.1
 instance.
 This is for debugging purposes, so I am sending the exact same data that are
 already stored in Solr's index.
 I am using the PHP PECL libraries, which fail completely in giving me any
 hint on what goes wrong.
 
 Only sending the same add request again gives me a proper
 SolrClientException that hints:
  
 ERROR: [288400] multiple values encountered for non multiValued field
 field2 [fieldvalue, fieldvalue]
 
 The scenario:
 - field1 is implicitly single value, type text, indexed and stored
 - field2 is generated via a copyField directive in schema.xml, implicitly
 single value, type string, indexed and stored
 
 What appears to happen:
 - On the first add (SolrClient::addDocuments(array(SolrInputDocument
 theDocument))), regular fields like field1 get overwritten as intended
 - field2, defined with a copyField, but still single value, gets
 _appended_ instead
 - When I retrieve the updated document in a query and try to add it again,
 it won't let me because of the inconsistent multi-value state
 - The PECL library, in addition, appears to hit some internal exception
 (that it doesn't handle properly) when encountering multiple values for a
 single value field. That gives me zero results querying a set that includes
 the document via PHP, while the document can be retrieved properly, though
 in inconsistent state, any other way.
 
 But: Solr appears to be generating the corrupted state itsself via
 copyField?
 What's going wrong? I'm pretty confused...
 
 Thank you,
  Alex

Re: copyField of dates unworking?

2011-05-27 Thread Ahmet Arslan

>   <copyfield source="date" dest="text"/>

The letter f should be capital. copyfield => copyField
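
A quick way to catch this kind of slip is to scan the schema for elements
that match copyField only when case is ignored. The sketch below uses
Python's standard XML parser on an inline sample; the sample schema and the
miscased_copyfields helper are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

SCHEMA = """<schema>
  <copyField source="user" dest="text"/>
  <copyfield source="date" dest="text"/>
</schema>"""

def miscased_copyfields(schema_xml):
    """Return the attributes of elements spelled like copyField but with the
    wrong letter case (XML element names are case-sensitive)."""
    root = ET.fromstring(schema_xml)
    return [
        dict(el.attrib)
        for el in root.iter()
        if el.tag.lower() == "copyfield" and el.tag != "copyField"
    ]

bad = miscased_copyfields(SCHEMA)
print(bad)
```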

Re: copyField of dates unworking?

2011-05-27 Thread Jack Repenning


On May 27, 2011, at 1:04 AM, Ahmet Arslan wrote:

 The letter f should be capital

Hah! Well-spotted! Thanks.


Re: copyField of dates unworking?

2011-05-26 Thread anass talby

It seems like reserved keywords can't be used as field names. Did you try
changing your date field name?

On Thu, May 26, 2011 at 9:54 PM, Jack Repenning <jrepenn...@collab.net> wrote:

> Are there some sort of rules about what sort of fields can be copyFielded
> into other fields?
>
> My schema has (among other things):
>
> <field name="date" type="tdate"   indexed="true"
>        stored="true" required="true" />
> <field name="user" type="string"  indexed="true"
>        stored="true" required="true" />
> <field name="text" type="textgen" indexed="true"
>        stored="true" required="false"
>        multiValued="true" />
>  ...
>   <copyField source="user" dest="text"/>
>   <copyfield source="date" dest="text"/>
>
> The user field gets copied into text just fine, but the date field
> does not.
>
> In case they're handy, I've attached:
>  - schema.xml - the complete schema
>  - solr-usr-question.xml - a sample doc
>  - solr-usr-answer.xml - the result in the searchbase

-- 
   Anass

Re: copyField of dates unworking?

2011-05-26 Thread Jack Repenning

On May 26, 2011, at 1:55 PM, anass talby wrote:

> it seems like reserved key words  can't be used as field names did you try
> to changes your date field name?

Interesting thought, but it didn't seem to help.

I changed the schema so it has both a date and an eventDate field (so as not
to invalidate my current data), and changed the copyField statement to
source=eventDate. Then I added an eventDate field to the test document
mentioned earlier, with a one-second difference so I could be sure which was
which. I added that doc, but the text field still doesn't have either date
field.

Any other thoughts why I can't copyField a date into a textgen?

{
  "responseHeader":{
    "status":0,
    "QTime":5,
    "params":{
      "indent":"on",
      "start":"0",
      "q":"text:\"example for list question\"",
      "version":"2.2",
      "rows":"10"}},
  "response":{"numFound":1,"start":0,"docs":[
      {
        "id":"jackrepenningdev-p1-svn-solr-user-question-1",
        "item":"r10",
        "itemNumber":10,
        "user":"jackrepenning",
        "date":"2011-05-26T20:34:19Z",
        "eventDate":"2011-05-26T20:34:20Z",
        "log":"example for list question",
        "organization":"jackrepenningdev",
        "project":"p1",
        "system":"versioncontrol",
        "subsystem":"svn",
        "class":"operation",
        "className":"commit",
        "text":[
          "r10",
          "jackrepenning",
          "M /trunk/cvsdude/solr/conf/schema.xml",
          "example for list question"],
        "paths":["/trunk/cvsdude/solr/conf/schema.xml"],
        "changes":["M /trunk/cvsdude/solr/conf/schema.xml"]}]
  }}


Re: copyField

2011-05-05 Thread Ahmet Arslan

> if i define different fields with different boosts and then copy them into
> another field and make a search by using this universal field, the boosting
> will be done?

No. copyField just copies raw content.

Re: copyField at search time / multi-language support

2011-03-29 Thread lboutros

Tom,

to solve this kind of problem, if I understand it well, you could extend the
query parser to support something like meta-fields. I'm currently developing
a QueryParser Plugin to support a specific syntax. The support of
meta-fields to search on different fields (multiple languages) is one of the
functionalities that this parser will contain.

Ludovic.

2011/3/29 Markus Jelsma-2 [via Lucene] <ml-node+2747011-315348515-383...@n3.nabble.com>

> I haven't tried this as an UpdateProcessor but it relies on Tika and that
> LanguageIdentifier works well, except for short texts.
>
> > Thanks Markus.
> >
> > Do you know if this patch is good enough for production use? Thanks.
> >
> > Andy
> >
> > --- On Tue, 3/29/11, Markus Jelsma [hidden email] wrote:
> > > From: Markus Jelsma [hidden email]
> > > Subject: Re: copyField at search time / multi-language support
> > > To: [hidden email]
> > > Cc: Andy [hidden email]
> > > Date: Tuesday, March 29, 2011, 1:29 AM
> > > https://issues.apache.org/jira/browse/SOLR-1979
> > >
> > > > Tom,
> > > >
> > > > Could you share the method you use to perform language detection?
> > > > Any open source tools that do that?
> > > >
> > > > Thanks.
> > > >
> > > > --- On Mon, 3/28/11, Tom Mortimer [hidden email] wrote:
> > > > > From: Tom Mortimer [hidden email]
> > > > > Subject: copyField at search time / multi-language support
> > > > > To: [hidden email]
> > > > > Date: Monday, March 28, 2011, 4:45 AM
> > > > > Hi,
> > > > >
> > > > > Here's my problem: I'm indexing a corpus with text in a variety of
> > > > > languages. I'm planning to detect these at index time and send the
> > > > > text to one of a suitably-configured field (e.g. mytext_de for
> > > > > German, mytext_cjk for Chinese/Japanese/Korean etc.)
> > > > >
> > > > > At search time I want to search all of these fields. However, there
> > > > > will be at least 12 of them, which could lead to a very long query
> > > > > string. (Also I need to use the standard query parser rather than
> > > > > dismax, for full query syntax.)
> > > > >
> > > > > Therefore I was wondering if there was a way to copy fields at search
> > > > > time, so I can have my mytext query in a single field and have it
> > > > > copied to mytext_de, mytext_cjk etc. Something like:
> > > > >
> > > > >   <copyQueryField source="mytext" dest="mytext_de" />
> > > > >   <copyQueryField source="mytext" dest="mytext_cjk" />
> > > > >   ...
> > > > >
> > > > > If this is not currently possible, could someone give me some pointers
> > > > > for hacking Solr to support it? Should I subclass solr.SearchHandler?
> > > > > I know nothing about Solr internals at the moment...
> > > > >
> > > > > thanks,
> > > > > Tom




-
Jouve
France.
--
View this message in context: 
http://lucene.472066.n3.nabble.com/copyField-at-search-time-multi-language-support-tp2746017p2747386.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: copyField at search time / multi-language support

2011-03-29 Thread Erick Erickson

This may not be all that helpful, but have you looked at edismax?
https://issues.apache.org/jira/browse/SOLR-1553

It allows the full Solr query syntax while preserving the goodness of
dismax.

This is standard equipment on 3.1, which is being released even as we
speak, and I also know it's being used in production situations.

If going to 3.1 is not an option, I know people have applied that patch
to 1.4.1, but haven't done it myself.
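
As a rough sketch of how that could look for the multi-language case in this thread, assuming Solr 3.1 or later: a request handler in solrconfig.xml can default to edismax and list the per-language fields in qf, so the query string itself stays short. The handler name /multilang is hypothetical, and only two of the roughly twelve fields from the original question are shown.

```xml
<!-- Hypothetical handler name; field names taken from the question above -->
<requestHandler name="/multilang" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- edismax accepts the full query syntax while searching several fields -->
    <str name="defType">edismax</str>
    <!-- every per-language field the user's query should be matched against -->
    <str name="qf">mytext_de mytext_cjk</str>
  </lst>
</requestHandler>
```

With defaults like these, a request only needs to supply q, and each field in qf is analyzed with its own language-specific chain at query time.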

Best
Erick

On Mon, Mar 28, 2011 at 4:45 AM, Tom Mortimer <t...@flax.co.uk> wrote:
> Hi,
>
> Here's my problem: I'm indexing a corpus with text in a variety of
> languages. I'm planning to detect these at index time and send the
> text to one of a suitably-configured field (e.g. mytext_de for
> German, mytext_cjk for Chinese/Japanese/Korean etc.)
>
> At search time I want to search all of these fields. However, there
> will be at least 12 of them, which could lead to a very long query
> string. (Also I need to use the standard query parser rather than
> dismax, for full query syntax.)
>
> Therefore I was wondering if there was a way to copy fields at search
> time, so I can have my mytext query in a single field and have it
> copied to mytext_de, mytext_cjk etc. Something like:
>
>   <copyQueryField source="mytext" dest="mytext_de" />
>   <copyQueryField source="mytext" dest="mytext_cjk" />
>   ...
>
> If this is not currently possible, could someone give me some pointers
> for hacking Solr to support it? Should I subclass solr.SearchHandler?
> I know nothing about Solr internals at the moment...
>
> thanks,
> Tom

Re: copyField destination does not exist

2011-03-28 Thread Geert-Jan Brits

The error is saying you have a copyField directive in schema.xml that wants
to copy the value of a field to the destination field 'text', which doesn't
exist (and that is indeed the case given the fields you supplied). Search your
schema.xml for 'copyField'. There's probably something configured related to
copyField functionality that you don't want. Perhaps you uncommented the
copyField portion of schema.xml by accident?
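
A minimal sketch of the two usual ways out, assuming the offending directive copies from phrase (the actual source field is not shown in the question, so that name is a guess) and reusing the text type from the posted schema:

```xml
<!-- Option A: declare the destination field so the directive can resolve it -->
<field name="text" type="text" indexed="true" stored="false" multiValued="true"/>
<copyField source="phrase" dest="text"/>

<!-- Option B: if no catch-all field is wanted, comment the directive out -->
<!-- <copyField source="phrase" dest="text"/> -->
```

The destination is declared multiValued here because more than one source may end up being copied into it.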

hth,
Geert-Jan

2011/3/28 Merlin Morgenstern <merli...@fastmail.fm>

> Hi there,
>
> I am trying to get solr indexing mysql tables. Seems like I have
> misconfigured schema.xml:
>
> HTTP ERROR: 500
>
> Severe errors in solr configuration.
>
> -
> org.apache.solr.common.SolrException: copyField destination :'text' does
> not exist
>     at org.apache.solr.schema.IndexSchema.registerCopyField(IndexSchema.java:685)
>
>
> My config looks like this:
>
>   <fields>
>     <field name="id" type="string" indexed="true" stored="true"
>            required="true"/>
>     <field name="phrase" type="text" indexed="true" stored="true"
>            required="true"/>
>     <field name="country" type="text" indexed="true" stored="true"
>            required="true"/>
>   </fields>
>
>   <uniqueKey>id</uniqueKey>
>   <!-- field for the QueryParser to use when an explicit fieldname is
>        absent -->
>   <defaultSearchField>phrase</defaultSearchField>
>
>
> What is wrong within this config? The type should be OK.

Re: copyField at search time / multi-language support

2011-03-28 Thread Gora Mohanty

On Mon, Mar 28, 2011 at 2:15 PM, Tom Mortimer <t...@flax.co.uk> wrote:
> Hi,
>
> Here's my problem: I'm indexing a corpus with text in a variety of
> languages. I'm planning to detect these at index time and send the
> text to one of a suitably-configured field (e.g. mytext_de for
> German, mytext_cjk for Chinese/Japanese/Korean etc.)
>
>
> At search time I want to search all of these fields. However, there
> will be at least 12 of them, which could lead to a very long query
> string. (Also I need to use the standard query parser rather than
> dismax, for full query syntax.)

Sorry, unable to understand this. Are you detecting the language,
and based on that, indexing to one of mytext_de, mytext_cjk, etc.,
or does each field have mixed languages? If the former, why could
you not also detect the language at query time (or, have separate
query sources for users of different languages), and query the
appropriate field based on the known language to be searched?

> Therefore I was wondering if there was a way to copy fields at search
> time, so I can have my mytext query in a single field and have it
> copied to mytext_de, mytext_cjk etc. Something like:
>
>   <copyQueryField source="mytext" dest="mytext_de" />
>   <copyQueryField source="mytext" dest="mytext_cjk" />
>   ...
>
> If this is not currently possible, could someone give me some pointers
> for hacking Solr to support it? Should I subclass solr.SearchHandler?
> I know nothing about Solr internals at the moment...
[...]

This is not possible as far as I know, and would be quite inefficient.

Regards,
Gora
