Re: Question by solr queries optimization

2020-12-23 Thread Adrien Grand
Hi Alex,

Indeed, Solr would automatically rewrite this query to `id:%key^3` since
versions 7.1 / 8.0.

This happens via BooleanQuery#rewrite; you can check out the JIRA issue where
this was implemented: https://issues.apache.org/jira/browse/LUCENE-7925.
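
As a rough illustration of that rewrite (a minimal sketch against a recent
Lucene, not code from Solr; the field and term names are illustrative), three
equal SHOULD clauses collapse into a single boosted clause:

import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.ByteBuffersDirectory;

public class DedupRewriteDemo {
    public static void main(String[] args) throws Exception {
        ByteBuffersDirectory dir = new ByteBuffersDirectory();
        try (IndexWriter w = new IndexWriter(dir, new IndexWriterConfig())) {
            w.addDocument(new Document()); // an empty index is enough to exercise rewrite
        }
        try (IndexReader reader = DirectoryReader.open(dir)) {
            Query term = new TermQuery(new Term("id", "key"));
            BooleanQuery bq = new BooleanQuery.Builder()
                .add(term, BooleanClause.Occur.SHOULD)
                .add(term, BooleanClause.Occur.SHOULD)
                .add(term, BooleanClause.Occur.SHOULD)
                .build();
            Query rewritten = new IndexSearcher(reader).rewrite(bq);
            // Expected to print something like (id:key)^3.0; the duplicate
            // clauses are collapsed into one clause with a summed boost.
            System.out.println(rewritten);
        }
    }
}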

On Wed, Dec 23, 2020 at 3:13 PM Alex Bulygin 
wrote:

> Good day to all! Perhaps a stupid question, as I'm not very experienced with
> Solr. If I send a request like id:(%key or %key or %key) to Solr and the keys
> are equal, will there be any optimization of such a request? Or could you
> point me to the code where such an optimization takes place? Hope for help.
>
> --
> Bulygin Alex
>


-- 
Adrien


Question by solr queries optimization

2020-12-23 Thread Alex Bulygin

Good day to all! Perhaps a stupid question, as I'm not very experienced with
Solr. If I send a request like id:(%key or %key or %key) to Solr and the keys
are equal, will there be any optimization of such a request? Or could you
point me to the code where such an optimization takes place? Hope for help.
--
Bulygin Alex

Re: Question on Solr

2017-02-15 Thread Susheel Kumar
Hello Prathib,

This is how I would go: index these XMLs as flat records/plain data in Solr,
and then search those records at query time. Converting the XMLs to plain data
in the form of key/value pairs would be done at ingestion time, and if you
have to present the results in XML format at query time, you can apply an XML
transformation again.

Basically, searching XML snippets is more or less a text search, which is what
Solr is about. You can also utilise nested documents in Solr to fit your need.
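
As a minimal sketch of that ingestion-time flattening (illustrative only; the
tag and field names are assumptions, since the markup in the original examples
did not survive the archive), each leaf element of a snippet becomes a
key/value pair that could be indexed as a flat Solr document:

import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XmlFlattener {
    public static Map<String, String> flatten(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        Map<String, String> fields = new LinkedHashMap<>();
        NodeList children = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node n = children.item(i);
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                // each leaf element becomes one field on the flat document
                fields.put(((Element) n).getTagName(), n.getTextContent().trim());
            }
        }
        return fields;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<employee><name>Prathib</name><skill>Java</skill>"
                   + "<city>san jose</city><state>CA</state></employee>";
        // Prints {name=Prathib, skill=Java, city=san jose, state=CA}; each
        // pair would be added as a field at ingestion time.
        System.out.println(flatten(xml));
    }
}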

Thanks,
Susheel

On Tue, Feb 14, 2017 at 7:39 PM, Prathib Kumar  wrote:

> Hi,
>
> We are evaluating Solr to see if it can help us search for XML snippets
> within a whole XML document.
>
> For Ex:
> Document-1:
>
> 
>Prathib
>Java
>san jose
> CA
> 
>
> Document-2:
> 
>Joe
>C++
>chennai
> TN
> 
>
> Document-3:
> 
>Ramu
>Python
>LosAngeles
> CA
> 
>
>
> My search string is another XML doc, which could look like:
>
> Query-1:
> 
>  san jose
> 
>
> Query-2:
> 
>CA
> 
>
> I have broken this down for simplicity; in reality our XMLs are nested and
> have many attributes on each tag.
>
> To continue the evaluation of Solr, can you please suggest where I could
> start the analysis?
>
> Note: currently our XML documents don't adhere to any schema, but we could
> create a schema if required.
>
>
>
> Regards
> Prathib Kumar.
>
>


Question on Solr

2017-02-14 Thread Prathib Kumar
Hi,

We are evaluating Solr to see if it can help us search for XML snippets
within a whole XML document.

For Ex:
Document-1:


   Prathib
   Java
   san jose
CA


Document-2:

   Joe
   C++
   chennai
TN


Document-3:

   Ramu
   Python
   LosAngeles
CA



My search string is another XML doc, which could look like:

Query-1:

 san jose


Query-2:

   CA


I have broken this down for simplicity; in reality our XMLs are nested and
have many attributes on each tag.

To continue the evaluation of Solr, can you please suggest where I could
start the analysis?

Note: currently our XML documents don't adhere to any schema, but we could
create a schema if required.



Regards
Prathib Kumar.


Question regarding Solr 4.7 solr joining across multiple cores and sorting

2014-12-10 Thread Parnit Pooni
Hi,
I'm running into an issue attempting to sort, here is the scenario.

I have my mainIndex which looks something like this.

 id description   name
  1   description1 name1
  2   description2 name2

I also have a subIndex which looks something like this

id   metric
1  4
2  5

What I am trying to do is join the two indexes on the id column and have my
results sorted by a column from the subIndex, with a query like the
following.

testServer/solr/MainIndex/select?defType=edismax&q=*&fq={!join from=id
to=id fromIndex=subIndex}id:*&sort=metric desc

desired result

 2   description2 name2
 1   description1 name1

I'm aware that you lose all information from the subIndex the moment the
parser sees . What are my options should I want to join two indexes and
sort on a column not present in the main index?
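
For reference, here is the same request expressed through SolrJ (a hedged
sketch only, written against a newer SolrJ API than Solr 4.7 ships with; the
host and core names are taken from the URL above):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class JoinQueryExample {
    public static void main(String[] args) throws Exception {
        try (HttpSolrClient client =
                 new HttpSolrClient.Builder("http://testServer/solr/MainIndex").build()) {
            SolrQuery q = new SolrQuery("*:*");
            q.set("defType", "edismax");
            q.addFilterQuery("{!join from=id to=id fromIndex=subIndex}id:*");
            // This sort only works if "metric" exists in MainIndex; fields from
            // subIndex are not carried over by the join, which is exactly the
            // limitation being asked about.
            q.setSort("metric", SolrQuery.ORDER.desc);
            QueryResponse rsp = client.query(q);
            System.out.println(rsp.getResults());
        }
    }
}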

Thanks in advance.
Parnit


Question about SOLR soft commit

2012-11-26 Thread Illu
 

Hello there,

 I'm confused about soft commit. There is very little explanation about
this on the wiki; I hope to learn some more details.



 Thanks in advance.

 

Best Regards,

Illu

RE: Question about solr config files encoding.

2012-07-05 Thread Uwe Schindler
Config files are XML and I changed them to be handled by the XML parser
(InputStreams), so the XML parser reads the encoding from the header.

But JSON is defined to be UTF-8, so we must supply the encoding 
(IOUtils.UTF8_CHARSET).
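
A minimal sketch of the kind of change this implies (illustrative only, not
the actual committed patch; it assumes the noggit JSONParser(Reader)
constructor shown in the quoted code below):

import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import org.apache.noggit.JSONParser;

class OpenExchangeRatesSketch {
    private final JSONParser parser;

    OpenExchangeRatesSketch(InputStream ratesStream) {
        // Decode explicitly as UTF-8 instead of the platform default, so a JVM
        // running with file.encoding=UTF-16 no longer breaks JSON parsing.
        parser = new JSONParser(new InputStreamReader(ratesStream, StandardCharsets.UTF_8));
    }
}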

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


 -Original Message-
 From: Dawid Weiss [mailto:dawid.we...@gmail.com]
 Sent: Thursday, July 05, 2012 5:00 PM
 To: dev@lucene.apache.org
 Subject: Question about solr config files encoding.
 
 Guys should the encoding of config files really be platform-dependent?
 Currently Solr tests fail massively on setup because of things like
 this:
 
 public OpenExchangeRates(InputStream ratesStream) throws IOException {
   parser = new JSONParser(new InputStreamReader(ratesStream));
 
 this reader, when confronted with UTF-16 as file.encoding results in funky
 exceptions like:
 
 Caused by: org.apache.noggit.JSONParser$ParseException: JSON Parse
 Error: char=笊,position=0 BEFORE='笊'
 AFTER='†≤楳捬慩浥爢㨠≔桩猠摡瑡⁩猠捯汬散瑥搠晲潭⁶慲楯畳⁰牯癩摥牳⁡
 湤⁰牯癩摥搠晲'
  at org.apache.noggit.JSONParser.err(JSONParser.java:221)
  at org.apache.noggit.JSONParser.next(JSONParser.java:620)
  at org.apache.noggit.JSONParser.nextEvent(JSONParser.java:661)
  at
 org.apache.solr.schema.OpenExchangeRatesOrgProvider$OpenExchangeRates.
 init(OpenExchangeRatesOrgProvider.java:189)
  at
 org.apache.solr.schema.OpenExchangeRatesOrgProvider.reload(OpenExchang
 eRatesOrgProvider.java:129)
 
 Can we fix the encoding of these input files to UTF-8 or something?
 According to JSON RFC:
 
 http://tools.ietf.org/html/rfc4627#section-3
 
 JSON text SHALL be encoded in Unicode.  The default encoding is
UTF-8.
 
Since the first two characters of a JSON text will always be ASCII
characters [RFC0020], it is possible to determine whether an octet
stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking
at the pattern of nulls in the first four octets.
 
00 00 00 xx  UTF-32BE
00 xx 00 xx  UTF-16BE
xx 00 00 00  UTF-32LE
xx 00 xx 00  UTF-16LE
xx xx xx xx  UTF-8
 
 We could just enforce/require UTF-8? Alternatively, auto-detect this from a
 binary stream as a custom Reader class.
 
 Dawid
 
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
 commands, e-mail: dev-h...@lucene.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Question about solr config files encoding.

2012-07-05 Thread Dawid Weiss
 But JSON is defined to be UTF-8, so we must supply the encoding 
 (IOUtils.UTF8_CHARSET).

That RFC says it can be any Unicode encoding... that said, I agree with you
that we can probably assume it's UTF-8 and not worry about anything else.

Dawid

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



RE: Question about solr config files encoding.

2012-07-05 Thread Uwe Schindler
3.  Encoding

   JSON text SHALL be encoded in Unicode.  The default encoding is
   UTF-8.

   Since the first two characters of a JSON text will always be ASCII
   characters [RFC0020], it is possible to determine whether an octet
   stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking
   at the pattern of nulls in the first four octets.

   00 00 00 xx  UTF-32BE
   00 xx 00 xx  UTF-16BE
   xx 00 00 00  UTF-32LE
   xx 00 xx 00  UTF-16LE
   xx xx xx xx  UTF-8

:-)

I think we can safely assume it is UTF-8; otherwise we must do the same as XML
parsers do, with mark() on a BufferedInputStream. Most libraries out there can
only read UTF-8, and Solr itself produces only UTF-8 JSON, right? Those tests
only check the response from Solr.
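
A hedged sketch of the charset sniffing quoted from the RFC above (the
mark()/reset() alternative; this is illustrative and not code from Solr):

import java.io.BufferedInputStream;
import java.io.IOException;
import java.nio.charset.Charset;

final class JsonCharsetSniffer {
    // Inspect the pattern of zero bytes in the first four octets of a JSON
    // stream, per RFC 4627 section 3, then reset the stream for the parser.
    static Charset sniff(BufferedInputStream in) throws IOException {
        in.mark(4);
        byte[] b = new byte[4];
        int n = in.read(b);
        in.reset();
        if (n < 4) return Charset.forName("UTF-8");
        if (b[0] == 0 && b[1] == 0 && b[2] == 0) return Charset.forName("UTF-32BE");
        if (b[0] == 0 && b[2] == 0) return Charset.forName("UTF-16BE");
        if (b[1] == 0 && b[2] == 0 && b[3] == 0) return Charset.forName("UTF-32LE");
        if (b[1] == 0 && b[3] == 0) return Charset.forName("UTF-16LE");
        return Charset.forName("UTF-8");
    }
}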

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


 -Original Message-
 From: dawid.we...@gmail.com [mailto:dawid.we...@gmail.com] On Behalf Of
 Dawid Weiss
 Sent: Thursday, July 05, 2012 5:35 PM
 To: dev@lucene.apache.org
 Subject: Re: Question about solr config files encoding.
 
  But JSON is defined to be UTF-8, so we must supply the encoding
 (IOUtils.UTF8_CHARSET).
 
 That RFC says it can be any unicode... this said I agree with you that we can
 probably assume it's UTF-8 and not worry about anything else.
 
 Dawid
 
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
 commands, e-mail: dev-h...@lucene.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Question about solr config files encoding.

2012-07-05 Thread Yonik Seeley
On Thu, Jul 5, 2012 at 10:59 AM, Dawid Weiss dawid.we...@gmail.com wrote:
 According to JSON RFC:

 http://tools.ietf.org/html/rfc4627#section-3

 JSON text SHALL be encoded in Unicode.

One of my little pet peeves with the RFC - I think this was a bad
requirement.  JSON should have been text, and then there should have
been an optional way to detect the encoding if other mechanisms don't
cover it (like HTTP headers, etc).  This effectively means that
something like
[hi] is not valid JSON for many of you reading this email (if your
email client is internally representing it as something other than a
Unicode encoding, for example).


 We could just enforce/require UTF-8?

Yes, Solr has normally always required/assumed UTF-8 for config files.
 It's simply an oversight in any places that don't.

-Yonik
http://lucidimagination.com

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



RE: Question about solr config files encoding.

2012-07-05 Thread Uwe Schindler
Just to add:

Solr's XML files are parsed according to the XML spec, so you can choose any
charset, you only have to declare it according to the XML spec! Also, XML POSTs to
the update handler can be any encoding (it does not need to be declared in the
header anymore, the <?xml ... ?> header is fine). There is already a test! I fixed
all this in endless sessions, but I was happy to do it, as my favourite data
format is: XML :-) [I refuse to fix this for DIH, but that's another story,
SOLR-2347].

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


 -Original Message-
 From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik
 Seeley
 Sent: Thursday, July 05, 2012 5:43 PM
 To: dev@lucene.apache.org
 Subject: Re: Question about solr config files encoding.
 
 On Thu, Jul 5, 2012 at 10:59 AM, Dawid Weiss dawid.we...@gmail.com
 wrote:
  According to JSON RFC:
 
  http://tools.ietf.org/html/rfc4627#section-3
 
  JSON text SHALL be encoded in Unicode.
 
 One of my little pet peeves with the RFC - I think this was a bad
requirement.
 JSON should have been text, and then their should have been an optional
way
 to detect encoding if other mechanisms don't cover it (like HTTP headers,
etc).
 This effectively means that something like [hi] is not valid JSON for
many of
 you reading this email (if your email client is internally representing it
as
 something other than unicode encoded for example).
 
 
  We could just enforce/require UTF-8?
 
 Yes, Solr has normally always required/assumed UTF-8 for config files.
  It's simply an oversight in any places that don't.
 
 -Yonik
 http://lucidimagination.com
 
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
 commands, e-mail: dev-h...@lucene.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



RE: Question about solr config files encoding.

2012-07-05 Thread Uwe Schindler
 updatehandler can be any encoding (it does not need to be declared in
header

...HTTP header..., sorry

  -Original Message-
  From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik
  Seeley
  Sent: Thursday, July 05, 2012 5:43 PM
  To: dev@lucene.apache.org
  Subject: Re: Question about solr config files encoding.
 
  On Thu, Jul 5, 2012 at 10:59 AM, Dawid Weiss dawid.we...@gmail.com
  wrote:
   According to JSON RFC:
  
   http://tools.ietf.org/html/rfc4627#section-3
  
   JSON text SHALL be encoded in Unicode.
 
  One of my little pet peeves with the RFC - I think this was a bad
 requirement.
  JSON should have been text, and then their should have been an
  optional
 way
  to detect encoding if other mechanisms don't cover it (like HTTP
  headers,
 etc).
  This effectively means that something like [hi] is not valid JSON
  for
 many of
  you reading this email (if your email client is internally
  representing it
 as
  something other than unicode encoded for example).
 
 
   We could just enforce/require UTF-8?
 
  Yes, Solr has normally always required/assumed UTF-8 for config files.
   It's simply an oversight in any places that don't.
 
  -Yonik
  http://lucidimagination.com
 
  -
  To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For
  additional commands, e-mail: dev-h...@lucene.apache.org
 
 
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
 commands, e-mail: dev-h...@lucene.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Question about solr config files encoding.

2012-07-05 Thread Dawid Weiss
Sure, I don't have a problem with XML. I'll assume UTF-8 for JSON and
go through the issues later today.

Dawid

On Thu, Jul 5, 2012 at 5:47 PM, Uwe Schindler u...@thetaphi.de wrote:
 I just add:

 Solr's XML files are parsed according to XML spec, so you can choose any
 charset, you only have to define it according to XML spec! Also XML POST to
 updatehandler can be any encoding (it does not need to be declared in header
 anymore, the ?xml... header is fine). There is already a test! I Fixed all
 this in endless sessions, but I was happy to do it, as my favourite data
 format is: XML :-) [I refuse to fix this for DIH, but that's another story,
 SOLR-2347].

 Uwe

 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de


 -Original Message-
 From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik
 Seeley
 Sent: Thursday, July 05, 2012 5:43 PM
 To: dev@lucene.apache.org
 Subject: Re: Question about solr config files encoding.

 On Thu, Jul 5, 2012 at 10:59 AM, Dawid Weiss dawid.we...@gmail.com
 wrote:
  According to JSON RFC:
 
  http://tools.ietf.org/html/rfc4627#section-3
 
  JSON text SHALL be encoded in Unicode.

 One of my little pet peeves with the RFC - I think this was a bad
 requirement.
 JSON should have been text, and then their should have been an optional
 way
 to detect encoding if other mechanisms don't cover it (like HTTP headers,
 etc).
 This effectively means that something like [hi] is not valid JSON for
 many of
 you reading this email (if your email client is internally representing it
 as
 something other than unicode encoded for example).


  We could just enforce/require UTF-8?

 Yes, Solr has normally always required/assumed UTF-8 for config files.
  It's simply an oversight in any places that don't.

 -Yonik
 http://lucidimagination.com

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
 commands, e-mail: dev-h...@lucene.apache.org


 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



have a question on solr query

2011-04-15 Thread Ramamurthy, Premila
I have a field DestinationId and it can take values such as '123 123' or '456'.
I need the results to contain only the rows that do not have a space in the value.


I need the row which has '456' alone to be returned.

Can you help?

Thanks
Premila




Re: have a question on solr query

2011-04-15 Thread Erick Erickson
Your problem statement is kinda sparse on details. Have you looked at
the KeywordAnalyzer?

If you don't see that as relevant, can you provide some more examples
of the kinds of data you expect to put in the field, and of queries that
should and should not match?
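
A minimal sketch of what the KeywordAnalyzer hint buys you (written against a
recent Lucene rather than the version in use here, with illustrative setup):
the whole value stays a single term, so a regexp query that forbids spaces
matches '456' but not '123 123'.

import org.apache.lucene.analysis.core.KeywordAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.RegexpQuery;
import org.apache.lucene.store.ByteBuffersDirectory;

public class NoSpaceValueDemo {
    public static void main(String[] args) throws Exception {
        ByteBuffersDirectory dir = new ByteBuffersDirectory();
        // KeywordAnalyzer keeps each value as one single token, spaces included.
        try (IndexWriter w = new IndexWriter(dir, new IndexWriterConfig(new KeywordAnalyzer()))) {
            for (String value : new String[] {"123 123", "456"}) {
                Document d = new Document();
                d.add(new TextField("DestinationId", value, Field.Store.YES));
                w.addDocument(d);
            }
        }
        try (DirectoryReader reader = DirectoryReader.open(dir)) {
            IndexSearcher searcher = new IndexSearcher(reader);
            // [^ ]+ : the whole term must contain no space, so only "456" matches.
            RegexpQuery q = new RegexpQuery(new Term("DestinationId", "[^ ]+"));
            System.out.println(searcher.count(q)); // expected: 1
        }
    }
}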

Best
Erick

On Tue, Apr 12, 2011 at 11:24 AM, Ramamurthy, Premila 
premila.ramamur...@travelocity.com wrote:

  I have a field DestinationId and it can take values ‘123 123’ or ‘456’

 I need the results to contain only the rows that do not have a space in the value.





 I need the row which has ‘456’ alone to be returned.



 Can you help.



 Thanks

 Premila







Doc Question for Solr Cell

2009-08-10 Thread Eric Pugh
I was refreshing my mind on the newly updated parameters on Solr Cell,  
and noticed that the Configuration section on http://wiki.apache.org/solr/ExtractingRequestHandler 
 is out of date.  Before I fixed it, I wanted to confirm that


<requestHandler name="/update/extract"
    class="org.apache.solr.handler.extraction.ExtractingRequestHandler">

  <lst name="defaults">
    <str name="ext.map.Last-Modified">last_modified</str>
    <bool name="ext.ignore.und.fl">true</bool>
  </lst>

Should be changed to map.Last-Modified only, and that the  
ignore.und.fl capability is now implemented via uprefix:


uprefix=<prefix> - Prefix all fields that are not defined in the
schema with the given prefix. This is very useful when combined with
dynamic field definitions. Example: uprefix=ignored_ would effectively
ignore all unknown fields generated by Tika, given the example schema
contains <dynamicField name="ignored_*" type="ignored"/>
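
For context, a hedged SolrJ sketch of sending a document through
/update/extract with that uprefix behaviour (the file name, core URL, and the
fmap. spelling used by later Solr releases are assumptions, not taken from
this mail):

import java.io.File;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;

public class ExtractExample {
    public static void main(String[] args) throws Exception {
        try (HttpSolrClient solr =
                 new HttpSolrClient.Builder("http://localhost:8983/solr/collection1").build()) {
            ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/update/extract");
            req.addFile(new File("docs/report.pdf"), "application/pdf");
            req.setParam("fmap.Last-Modified", "last_modified"); // field mapping, replaces ext.map.*
            req.setParam("uprefix", "ignored_");                 // replaces ext.ignore.und.fl
            req.setParam("literal.id", "doc1");
            req.setParam("commit", "true");
            solr.request(req);
        }
    }
}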


Eric



-
Eric Pugh | Principal | OpenSource Connections, LLC | 434.466.1467 | 
http://www.opensourceconnections.com
Free/Busy: http://tinyurl.com/eric-cal






Re: Doc Question for Solr Cell

2009-08-10 Thread Grant Ingersoll


On Aug 10, 2009, at 5:28 AM, Eric Pugh wrote:

I was refreshing my mind on the newly updated parameters on Solr  
Cell, and noticed that the Configuration section on http://wiki.apache.org/solr/ExtractingRequestHandler 
 is out of date.  Before I fixed it, I wanted to confirm that


<requestHandler name="/update/extract"
    class="org.apache.solr.handler.extraction.ExtractingRequestHandler">

  <lst name="defaults">
    <str name="ext.map.Last-Modified">last_modified</str>
    <bool name="ext.ignore.und.fl">true</bool>
  </lst>

Should be changed to map.Last-Modified only, and that the  
ignore.und.fl capability is now implemented via uprefix:


uprefix=<prefix> - Prefix all fields that are not defined in the
schema with the given prefix. This is very useful when combined with
dynamic field definitions. Example: uprefix=ignored_ would effectively
ignore all unknown fields generated by Tika, given the example schema
contains <dynamicField name="ignored_*" type="ignored"/>


That is my understanding, yes.


Re: Doc Question for Solr Cell

2009-08-10 Thread Yonik Seeley
On Mon, Aug 10, 2009 at 5:28 AM, Eric
Pughep...@opensourceconnections.com wrote:
 I was refreshing my mind on the newly updated parameters on Solr Cell, and
 noticed that the Configuration section on
 http://wiki.apache.org/solr/ExtractingRequestHandler is out of date.  Before
 I fixed it, I wanted to confirm that

 <requestHandler name="/update/extract"
     class="org.apache.solr.handler.extraction.ExtractingRequestHandler">
   <lst name="defaults">
     <str name="ext.map.Last-Modified">last_modified</str>
     <bool name="ext.ignore.und.fl">true</bool>
   </lst>

 Should be changed to map.Last-Modified only, and that the ignore.und.fl
 capability is now implemented via uprefix:

Yep.
Before 1.4 is released I wanted to add good default mappings for
common document types, along with the fields in the example schema,
and then just cut-n-paste the config from the example schema.  It would
be great if you had any recommendations for such default mappings.

-Yonik
http://www.lucidimagination.com