Suggester - how to return exact match?

2013-11-20 Thread Mirko
Hi,
we implemented a Solr suggester (http://wiki.apache.org/solr/Suggester)
that uses a file based dictionary. We use the results of the suggester to
populate a dropdown field of a search field on a webpage.

Our dictionary (autosuggest.txt) contains:

foo
bar

Our suggester has the following behavior:

We can make a request with the search query "fo" and get a response with
the suggestion "foo". This is great.

However, if we make a request with the query "foo" (an exact match) we get
no suggestions. We would expect that the response returns the suggestion
"foo".

How can we configure the suggester to also return the exact match as a
suggestion?

This is the config for our search component:


spellCheck

  default
  org.apache.solr.spelling.suggest.Suggester
 autosuggest.txt

  

Thanks for help!
Mirko


Re: Suggester - how to return exact match?

2013-11-21 Thread Mirko
Hi,
I'd like to clarify our use case a bit more.

We want to return the exact search query as a suggestion only if it is
present in the index. So in my example we would expect to get the
suggestion "foo" for the query "foo" but no suggestion "abc" for the query
"abc" (because "abc" is not in the dictionary).

For me this use case seems quite common. Say, we have three products in our
store: "foo", "foo 1", "foo 2". If the user types "foo" in the product
search, we want to suggest all our products in the dropdown.

Is this something we can do with the Solr suggester?
Mirko


2013/11/20 Developer 

> Maybe there is a way to do this, but it doesn't make sense to return the
> same search query as a suggestion (the search query is not a suggestion, as
> it might or might not be present in the index).
>
> AFAIK you can use various lookup algorithms to get the suggestion list, and
> they look up the terms based on the query value (some algorithms implement
> fuzzy logic too). So searching for Foo will return FooBar and Foo2, but not
> foo.
>
> You should fetch the suggestions only if numFound is greater than 0;
> otherwise you don't have any suggestions.
>
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Suggester-how-to-return-exact-match-tp4102203p4102259.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Parse eDisMax queries for keywords

2013-11-21 Thread Mirko
Hi,
We would like to implement special handling for queries that contain
certain keywords. Our particular use case:

In the example query "Footitle season 1" we want to discover the keywords
"season" , get the subsequent number, and boost (or filter for) documents
that match "1" on field name="season".

We have two fields in our schema:

























Our idea was to use a Keyword tokenizer and a Regex on the "season" field
to extract the season number from the complete query.

However, we use an ExtendedDisMax query parser in our search handler:



edismax

title season






The problem is that eDisMax tokenizes the query, so that our field
"season" receives the tokens ["Foo", "season", "1"] without any order,
instead of the complete query.

How can we pass the complete query (untokenized) to the season field? We
don't understand which tokenizer is used here and why our "season" field
receives tokens instead of the complete query.

Or is there another approach to solve this use case with Solr?

Thanks,
Mirko


Re: Parse eDisMax queries for keywords

2013-11-25 Thread Mirko
Hi Jack,
thanks for your reply. OK, in this case I agree that "enriching" the query
in the application layer is a good idea. We are still a bit puzzled about
what the enriched query should look like. I'll post here when we have found
a solution. If somebody has suggestions, I'd be happy to hear them.

Mirko


2013/11/21 Jack Krupansky 

> The query parser does its own tokenization and parsing before your
> analyzer tokenizer and filters are called, assuring that only one white
> space-delimited token is analyzed at a time.
>
> You're probably best off having an application layer preprocessor for the
> query that "enriches" the query in the manner that you're describing.
>
> Or, simply settle for a "heuristic" approach that may give you 70% of what
> you want using only existing Solr features on the server side.
>
> -- Jack Krupansky
>
> -Original Message- From: Mirko
> Sent: Thursday, November 21, 2013 5:30 AM
> To: solr-user@lucene.apache.org
> Subject: Parse eDisMax queries for keywords
>
>
> Hi,
> We would like to implement special handling for queries that contain
> certain keywords. Our particular use case:
>
> In the example query "Footitle season 1" we want to discover the keywords
> "season" , get the subsequent number, and boost (or filter for) documents
> that match "1" on field name="season".
>
> We have two fields in our schema:
>
> 
>  multiValued="false"/>
>
> 
>
> mapping="mapping-ISOLatin1Accent.txt"/>
>
>
>
>
> 
>
>  multiValued="false"/>
>
> 
> 
> 
>
>
> 
>
> 
>
>
> Our idea was to use a Keyword tokenizer and a Regex on the "season" field
> to extract the season number from the complete query.
>
> However, we use a ExtendedDisMax query parser in our search handler:
>
> 
>
>edismax
>
>title season
>
>
>
> 
>
>
> The problem is that the eDisMax tokenizes the query, so that our field
> "season" receives the tokens ["Foo", "season", "1"] without any order,
> instead of the complete query.
>
> How can we pass the complete query (untokenized) to the season field? We
> don't understand which tokenizer is used here and why our "season" field
> received tokens instead of the complete query.
>
> Or is there another approach to solve this use case with Solr?
>
> Thanks,
> Mirko
>
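
A minimal sketch of the application-layer preprocessing Jack suggests, in Java. It assumes the enriched request keeps the free text in q and adds an eDisMax boost query (bq) on the "season" field; the class name SeasonQueryEnricher and the choice of bq over fq are illustrative assumptions, not something confirmed in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical application-layer helper; not part of Solr. */
final class SeasonQueryEnricher {
    // Matches the keyword "season" followed by a number, e.g. "season 1".
    private static final Pattern SEASON =
            Pattern.compile("\\bseason\\s+(\\d+)\\b", Pattern.CASE_INSENSITIVE);

    /** Returns the request parameters to send to the eDisMax handler. */
    static Map<String, String> enrich(String userQuery) {
        Map<String, String> params = new LinkedHashMap<>();
        Matcher m = SEASON.matcher(userQuery);
        if (m.find()) {
            // Drop the keyword phrase from the free-text part of the query ...
            String rest = (userQuery.substring(0, m.start()) + " "
                    + userQuery.substring(m.end())).trim().replaceAll("\\s+", " ");
            params.put("q", rest.isEmpty() ? "*:*" : rest);
            // ... and boost documents whose "season" field matches the number.
            params.put("bq", "season:" + m.group(1));
        } else {
            params.put("q", userQuery);
        }
        return params;
    }
}
```

For "Footitle season 1" this yields q=Footitle and bq=season:1. Sending the same clause as fq instead of bq would filter rather than boost.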


Re: Suggester - how to return exact match?

2013-11-25 Thread Mirko
Thanks! We solved this issue in the front-end now. I.e. we add the exact
match to the list of suggestions there.

Mirko
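
The front-end workaround can be sketched roughly as follows. This is only an illustration in Java (the actual front-end code is not shown in this thread), and it assumes the caller has the dictionary terms available as a set; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical helper that adds the exact match to the suggester output. */
final class ExactMatchSuggestions {
    static List<String> withExactMatch(String query,
                                       List<String> suggestions,
                                       Set<String> dictionary) {
        List<String> result = new ArrayList<>(suggestions);
        // Only add the query if it is a known term and not already suggested.
        if (dictionary.contains(query) && !result.contains(query)) {
            result.add(0, query);
        }
        return result;
    }
}
```

With the products "foo", "foo 1" and "foo 2", the query "foo" then shows all three, while "abc" adds nothing because it is not in the dictionary.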


2013/11/22 Developer 

> Might not be a perfect solution, but you can use an EdgeNGram filter, copy
> all your field data to that field, and use it for suggestions.
>
>  positionIncrementGap="100">
>   
> 
> 
>  maxGramSize="250" />
>   
>   
> 
> 
>   
> 
>
> http://localhost:8983/solr/core1/select?q=name:iphone
>
> The above query will return
> iphone
> iphone5c
> iphone4g
>


Automatically build spellcheck dictionary on replicas

2013-12-03 Thread Mirko
Hi all,
We use a Solr SpellcheckComponent with a file-based dictionary. We run a
master and some replica slave servers. To update the dictionary, we copy
the dictionary txt file to the master, from where it is automatically
replicated to all slaves. However, it seems we need to run the
"spellcheck.build" query on all servers individually.

Is there a way to automatically build the spellcheck dictionary on all
servers without calling "spellcheck.build" on all slaves individually?

We use Solr 4.0.0

Thanks,
Mirko


Re: Automatically build spellcheck dictionary on replicas

2013-12-03 Thread Mirko
Yes, I have that, but it doesn't help. It seems Solr still needs the query
with the "spellcheck.build" parameter to build the spellchecker index.


2013/12/3 Kydryavtsev Andrey 

> Did you try to add
>   true
>  parameter to your slave's spellcheck configuration?


Solr Suggester ranked by boost

2013-12-04 Thread Mirko
I want to implement a Solr Suggester (http://wiki.apache.org/solr/Suggester)
that ranks suggestions by document boost factor.

As I understand the documentation, the following config should work:

Solrconfig.xml:

...


true
7
true


suggest





default
suggesttext
org.apache.solr.spelling.suggest.Suggester
org.apache.solr.spelling.suggest.fst.WFSTLookupFactory
true


...

Schema.xml:

...

...

...

I added three documents with a document boost:

{
  "add": {
    "commitWithin": 5000,
    "overwrite": true,
    "boost": 3.0,
    "doc": {
      "id": "1",
      "suggesttext": "text bb"
    }
  },
  "add": {
    "commitWithin": 5000,
    "overwrite": true,
    "boost": 2.0,
    "doc": {
      "id": "2",
      "suggesttext": "text cc"
    }
  },
  "add": {
    "commitWithin": 5000,
    "overwrite": true,
    "boost": 1.0,
    "doc": {
      "id": "3",
      "suggesttext": "text aa"
    }
  }
}

A query to the suggest handler (with spellcheck.q=te) gives the following
response:

{
  "responseHeader":{
"status":0,
"QTime":6},
  "command":"build",
  "response":{"numFound":3,"start":0,"docs":[
  {
"id":"1",
"suggesttext":["text bb"]},
  {
"id":"2",
"suggesttext":["text cc"]},
  {
"id":"3",
"suggesttext":["text aa"]}]
  },
  "spellcheck":{
"suggestions":[
  "te",{
"numFound":3,
"startOffset":0,
"endOffset":2,
"suggestion":["text aa",
  "text bb",
  "text cc"]}]}}

The search results are ranked by boost as expected. However, the
suggestions are not ranked by boost (but alphabetically instead). I also
tried the TSTLookup and FSTLookup lookup implementations with the same
result.

Any ideas what I'm missing?

Thanks,
Mirko


Re: Automatically build spellcheck dictionary on replicas

2013-12-04 Thread Mirko
Ok, thanks for pointing that out!


2013/12/3 Kydryavtsev Andrey 

> Yep, sorry, it doesn't work for file-based dictionaries:
>
> > In particular, you still need to index the dictionary file once by
> > issuing a search with &spellcheck.build=true on the end of the URL; if
> > your system doesn't update that dictionary file, then this only needs to
> > be done once. This manual step may be required even if your configuration
> > sets build=true and reload=true.
>
> http://wiki.apache.org/solr/FileBasedSpellChecker
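
Since the build request has to reach every server, one workaround is to script it after each replication. A small sketch in Java that only assembles the request URLs; the host names, the handler path and the q=*:* placeholder are assumptions and need to be adapted to the actual setup.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that lists one build request per slave. */
final class SpellcheckBuildUrls {
    static List<String> buildUrls(List<String> slaveBaseUrls, String handlerPath) {
        List<String> urls = new ArrayList<>();
        for (String base : slaveBaseUrls) {
            // spellcheck.build=true asks the component to rebuild its dictionary.
            urls.add(base + handlerPath + "?q=*:*&spellcheck=true&spellcheck.build=true");
        }
        return urls;
    }
}
```

Each returned URL can then be fetched with any HTTP client, for example from a cron job or a deployment script that runs once the new dictionary file has been replicated.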


solr + cocoon problem

2007-01-16 Thread mirko
Hi,

I am trying to implement a cocoon based application using solr for searching.
In particular, I would like to forward the request from my response page to
solr.  I have tried several alternatives, but none of them worked for me.

One which would seem a logical way to me is to have a response page which is
forwarded to solr with cocoon's file generator.  It works fine if I perform
queries which contain only alphanumeric characters, but it gives the following
error if I try to query for a string containing non-alphanumeric characters:

http://hostname/cocoon/mywebapp/response?q=a+b

java.io.IOException: Server returned HTTP response code: 505 for URL:
http://hostname/solr/select/?q=a b


The interesting thing is that if I access http://hostname/solr/select/?q=a b
directly it works.


The relevant part of my sitemap.xmap:


  http://hostname/solr/select/?q={request-param:q}";
type="file" >
  
  


Any ideas on how to implement a cocoon layer above solr?

thanks,
mirko

ps. I realize this question might be more of a cocoon question, but I am
posting it here because I got the idea to use cocoon on top of solr from
http://wiki.apache.org/solr/XsltResponseWriter.
So, I assume some of you have already run into similar issues and/or know
the solution...


Re: solr + cocoon problem

2007-01-17 Thread mirko
Hi,

I agree, this is not a legal URL.  But the thing is that cocoon itself is
sending the unescaped URL.  That is why I thought I am not using the right
tools from cocoon.

mirko


Quoting Chris Hostetter <[EMAIL PROTECTED]>:

>
> : java.io.IOException: Server returned HTTP response code: 505 for URL:
> : http://hostname/solr/select/?q=a b
> :
> :
> : The interesting thing is that if I access http://hostname/solr/select/?q=a
> b
> : directly it works.
>
> i don't know anything about cocoon, but that is not a legal URL, URLs
> can't have spaces in them ... if you type a space into your browser, it's
> probably being nice and URL escaping it for you (that's what most browsers
> seem to do nowadays)
>
> i'm guessing Cocoon automatically un-escapes the input to your app, and you
> need to re-URL escape it before sending it to Solr.
>
>
>
>
> -Hoss
>
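
The re-escaping step Hoss describes, sketched in plain Java rather than in Cocoon sitemap syntax. The class name SolrUrlBuilder and the base URL are hypothetical; only java.net.URLEncoder from the standard library is used.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper that builds a Solr select URL from raw user input. */
final class SolrUrlBuilder {
    static String selectUrl(String baseUrl, String userQuery) {
        try {
            // URLEncoder turns a space into "+" and escapes reserved characters.
            return baseUrl + "/select/?q=" + URLEncoder.encode(userQuery, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }
}
```

Calling selectUrl("http://hostname/solr", "a b") gives http://hostname/solr/select/?q=a+b, which is a legal URL, unlike the unescaped form in the error message.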




Re: solr + cocoon problem

2007-01-17 Thread mirko
Thanks Thorsten,

that really was helpful.  Cocoon's url-encode module does solve my problem.

mirko


Quoting Thorsten Scherler <[EMAIL PROTECTED]>:

> On Wed, 2007-01-17 at 10:25 -0500, [EMAIL PROTECTED] wrote:
> > Hi,
> >
> > I agree, this is not a legal URL.  But the thing is that cocoon itself is
> > sending the unescaped URL.
>
> ...because you told it so.
>
> You use
>  src="http://hostname/solr/select/?q={request-param:q}";
> type="file" >
>
> The request param module will not escape the param by default.
>
> salu2
>




SolrSearchGenerator for Cocoon (2.1)

2007-03-27 Thread mirko
Hi,

I looked at the SolrSearchGenerator (this is the part which is of interest to
me), but I could not get it to work for Cocoon 2.1 yet.

It seems that there is no getParameters method in the
org.apache.cocoon.environment.Request interface:
http://cocoon.apache.org/2.1/apidocs/org/apache/cocoon/environment/Request.html
I guess using the getParameterNames and getParameter methods instead should
do the trick.

Or am I missing something?

mirko



Quoting Thorsten Scherler <[EMAIL PROTECTED]>:

> On Mon, 2007-03-26 at 09:30 -0400, Winona Salesky wrote:
> > Thanks Chris, I'll take another look at the forest plugin.
>
> Have a look as well at http://wiki.apache.org/solr/SolrForrest
> it points out the cocoon components.
>
> salu2
> --
> Thorsten Scherler thorsten.at.apache.org
> Open Source Java & XML consulting, training and solutions
>




Re: Filter query doesn't always work...

2007-03-27 Thread mirko
Hi,

you might want to use the sint (sortable integer) fieldtype instead.  If you use
the integer fieldtype, I guess the range queries are compared as strings
(like in [Ab TO Ch]).

You can find some documentation about it in the example schema.xml:
http://svn.apache.org/viewvc/lucene/solr/trunk/example/solr/conf/schema.xml

mirko
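
To illustrate why a string comparison would give the results reported below, here is a small Java sketch. It only demonstrates lexicographic versus numeric ordering in general; it is not Solr's actual range query code, and the class name is made up.

```java
/** Illustration only: lexicographic versus numeric range checks. */
final class RangeComparison {
    // Range check on the string form, character by character.
    static boolean inStringRange(String value, String lower, String upper) {
        return value.compareTo(lower) >= 0 && value.compareTo(upper) <= 0;
    }

    // Range check on the numeric value.
    static boolean inNumericRange(int value, int lower, int upper) {
        return value >= lower && value <= upper;
    }
}
```

As strings, "80" sorts after "100" (because '8' > '1'), so the range [80 TO 100] is empty, which would match case (c). The value 90 is inside the numeric range but outside the string range.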


Quoting escher2k <[EMAIL PROTECTED]>:

>
> I have a strange problem, and I don't seem to see any issue with the data. I
> am filtering
> on a field called reviews_positive_6_mos. The field is declared as an
> integer.
>
> If I specify -
> (a) fq=reviews_positive_6mos%3A[*+TO+*] => 36033 records are retrieved.
> (b) fq=reviews_positive_6mos%3A[*+TO+100] => 35996 records are retrieved.
> (c) fq=reviews_positive_6mos%3A[80+TO+100] => 0 records are retrieved.
> (d) fq=reviews_positive_6mos%3A[80+TO+*] => 9 records are retrieved.
> (e) fq=reviews_positive_6mos%3A[100+TO+100] => 764 records are retrieved.
>
> I am not sure what could be wrong in cases (c) and (d), especially when
> there is a lot of data where
> reviews_positive_6mos = 100. Any suggestions would be most appreciated.
>
> Thanks.




numFound for facet results

2007-04-30 Thread mirko
Hi,

could you tell me what is the (simplest|elegant|fast) way of implementing
the following:

I use faceted browsing, but I limit the number of facet counts to 5 (i.e.,
facet.limit=5).

1. I would like to be able to show if there are more facet values
(this can be achieved with the trick of asking for 6 values and only
displaying 5; if the 6th is non-empty, obviously there are more than 5 :)

2. I would like to be able to tell how many facet values there are in
total.  (This would be a value like numFound for the results.)
Is there such a thing, or a workaround like the one for 1?

thanks,
mirko
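
The trick from point 1 can be sketched like this in Java. It assumes the request was sent with facet.limit set to one more than the number of values to display; FacetPage is a hypothetical client-side class, not a SolrJ type.

```java
import java.util.List;

/** Hypothetical client-side view of one facet field. */
final class FacetPage {
    final List<String> visible; // values to render
    final boolean hasMore;      // true if Solr returned more than we display

    private FacetPage(List<String> visible, boolean hasMore) {
        this.visible = visible;
        this.hasMore = hasMore;
    }

    /** 'returned' holds the values of a request made with facet.limit = displayLimit + 1. */
    static FacetPage of(List<String> returned, int displayLimit) {
        boolean hasMore = returned.size() > displayLimit;
        List<String> visible = hasMore ? returned.subList(0, displayLimit) : returned;
        return new FacetPage(visible, hasMore);
    }
}
```

This answers only whether more values exist; it does not give the total number of facet values asked for in point 2.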


problem with schema.xml

2007-06-08 Thread mirko
Hi,

I just started playing around with Solr 1.2.  It has some nice improvements.
I noticed that errors in the schema.xml get reported in a verbose way now, but
the following steps cause a problem for me:

1. start with a correct schema.xml - Solr works fine
2. edit it in a way that is no longer correct (say, remove a closing
tag) - Solr works fine
3. restart the webapp (through the Tomcat manager interface) - Solr complains
that the schema.xml does not parse, fine.
4. now restart again (without fixing the schema.xml!) - Solr won't even start up
5. fix the above problem (add the closing tag) and restart via Tomcat's manager
- the webapp cannot restart showing that there is a problem:
FAIL - Application at context path /furness could not be started

The above steps might seem artificial, but assume you don't manage to fix
all the typos in your schema.xml on the first attempt.  It seems that after a
restart Solr gets stuck in some state, and I cannot get it up and running
through Tomcat's manager, only by restarting Tomcat.

Am I missing something?
Thanks,
mirko


Re: problem with schema.xml

2007-06-08 Thread mirko
Hi Ryan,

I have my .war file located outside the webapps folder (I am using multiple
Solr instances with a config as suggested on the wiki:
http://wiki.apache.org/solr/SolrTomcat).

Nevertheless, I touched the .war file, the config file, the directory under
webapps, but nothing seems to be working.

Any other suggestions?  Is someone else experiencing the same problem?
thanks,
mirko


Quoting Ryan McKinley <[EMAIL PROTECTED]>:

> I don't use tomcat, so I can't be particularly useful.  The behavior you
> describe does not happen with resin or jetty...
>
> My guess is that tomcat is caching the error state.  Since fixing the
> problem is outside the webapp directory, it does not think it has
> changed so it stays in a broken state.
>
> if you "touch" the .war file, does it restart ok?
>
> but i'm just guessing...
>
>


Indexing XML files

2006-12-05 Thread mirko
Hi,

I am trying to index an xml file as a field in lucene, see example below:


 
  As You Like it
  Shakespeare, William
  here goes the xml...
 


I can index the title and author fields because they are strings, but the
record field is an xml itself and I bump into some problems as I cannot
directly input an xml file using the post.sh script (solr complains).


I wonder what would be the correct (and relatively simple) way of doing it. 
Ideally, I would like to store the xml as is, and index only the content
removing the xml-tags (I believe there is HTMLStripWhitespaceAnalyzer for
that).
And output the result as an xml (so, simple escaping does not work for me).


So far, I had the idea of escaping the xml record and then unescaping it for
inner storage and using the analyzer for indexing (which would possibly
require creating a class like XMLField or such).

thanks,
mirko


Re: Indexing XML files

2006-12-05 Thread mirko
Hi,

Thanks for the quick response.  Now, I have one more question.
Is it possible to get the result for a query back in the following form
(considering the input is the escaped xml, what you mentioned before):


 
  0
  0
 

 
  
   As You Like It (Promptbook of McVicars 1860)Shakespeare, William,
   ...
  
 


Note that here the xml data is not escaped.  If yes, what do I have to do
to get such results back?  Would  need to be replaced with a type, say,
 which has a different write method?  Or will I only be able to display
escaped xml within  (and any other types).  If so, why?

thanks,
mirko


Quoting Chris Hostetter <[EMAIL PROTECTED]>:

>
> Since XML is the transport for sending data to Solr, you need to make sure
> all field values are XML escaped.
>
> If you wanted to index a plain text "title" and that title contained an
> ampersand character
>
>   Sense & Sensibility
>
> ...you would need to XML escape that as...
>
>   Sense &amp; Sensibility
>
> ...Solr internally will treat that consistently as the Java string "Sense
> & Sensibility" and when it comes time to return that string back to your
> query clients, will output it in whatever form is appropriate for your
> ResponseWriter -- if that's XML, then it will be XML escaped again, if
> it's JSON or something like it, it can probably be left alone.
>
> The same holds true for any other characters you want to include in your
> field values: Solr doesn't care that the *value* itself is an XML string,
> just that you properly escape the value in your XML message to
> Solr...
>
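>  <add>
>   <doc>
>    <field name="title">As You Like it</field>
>    <field name="author">Shakespeare, William</field>
>    <field name="record">&lt;myxml&gt;here goes the xml...&lt;/myxml&gt;</field>
>   </doc>
>  </add>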
>
> ...does that make sense?
>
> : Ideally, I would like to store the xml as is, and index only the content
> : removing the xml-tags (I believe there is HTMLStripWhitespaceAnalyzer for
> : that).
> : And output the result as an xml (so, simple escaping does not work for me).
>
> the escaping is just to send the data to Solr -- once sent, Solr will
> process the unescaped string when dealing with analyzers, etc exactly as
> you'd expect.
>
>
> -Hoss
>
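
The escaping step described above, as a minimal Java sketch. XmlEscaper is a hypothetical helper written only for illustration; it handles the three characters that matter in element content and is not a replacement for a real XML library.

```java
/** Hypothetical helper: escapes a field value for use inside an XML element. */
final class XmlEscaper {
    static String escape(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 16);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }
}
```

Escaping the XML record this way before posting it means Solr receives the original, unescaped string as the field value once the update message has been parsed.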




Re: Indexing XML files

2006-12-05 Thread mirko
You are right, it is escaped.  But my question is: (how) can I
make it unescaped?

mirko


Quoting Yonik Seeley <[EMAIL PROTECTED]>:

...
>
> I bet it is escaped, but your browser has helpfully displayed it as
> unescaped.
> Try doing CTRL-U in firefox to see the real source for the reply.
>
>
> -Yonik
>




Re: Indexing XML files

2006-12-05 Thread mirko
Hi,

the idea is to apply XSLT transformation on the result.  But it seems that
I would have to apply two transformations in a row, one which unescapes the
escaped node and a second which performs the actual transformation...

mirko


Quoting Yonik Seeley <[EMAIL PROTECTED]>:

> On 12/5/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> > You are right, it is escaped.  But my question is: (how) can I
> > make it unescaped?
>
> For what purpose?
> If you use an XML parser, the values it gives back to you will be unescaped.
>
> -Yonik
>




Re: Indexing XML files

2006-12-07 Thread mirko
Thank you all for the quick responses.  They were very helpful.

My XML is well-formed, so I ended up implementing my own FieldType:

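// Assumed imports for Solr 1.x; package names may differ in other versions.
import java.io.IOException;
import org.apache.lucene.document.Fieldable;
import org.apache.solr.request.XMLWriter;
import org.apache.solr.schema.TextField;
public class XMLField extends TextField {
  // Write the stored value as an "xml" element; the final 'false' disables escaping.
  public void write(XMLWriter xmlWriter, String name, Fieldable f) throws IOException {
    xmlWriter.writePrim("xml", name, f.stringValue(), false);
  }
}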

I looked at the XSD and there is one thing I don't understand:

If the desired way is to conform to the XSD (and hence the types used in the
XSD), then how would it be possible to use user-defined fieldtypes as plugins?
Wouldn't they violate the same principle?

thanks,
mirko


Quoting Chris Hostetter <[EMAIL PROTECTED]>:
...
> I think Walter's got the right idea ... as a general rule, we want to make
> the XmlResponseWriter "bullet proof" so that no matter what data you put
> into your index, it is guaranteed to produce a well formed XML document
> that conforms to a specified DTD, or XSD (see SOLR-17 for one we already
> have but we haven't figured out what to do with yet)
>
...

> if you're interested in writing a bit of custom java code you could in
> fact write a new FieldType (which could easily subclass TextField) with a
> custom "write" method that just outputs the raw value directly, and then
> load your field type as a plugin...
>
>   http://wiki.apache.org/solr/SolrPlugins
>
> -Hoss
>




Create field date using name file

2015-03-02 Thread Mirko Torrisi

Hi folks,

Hopefully this is an easy question, but I couldn't solve it after several
hours.


I created a new field (adding indexed="true" stored="true"/>) and I'd like to use the file name to
fill it.
The file names look like TEXT_CRE_MMGG_X-XXX-XXX.txt or
TEXT_CRE_MMGG_X-XXX.txt (where every X is a random digit).


I'd like to use a date field type to be able to use some group functions.
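
A rough sketch of deriving a Solr date value from such a file name on the client side, before indexing. It assumes the part after TEXT_CRE_ is an 8-digit year-month-day value, as in the sample name TEXT_CRE_20110608_3-114-500.txt that appears in a later thread; the class FileNameDate is hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: TEXT_CRE_20110608_3-114-500.txt -> 2011-06-08T00:00:00Z */
final class FileNameDate {
    // Assumption: the first 8 digits after "TEXT_CRE_" are year, month, day.
    private static final Pattern NAME = Pattern.compile("^TEXT_CRE_(\\d{8})_.*\\.txt$");

    static String toSolrDate(String fileName) {
        Matcher m = NAME.matcher(fileName);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unexpected file name: " + fileName);
        }
        LocalDate day = LocalDate.parse(m.group(1), DateTimeFormatter.BASIC_ISO_DATE);
        // Solr date fields expect the full ISO-8601 form in UTC.
        return day + "T00:00:00Z";
    }
}
```

The resulting string can be sent as the value of the date field when the document is added.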


Thanks in advance.
Have a nice week,

Mirko


Re: Create field date using name file

2015-03-02 Thread Mirko Torrisi
I forgot to add that the txt files are organized in directories following
this rule: //MM/**files**.


Regards,
Mirko


Invalid Date String:'1992-07-10T17'

2015-03-10 Thread Mirko Torrisi

Hi all,

I am very new to Solr (and Lucene) and I use the latest version of it.
I do not understand why I obtain this:

   Exception in thread "main"
   org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
   from server at http://localhost:8983/solr/Collection1: Invalid Date
   String:'1992-07-10T17'
       at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:558)
       at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:214)
       at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:210)
       at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:91)
       at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:302)
       at Update.main(Update.java:18)


Here is the code that causes this error:

SolrQuery query = new SolrQuery();
String a = "speechDate:1992-07-10T17:33:18Z";
query.set("fq", a);
//query.setQuery( a );  <-- I also tried using this one.



According to
https://cwiki.apache.org/confluence/display/solr/Working+with+Dates, it
should be right. I tried other dates, or just YYYY-MM-DD, with no
success.



My goal is to group these speeches (hopefully using date math syntax). I
would like to know whether you suggest using date, tdate, or something else,
because I have not understood the difference.



Thanks in advance,

Mirko


Re: Invalid Date String:'1992-07-10T17'

2015-03-11 Thread Mirko Torrisi
Thanks very much for each of your replies. They resolved my problem and
taught me something important.
I have just discovered that I have another problem, but I guess I
have to open another discussion.


Cheers,

Mirko

On 10/03/15 20:30, Chris Hostetter wrote:

":" is a syntactically significant character to the query parser, so it's
getting confused by it in the text of your query.

you're seeing the same problem as if you tried to search for "foo:bar" in
the "yak" field using q=yak:foo:bar

you either need to backslash escape the ":" characters, or wrap the date
in quotes, or use a diff parser that doesn't treat colons as special
characters (but remember that since you are building this up as a java
string, you have to deal with *java* string escaping as well)...

String a = "speechDate:1992-07-10T17\\:33\\:18Z";
String a = "speechDate:\"1992-07-10T17:33:18Z\"";
String a = "speechDate:" + ClientUtils.escapeQueryChars("1992-07-10T17:33:18Z");
String a = "{!field f=speechDate}1992-07-10T17:33:18Z";

: My goal is to group these speeches (hopefully using date math syntax). I would

Unless you are truly searching for only documents that have an *exact*
date value matching your input (down to the millisecond) then searching for
a single date value is almost certainly not what you want -- you most
likely want to do a range search...

   String a = "speechDate:[1992-07-10T00:00:00Z TO 1992-07-11T00:00:00Z]";

(which doesn't require special escaping, because the query parser is smart
enough to know that ":" aren't special inside of the "[..]")
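
A small sketch of building such a one-day range query from a date, using only java.time. The helper class DateRangeQuery is hypothetical; the query syntax is the one shown above.

```java
import java.time.LocalDate;

/** Hypothetical helper: builds a one-day range query for a Solr date field. */
final class DateRangeQuery {
    static String forDay(String field, LocalDate day) {
        // LocalDate.toString() yields yyyy-MM-dd; midnight UTC is appended by hand.
        return field + ":[" + day + "T00:00:00Z TO " + day.plusDays(1) + "T00:00:00Z]";
    }
}
```

Calling forDay("speechDate", LocalDate.of(1992, 7, 10)) produces the range string from the example. Note that square brackets are inclusive on both ends, so a document stamped exactly at midnight of the following day would also match.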

: like to know if you suggest me to use date or tdate or other because I have
: not understood the difference.

the difference between date and tdate has to do with how you want to trade
index size (on disk & in ram) with search speed for range queries like
these -- tdate takes up a little more room in the index, but can make
range queries faster.


-Hoss
http://www.lucidworks.com/




how to store _text field

2015-03-12 Thread Mirko Torrisi

Hi folks,

I googled and tried without success, so I ask you: how can I modify the
settings of a field to store it?


It is interesting to note that I did not add the _text field, so I guess it
is a default one. Maybe it is normal that it is not shown in the results,
but this is actually my real problem. It would also be fine to copy it
into a new field, but I do not know how to do that with the latest Solr (5)
and the new kind of schema. I know that I have to use curl, but I do not know
how to use it to copy a field.


Thank you in advance!
Cheers,

 Mirko


Re: how to store _text field

2015-03-13 Thread Mirko Torrisi

Hi Alexandre,

I need to visualize the content of _txt. For some reason, it is currently
not shown in the results (the "response").
I guess this is because it isn't stored (due to some default
setting that I'd like to change).


Thanks for your help,

Mirko

On 13/03/15 00:27, Alexandre Rafalovitch wrote:

Wait, step back. This is confusing. What's your real problem you are
trying to solve?

Regards,
Alex.

Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-start.com/






Re: how to store _text field

2015-03-19 Thread Mirko Torrisi

Hi Erick,

I'm sorry for the delay, but I've only just seen this reply.

I'm using the latest version of Solr, and the default setting is to use the
new kind of indexing; it doesn't use schema.xml, so I have no
idea how to set "stored" for this field.
The content is indexed, because I get results using the search
function, but it is not shown because it is not set to "stored".


I hope this is clear.
Thanks very much.

All the best,

Mirko

On 14/03/15 17:58, Erick Erickson wrote:

Right, your schema.xml file will define, perhaps, some "dynamic
fields". First ensure that stored="true" is specified. If you change
this, you have to re-index the docs.

Second, ensure that your "fl" parameter with the field is specified on
the requests, something like q=*:*&fl=eoe_txt.

Third, ensure that you are actually sending content to that field when
you index docs.

If none of this helps, show us the definition from schema.xml and a
sample input document and a query that illustrate the problem please.

Best,
Erick

On Fri, Mar 13, 2015 at 1:20 AM, Mirko Torrisi
 wrote:

Hi Alexandre,

I need to visualize the content of _txt. For some reasons, actual it is not
showed in the results (the "response").
I guess that it doesn't happen because it isn't stored (for some default
setting that I'd like to change).

Thanks for your help,

Mirko


On 13/03/15 00:27, Alexandre Rafalovitch wrote:

Wait, step back. This is confusing. What's your real problem you are
trying to solve?

Regards,
 Alex.

Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-start.com/


On 12 March 2015 at 19:50, Mirko Torrisi 
wrote:

Hi folks,

I googled and tried without success, so I ask you: how can I modify the
settings of a field to store it?

Note that I did not add the _text field myself, so I guess it is a
default one. Maybe it is normal that it is not shown in the results, but
this is my real problem. It would also be fine to copy it into a new
field, but I do not know how to do that with the latest Solr (5) and the
new kind of schema. I know that I have to use curl, but I do not know how
to use it to copy a field.

Thank you in advance!
Cheers,

   Mirko






Addition to Solr wiki editor list

2015-04-19 Thread Mirko Cegledi
Hi there!

I'd like to be added to the list of people who are able to edit the Solr
wiki at https://wiki.apache.org/solr. I work as a Java developer for a
German company that uses Solr (and I like it a lot), and I would like to
be able to correct things as soon as I find them without going to the
IRC channel to get things changed.

My wiki name should be campfire.

Thanks in advance


Re: how to store _text field

2015-04-28 Thread Mirko Torrisi
Hi guys,

I used Erick's suggestions (thanks again!!) to create a new field and
copy the _text content into it.

curl -X POST -H 'Content-type:application/json' --data-binary '{
  "add-field": { "name": "content", "type": "string", "indexed": true, "stored": true },
  "add-copy-field": { "source": "_text", "dest": ["content"] }
}' http://localhost:8983/solr/Test/schema

That seems to be a good way, but I discovered unwanted metadata (a kind
of "bias") at the start of every content field. Each one starts with a
string like this:

 \n \n stream_content_type text/plain  \n stream_size 1556  \n
Content-Encoding UTF-8  \n X-Parsed-By
org.apache.tika.parser.DefaultParser  \n X-Parsed-By
org.apache.tika.parser.txt.TXTParser  \n Content-Type text/plain;
charset=UTF-8  \n resourceName /home/mirko/Desktop/data
sample/sample1/TEXT_CRE_20110608_3-114-500.txt

Now I need to cut off this part, but I have no idea how, especially
because the path (in the last part) has a variable length.
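
One possible client-side approach is sketched below. It assumes the
metadata block always ends with the "resourceName <path>" line and that
the real text starts on the following line; that layout is an assumption
based on the sample above, and the function name strip_tika_metadata is
hypothetical. Because it cuts at the end of that line, the length of the
path does not matter.

```python
import re

# Assumption: everything up to and including the "resourceName ..." line is
# extraction metadata, and the document text begins on the next line.
_METADATA_PREFIX = re.compile(
    r"\A.*?^\s*resourceName\s[^\n]*\n?",
    re.DOTALL | re.MULTILINE,
)


def strip_tika_metadata(content):
    """Return content without the leading metadata block, if one is present."""
    match = _METADATA_PREFIX.match(content)
    if match is None:
        # No resourceName line found: leave the text untouched.
        return content
    return content[match.end():].lstrip()


# Sample modelled on the prefix shown above; the trailing text is made up.
sample = (
    " \n \n stream_content_type text/plain  \n stream_size 1556  \n"
    " Content-Encoding UTF-8  \n"
    " X-Parsed-By org.apache.tika.parser.DefaultParser  \n"
    " X-Parsed-By org.apache.tika.parser.txt.TXTParser  \n"
    " Content-Type text/plain; charset=UTF-8  \n"
    " resourceName /home/mirko/Desktop/data sample/sample1/"
    "TEXT_CRE_20110608_3-114-500.txt  \n  \n Actual text of the document."
)
print(strip_tika_metadata(sample))
```

The same rule (drop everything up to the end of the resourceName line)
could be applied in SolrJ before re-exporting each document.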

For some people it could be a problem to have two fields with the same
content (double the space needed). I do not have this problem because I
use SolrJ to import, modify and export each document. Maybe I could also
use it for this, but hopefully you know a cleaner method.

Cheers,
Mirko



On 19 March 2015 at 20:11, Erick Erickson  wrote:

> Hmm, not all that sure. That's one thing about schemaless indexing, it
> has to guess. It does the best it can, but it's quite possible that it
> guesses wrong.
>
> If this is a "managed schema", you can use the REST API commands to
> make whatever field you want. Or you can start over with a concrete
> schema.xml and use _that_. Otherwise, I'm not sure what to say without
> actually being on your system.
>
> Wish I could help more.
> Erick


Near-Realtime-Search, CommitWithin and AtomicUpdates

2020-06-16 Thread Mirko Sertic
Hi all,
 
I'm using Solr 6.6 and trying to validate my setup for AtomicUpdates and
Near-Realtime-Search.
 
Some questions are bugging me, so maybe someone can give me a hint
to make things clearer.
 
I am posting regular updates to a collection using the UpdateHandler and
Solr Command Syntax, including updates and deletes. These changes are
committed using the commitWithin configuration every 30 seconds.
 
Now I want to use AtomicUpdates on multivalued fields, so I post the
"add" commands for these fields only. Sometimes I have to post multiple
Solr commands affecting the same document within the same commitWithin
interval. The question is: what is the final value of the field after
the atomic update "add" operations? From my point of view the final
value should be the old value plus the newly added values, which is
committed to the index in the next commitWithin period.
So can I combine multiple AtomicUpdate commands affecting the same
document within the same commitWithin interval?
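
To make the expectation concrete, here is a small sketch. It builds two
atomic "add" commands for the same document and models the outcome
described above (old value plus all added values). The document id, the
field name "tags" and the helper names are hypothetical, and
apply_atomic_adds is only a model of the expected result, not of how Solr
implements it.

```python
import json

# commitWithin is given in milliseconds; 30000 matches the 30 second interval.
UPDATE_PATH = "/update?commitWithin=30000"


def atomic_add(doc_id, field, values):
    """Build one atomic-update document that appends values to a multivalued field."""
    return {"id": doc_id, field: {"add": list(values)}}


def apply_atomic_adds(stored_doc, updates):
    """Model the expected result: every "add" appends to the latest value."""
    result = {
        key: list(value) if isinstance(value, list) else value
        for key, value in stored_doc.items()
    }
    for update in updates:
        for field, operation in update.items():
            if field == "id":
                continue
            result.setdefault(field, []).extend(operation["add"])
    return result


# Two commands for the same document, posted within one commitWithin window.
updates = [
    atomic_add("doc1", "tags", ["b"]),
    atomic_add("doc1", "tags", ["c", "d"]),
]
payload = json.dumps(updates)  # body that would be posted to UPDATE_PATH
expected = apply_atomic_adds({"id": "doc1", "tags": ["a"]}, updates)
print(payload)
print(expected)
```

Comparing the document returned after the next commit with the modelled
result is one way to check whether the two commands were combined.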
 
Another thing that is bugging me: can I combine multiple AtomicUpdates
for the same document with CopyFields? Does Solr use some kind of
dirty read or pending uncommitted changes to get the right value of the
source field, or is the source always the last committed value?
 
So in summary, does Solr AtomicUpdates use some kind of dirty-read
mechanism to do its "magic"?
 
Thanks in advance,
Mirko