Re: Request header is too large.

2012-07-28 Thread Alexandre Rafalovitch
Where is the error message? On the database side?

If it is repeatable, I would just put the two on separate machines and
capture the HTTP conversation with Wireshark. The problem might then become
apparent from visual inspection.

Regards,
Alex
On Jul 28, 2012 1:24 PM, "Xue-Feng Yang"  wrote:

> Hi all,
>
> When running DIH to index data from a database, I run into the following error.
>
> Does anyone know what the problem is?
>
> Thanks,
>
> Xufeng
>
> ///
>
> SEVERE: GRIZZLY0040: Request header is too large.
> java.nio.BufferOverflowException
> at com.sun.grizzly.tcp.http11.InternalInputBuffer.fill(InternalInputBuffer.java:765)
> at com.sun.grizzly.tcp.http11.InternalInputBuffer.parseHeader(InternalInputBuffer.java:669)
> at com.sun.grizzly.tcp.http11.InternalInputBuffer.parseHeaders(InternalInputBuffer.java:555)
> at com.sun.grizzly.http.ProcessorTask.parseRequest(ProcessorTask.java:881)
> at com.sun.grizzly.http.ProcessorTask.doProcess(ProcessorTask.java:692)
> at com.sun.grizzly.http.ProcessorTask.process(ProcessorTask.java:1019)
> at com.sun.grizzly.http.DefaultProtocolFilter.execute(DefaultProtocolFilter.java:225)
> at com.sun.grizzly.DefaultProtocolChain.executeProtocolFilter(DefaultProtocolChain.java:137)
> at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:104)
> at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:90)
> at com.sun.grizzly.http.HttpProtocolChain.execute(HttpProtocolChain.java:79)
> at com.sun.grizzly.ProtocolChainContextTask.doCall(ProtocolChainContextTask.java:54)
> at com.sun.grizzly.SelectionKeyContextTask.call(SelectionKeyContextTask.java:59)
> at com.sun.grizzly.ContextTask.run(ContextTask.java:71)
> at com.sun.grizzly.util.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:532)
> at com.sun.grizzly.util.AbstractThreadPool$Worker.run(AbstractThreadPool.java:513)
> at java.lang.Thread.run(Thread.java:662)


Re: Query term completion via the suggester

2012-07-28 Thread Ahmet Arslan
> 

this should be : 


Geocoding with Solr

2012-07-28 Thread Spadez
Hi!

I am using Solr as my main search system for my site. Currently, I am using
Google to turn a place name (such as a postcode or city) into a long / lat
co-ordinate. Then I supply this long / lat to Solr so it can perform a
spatial search.

I am really new to this, but I don't like my reliance on Google for this. Is
it possible to import a database into Solr that links both cities and
postcodes to co-ordinates, and use Solr for this as well?





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Geocoding-with-Solr-tp3997913.html
Sent from the Solr - User mailing list archive at Nabble.com.


Query term completion via the suggester

2012-07-28 Thread Michael Belenki
Hi,

I am trying to configure the suggester for Solr 3.6 as described at
http://wiki.apache.org/solr/Suggester, but the configuration does not work.
I cannot figure out what I am doing wrong...

After starting the Solr server I get the exception
"org.apache.solr.common.SolrException: no field name specified in
query and no default specified via 'df' param". If I try to do a query to
get a query suggestion with
http://localhost:8983/solr/suggest?q=comp&df=autocomplete, Solr only
returns documents but no suggestions for query completion.


In schema.xml the field is defined as follows: "". The text spell type is:


  





  



The request handler is defined as follows:



true
a_suggest
true
5
true



suggest



The corresponding suggest component:

   

a_suggest
org.apache.solr.spelling.suggest.Suggester
org.apache.solr.spelling.suggest.fst.FSTLookup
autocomplete
true
100

   

best regards,

Michael
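
For comparison, the FSTLookup configuration on the wiki page cited above
looks roughly like the sketch below. This is only a sketch, not the poster's
actual files (those were stripped in archiving); it reuses the names visible
above (a_suggest, the autocomplete field, count 5), and mapping the stray
"100" to weightBuckets is a guess:

<searchComponent class="solr.SpellCheckComponent" name="suggest">
  <lst name="spellchecker">
    <str name="name">a_suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.FSTLookup</str>
    <str name="field">autocomplete</str>
    <str name="buildOnCommit">true</str>
    <!-- FSTLookup option; the "100" in the post above may have been this -->
    <int name="weightBuckets">100</int>
  </lst>
</searchComponent>

<requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">a_suggest</str>
    <str name="spellcheck.onlyMorePopular">true</str>
    <str name="spellcheck.count">5</str>
    <str name="spellcheck.collate">true</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>

With the handler registered at /suggest, the request becomes
http://localhost:8983/solr/suggest?q=comp.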


Re: Bulk Indexing

2012-07-28 Thread Sohail Aboobaker
We have autocommit on. We basically send records in a loop: after
validating each record, we send it to the search service, and keep doing
that in a loop. Mikhail / Lan, are you suggesting that instead of sending
them one at a time, we should collect them in an array and do a commit at
the end? Is this better than doing it in a loop with autocommit?

Also, where can I find some reference on master/slave configuration?

Thanks.


Request header is too large.

2012-07-28 Thread Xue-Feng Yang
Hi all,

When running DIH to index data from a database, I run into the following error.

Does anyone know what the problem is?

Thanks,

Xufeng

///

SEVERE: GRIZZLY0040: Request header is too large.
java.nio.BufferOverflowException
    at com.sun.grizzly.tcp.http11.InternalInputBuffer.fill(InternalInputBuffer.java:765)
    at com.sun.grizzly.tcp.http11.InternalInputBuffer.parseHeader(InternalInputBuffer.java:669)
    at com.sun.grizzly.tcp.http11.InternalInputBuffer.parseHeaders(InternalInputBuffer.java:555)
    at com.sun.grizzly.http.ProcessorTask.parseRequest(ProcessorTask.java:881)
    at com.sun.grizzly.http.ProcessorTask.doProcess(ProcessorTask.java:692)
    at com.sun.grizzly.http.ProcessorTask.process(ProcessorTask.java:1019)
    at com.sun.grizzly.http.DefaultProtocolFilter.execute(DefaultProtocolFilter.java:225)
    at com.sun.grizzly.DefaultProtocolChain.executeProtocolFilter(DefaultProtocolChain.java:137)
    at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:104)
    at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:90)
    at com.sun.grizzly.http.HttpProtocolChain.execute(HttpProtocolChain.java:79)
    at com.sun.grizzly.ProtocolChainContextTask.doCall(ProtocolChainContextTask.java:54)
    at com.sun.grizzly.SelectionKeyContextTask.call(SelectionKeyContextTask.java:59)
    at com.sun.grizzly.ContextTask.run(ContextTask.java:71)
    at com.sun.grizzly.util.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:532)
    at com.sun.grizzly.util.AbstractThreadPool$Worker.run(AbstractThreadPool.java:513)
    at java.lang.Thread.run(Thread.java:662)

Re: Solr 4.0-ALPHA and ModifiableSolrParams

2012-07-28 Thread Federico Valeri
OK I definitely need a response parser.
Thank you!

2012/7/28 Erik Hatcher 

> And by parser, what is meant is a ResponseParser.  There is an example in
> one of the Solr 4 test cases that goes like this:
>
>   public void testGetRawFile() throws SolrServerException, IOException {
> SolrServer server = getSolrServer();
> //assertQ(req("qt", "/admin/file")); TODO file bug that SolrJettyTestBase extends SolrTestCaseJ4
> QueryRequest request = new QueryRequest(params("file","schema.xml"));
> request.setPath("/admin/file");
> final AtomicBoolean readFile = new AtomicBoolean();
> request.setResponseParser(new ResponseParser() {
>   @Override
>   public String getWriterType() {
> return "mock";//unfortunately this gets put onto params wt=mock
> but it apparently has no effect
>   }
>
>   @Override
>   public NamedList<Object> processResponse(InputStream body, String encoding) {
> try {
>   if (body.read() >= 0)
> readFile.set(true);
> } catch (IOException e) {
>   throw new RuntimeException(e);
> }
> return null;
>   }
>
>   @Override
>   public NamedList<Object> processResponse(Reader reader) {
> throw new UnsupportedOperationException("TODO unimplemented");//TODO
>   }
> });
>
> server.request( request );//runs request
> //request.process(server); but we don't have a NamedList response
> assertTrue(readFile.get());
>   }
>
> So... you can read the JSON, but you'll need to do something like the
> above.
>
> Erik
>
>
> On Jul 28, 2012, at 08:11 , in.abdul wrote:
>
> > SolrJ supports only the XML writer and the binary writer. It is not
> > possible to get the response in JSON. If your requirement is to get the
> > response in JSON, then you have to write a parser.
> > Syed Abdul kather
> > sent from Samsung S3
> > On Jul 28, 2012 1:29 AM, "Federico Valeri [via Lucene]" <
> > ml-node+s472066n3997784...@n3.nabble.com> wrote:
> >
> >> Hi, I'm trying to get a JSON response with this Java code:
> >>
> >> SolrServer solr = new HttpSolrServer("http://localhost:8080/solr");
> >> ModifiableSolrParams params = new ModifiableSolrParams();
> >> params.set("qt", "/select");
> >> params.set("q", "contenuto:(" + query + ")");
> >> params.set("hl", "true");
> >> params.set("hl.fl", "id,contenuto");
> >> params.set("wt", "json");
> >> QueryResponse response = solr.query(params);
> >> log.debug(response.toString());
> >>
> >> but from the log I see "&wt=javabin" in the query
> >> and "..docs=[SolrDocument{id=452011.." in the response ..
> >>
> >> instead I would expect "&wt=json" and "..docs=[{id=452011.."
> >>
> >> What am I missing?
> >>
> >>
> >> --
> >> If you reply to this email, your message will be added to the discussion
> >> below:
> >>
> >>
> http://lucene.472066.n3.nabble.com/Solr-4-0-ALPHA-and-ModifiableSolrParams-tp3997784.html
> >
> >
> >
> >
> > -
> > THANKS AND REGARDS,
> > SYED ABDUL KATHER
> > --
> > View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-4-0-ALPHA-and-ModifiableSolrParams-tp3997784p3997858.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
>
>


Re: Autocomplete terms from the middle of name/description of a Doc

2012-07-28 Thread Rajani Maski
Hi,

   One approach can be to use facet.prefix results for prefix-based
suggestions, and for suggesting names from the middle of a doc you can
index that name field with a whitespace tokenizer and an edge n-gram filter,
then search on that field with the prefix keyword and fl=title only, and
concatenate both: the facet.prefix results and the doc fields obtained from
that search.

Ex: the user searched for "lcd"
The query should be: q=name_edgramed:lcd&facet.prefix=lcd&fl=name_edgramed

You will get matching documents that contain this keyword, and also faceted
results with this prefix.
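
Spelled out as a full request (the field names are this message's
placeholders; facet.field=name and the core URL are added assumptions):

http://localhost:8983/solr/select?q=name_edgramed:lcd&facet=true&facet.field=name&facet.prefix=lcd&facet.mincount=1&fl=name_edgramed&wt=json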

--Rajani







On Thu, Jul 26, 2012 at 12:21 AM, Chantal Ackermann <
c.ackerm...@it-agenten.com> wrote:

>
> > Suppose I have a product with a title='kMix Espresso maker'. If I
> > tokenize this and put the result in product_tokens I should get
> > '[kMix][Espresso][maker]'.
> >
> > If now I try to search with facet.field='product_tokens' and
> > facet.prefix='espresso' I should get only 'espresso' while I want 'kMix
> > Espresso maker'.
>
> Yes, you are probably right. I did use this approach at some point. Your
> remark has made me check my code again.
> I was using n_gram in the end.
>
> (facet.prefix on tokenized fields might work in certain circumstances
> where you can get the actual value from the string field (or its facet) in
> parallel.)
>
> This is the jquery autocomplete plugin instantiation:
>
> $(function() {
> $("#qterm").autocomplete({
> minLength: 1,
> source: function(request,response) {
> jQuery.ajax({
> url: "/solr/select",
> dataType: "json",
> data: {
> q : "title_ngrams:\"" +
> request.term + "\"",
> rows: 0,
> facet: "true",
> "facet.field": "title",
> "facet.mincount": 1,
> "facet.sort": "index",
> "facet.limit": 10,
> "fq": "end_date:[NOW TO *]"
> wt: "json"
> },
> success: function( data ) {
> /*var result = jQuery.map(
> data.facet_counts.facet_fields.title, function( item, index ) {
> if (index%2)
> return null;
> else return {
> //label:
> item,
> value: item
> }
> });*/
> var result = [];
> var facets = data.facet_counts.facet_fields.title;
> var j = 0;
> // facet_fields arrays alternate term, count - keep the terms
> for (i = 0; i < facets.length; i = i + 2) {
> result[j] = facets[i];
> j = j + 1;
> }
> response(result);
> }
> });
> }
> });
>
> And here is the fieldtype "ngram" for "title_ngram". "title" is a string type
> field.
>
> <fieldType name="ngram" class="solr.TextField">
> <analyzer type="index">
> <tokenizer class="solr.KeywordTokenizerFactory" />
> <filter min="1" max="500" />
> <filter class="solr.ISOLatin1AccentFilterFactory" />
> <filter class="solr.WordDelimiterFilterFactory" splitOnCaseChange="1"
> splitOnNumerics="1" stemEnglishPossessive="1" generateWordParts="1"
> generateNumberParts="1" catenateAll="1" preserveOriginal="1" />
> <filter class="solr.LowerCaseFilterFactory" />
> <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15"
> side="front"/>
> <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
> </analyzer>
> <analyzer type="query">
> <tokenizer class="solr.KeywordTokenizerFactory" />
> <filter class="solr.ISOLatin1AccentFilterFactory" />
> <filter class="solr.Wo

Re: Solr 4.0-ALPHA and ModifiableSolrParams

2012-07-28 Thread Erik Hatcher
And by parser, what is meant is a ResponseParser.  There is an example in one 
of the Solr 4 test cases that goes like this:

  public void testGetRawFile() throws SolrServerException, IOException {
SolrServer server = getSolrServer();
    //assertQ(req("qt", "/admin/file")); TODO file bug that SolrJettyTestBase extends SolrTestCaseJ4
QueryRequest request = new QueryRequest(params("file","schema.xml"));
request.setPath("/admin/file");
final AtomicBoolean readFile = new AtomicBoolean();
request.setResponseParser(new ResponseParser() {
  @Override
  public String getWriterType() {
return "mock";//unfortunately this gets put onto params wt=mock but it 
apparently has no effect
  }

  @Override
  public NamedList<Object> processResponse(InputStream body, String encoding) {
try {
  if (body.read() >= 0)
readFile.set(true);
} catch (IOException e) {
  throw new RuntimeException(e);
}
return null;
  }

  @Override
  public NamedList<Object> processResponse(Reader reader) {
throw new UnsupportedOperationException("TODO unimplemented");//TODO
  }
});

server.request( request );//runs request
//request.process(server); but we don't have a NamedList response
assertTrue(readFile.get());
  }

So... you can read the JSON, but you'll need to do something like the above.

Erik
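
As a concrete variant of the above — a minimal sketch, not from the test
suite, assuming all you want back is the raw JSON string (the "json" key
used to carry it is an arbitrary choice):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;

import org.apache.solr.client.solrj.ResponseParser;
import org.apache.solr.client.solrj.request.QueryRequest;
import org.apache.solr.common.params.ModifiableSolrParams;
import org.apache.solr.common.util.NamedList;

// ...
ModifiableSolrParams params = new ModifiableSolrParams();
params.set("q", "*:*"); // your query here
QueryRequest request = new QueryRequest(params);
request.setResponseParser(new ResponseParser() {
  @Override
  public String getWriterType() {
    return "json"; // makes SolrJ send wt=json
  }

  @Override
  public NamedList<Object> processResponse(InputStream body, String encoding) {
    try {
      // read the body verbatim instead of parsing it
      BufferedReader in = new BufferedReader(new InputStreamReader(body, encoding));
      StringBuilder json = new StringBuilder();
      for (String line; (line = in.readLine()) != null; ) {
        json.append(line).append('\n');
      }
      NamedList<Object> result = new NamedList<Object>();
      result.add("json", json.toString());
      return result;
    } catch (IOException e) {
      throw new RuntimeException(e);
    }
  }

  @Override
  public NamedList<Object> processResponse(Reader reader) {
    throw new UnsupportedOperationException();
  }
});

NamedList<Object> rsp = server.request(request); // server is your SolrServer
String json = (String) rsp.get("json");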


On Jul 28, 2012, at 08:11 , in.abdul wrote:

> SolrJ supports only the XML writer and the binary writer. It is not
> possible to get the response in JSON. If your requirement is to get the
> response in JSON, then you have to write a parser.
> Syed Abdul kather
> sent from Samsung S3
> On Jul 28, 2012 1:29 AM, "Federico Valeri [via Lucene]" <
> ml-node+s472066n3997784...@n3.nabble.com> wrote:
> 
>> Hi, I'm trying to get a JSON response with this Java code:
>> 
>> SolrServer solr = new HttpSolrServer("http://localhost:8080/solr");
>> ModifiableSolrParams params = new ModifiableSolrParams();
>> params.set("qt", "/select");
>> params.set("q", "contenuto:(" + query + ")");
>> params.set("hl", "true");
>> params.set("hl.fl", "id,contenuto");
>> params.set("wt", "json");
>> QueryResponse response = solr.query(params);
>> log.debug(response.toString());
>> 
>> but from the log I see "&wt=javabin" in the query
>> and "..docs=[SolrDocument{id=452011.." in the response ..
>> 
>> instead I would expect "&wt=json" and "..docs=[{id=452011.."
>> 
>> What am I missing?
>> 
>> 
>> --
>> If you reply to this email, your message will be added to the discussion
>> below:
>> 
>> http://lucene.472066.n3.nabble.com/Solr-4-0-ALPHA-and-ModifiableSolrParams-tp3997784.html
> 
> 
> 
> 
> -
> THANKS AND REGARDS,
> SYED ABDUL KATHER
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Solr-4-0-ALPHA-and-ModifiableSolrParams-tp3997784p3997858.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Re: Solr 4.0-ALPHA and ModifiableSolrParams

2012-07-28 Thread in.abdul
SolrJ supports only the XML writer and the binary writer. It is not
possible to get the response in JSON. If your requirement is to get the
response in JSON, then you have to write a parser.
Syed Abdul kather
sent from Samsung S3
On Jul 28, 2012 1:29 AM, "Federico Valeri [via Lucene]" <
ml-node+s472066n3997784...@n3.nabble.com> wrote:

> Hi, I'm trying to get a JSON response with this Java code:
>
> SolrServer solr = new HttpSolrServer("http://localhost:8080/solr");
> ModifiableSolrParams params = new ModifiableSolrParams();
> params.set("qt", "/select");
> params.set("q", "contenuto:(" + query + ")");
> params.set("hl", "true");
> params.set("hl.fl", "id,contenuto");
> params.set("wt", "json");
> QueryResponse response = solr.query(params);
> log.debug(response.toString());
>
> but from the log I see "&wt=javabin" in the query
> and "..docs=[SolrDocument{id=452011.." in the response ..
>
> instead I would expect "&wt=json" and "..docs=[{id=452011.."
>
> What am I missing?
>
>
> --
>  If you reply to this email, your message will be added to the discussion
> below:
>
> http://lucene.472066.n3.nabble.com/Solr-4-0-ALPHA-and-ModifiableSolrParams-tp3997784.html




-
THANKS AND REGARDS,
SYED ABDUL KATHER
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-4-0-ALPHA-and-ModifiableSolrParams-tp3997784p3997858.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: querying using filter query and lots of possible values

2012-07-28 Thread Daniel Brügge
Hi,

thanks for this hint. Will check this out. Sounds promising.

Daniel

On Sat, Jul 28, 2012 at 3:18 AM, Chris Hostetter
wrote:

>
> : the list of IDs is constant for a longer time. I will take a look at
> : this join topic.
> : Maybe another solution would be to really create a whole new
> : collection or set of documents containing the aggregated documents
> : (from the ids) from scratch and to execute queries on this collection.
> : Then this would take some time, but maybe it's worth it because the
> : querying will thank you.
>
> Another avenue to consider...
>
>
> http://lucene.apache.org/solr/api-4_0_0-ALPHA/org/apache/solr/schema/ExternalFileField.html
>
> ...would allow you to map values in your "source_id" to some numeric
> values (many to many) and these numeric values would then be accessible in
> functions -- so you could use something like fq={!frange ...} to select
> all docs with value 67 where your external file field says that value 67
> is mapped to the following thousand source_id values.
>
> The external file fields can then be modified at any time just by doing a
> commit on your index.
>
>
>
> -Hoss
>
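
To make that concrete, a sketch with assumed names (none of them come from
this thread; the file location and pfloat valType follow the
ExternalFileField javadoc of that era, so verify against your version).
In schema.xml:

<fieldType name="extfile" class="solr.ExternalFileField" keyField="id"
           defVal="0" stored="false" indexed="false" valType="pfloat"/>
<field name="source_group" type="extfile"/>

In a file named external_source_group under Solr's data directory, one
key=value pair per line, ideally sorted by key:

doc1=67
doc2=67
doc3=12

Then a filter query selecting every document whose mapped value is 67:

fq={!frange l=67 u=67}source_group

Rewriting the file and issuing a commit is enough to remap documents
without reindexing.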


Re: Bulk Indexing

2012-07-28 Thread Mikhail Khludnev
Lan,

I assume that a particular server can freeze on such a bulk load, but the
overall message doesn't seem absolutely correct to me. Solr has a lot of
mechanisms to survive such cases.
Bulk indexing is absolutely right (if you submit a single request with a
long iterator of SolrInputDocuments). This indexing thread can occupy a
single CPU core, keeping the others ready for searches. Such indexing
occupies ramBufferSizeMB of heap. After the limit is exceeded, a new
segment is flushed to disk, which requires some IO and can impact searchers.
(A misconfigured merge can ruin everything, of course.)
Commits should be executed for business reasons, not performance ones. A
commit leads to creating a new searcher and warming it; these actions can
be memory- and CPU-expensive (almost single-threaded activity).
I did some experiments on a 40M-document index on a desktop box. Constantly
adding 1K docs/sec with autocommit more than once per minute doesn't have a
significant impact on search latency.
Generally, yes, a master/slave scheme has more performance, for sure.
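
As a sketch of the batched pattern (SolrJ 3.6+; the URL, field names, and
batch size are illustrative assumptions, not from this thread):

import java.util.ArrayList;
import java.util.List;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class BulkIndexer {
  public static void main(String[] args) throws Exception {
    SolrServer server = new HttpSolrServer("http://localhost:8983/solr");
    List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
    for (int i = 0; i < 20000; i++) {
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", Integer.toString(i));
      doc.addField("name", "document " + i);
      batch.add(doc);
      if (batch.size() == 1000) {
        server.add(batch); // one request per batch, not per document
        batch.clear();
      }
    }
    if (!batch.isEmpty()) {
      server.add(batch);
    }
    server.commit(); // single explicit commit at the end of the load
  }
}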

On Sat, Jul 28, 2012 at 4:01 AM, Lan  wrote:

> I assume you're indexing on the same server that is used to execute search
> queries. Adding 20K documents in bulk could cause the Solr server to 'stop
> the world', where the server would stop responding to queries.
>
> My suggestion is
> - Setup master/slave to insulate your clients from 'stop the world' events
> during indexing.
> - Update in batches with a commit at the end of the batch.
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Bulk-Indexing-tp3997745p3997815.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
Sincerely yours
Mikhail Khludnev
Tech Lead
Grid Dynamics